Linear Regression in Various Products

2

I ran a simple regression on a dataset containing a single product (columns: Product, Volume, Price), and it ran perfectly. Now I would like to run the same regression on a dataset with several products, while being able to choose which product the regression runs on. For example:

Produto | Volume | Preço
A       |        |
A       |        |
B       |        |
B       |        |

I want to run the regression only on product B.

  • How can I do this?

  • How can I run the regression on all products but get the results back separately, so that I can analyze them side by side?

Code.

import pandas as pd
from scipy.stats import linregress

# Load the 'Tela' sheet from the workbook
Pasta1 = pd.ExcelFile('Pasta2.xlsx')
Daniel = pd.read_excel(Pasta1, 'Tela')

# Regress Volume (y) on Preço (x)
x = Daniel['Preço']
y = Daniel['Volume']
m, b, R, p, SEm = linregress(x, y)

pd.DataFrame([m, b, R, p, SEm], columns=['Valores'],
             index=['declive', 'ordenada_na_origem',
                    'coeficiente_de_correlação_(de_Pearson)', 'p-value',
                    'erro_padrão'])

Output:

                                            Valores
declive                                  421.398071
ordenada_na_origem                      1432.443189
coeficiente_de_correlação_(de_Pearson)     0.331966
p-value                                    0.000003
erro_padrão                               86.869651
  • Okay Guto... I’ll try.

2 answers

1


Given what seems to be your data, I was able to solve this using the .loc indexer of the pandas DataFrame.

An example of how I did it:

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(6,4),index=list('abadaf'),columns=list('ABCD'))
>>df1
          A         B         C         D
a -0.973031  0.305699  1.330237 -0.799858
b -0.879060  0.238690 -2.729635 -0.457865
a -2.001388  1.058163 -0.328737  0.134416
d  0.994644 -2.305340 -0.714434  0.298462
a -2.242108 -0.331434  0.969981  0.973202
f -0.483833  0.783812  0.925608  0.590251

>>df1.loc['a']
          A         B         C         D
a -0.973031  0.305699  1.330237 -0.799858
a -2.001388  1.058163 -0.328737  0.134416
a -2.242108 -0.331434  0.969981  0.973202

>> df1.loc['a','A']
a   -0.973031
a   -2.001388
a   -2.242108

Here the "product name" is used as the index. If you want to select rows based on the values in a column (strings or numbers), you can use .loc together with boolean expressions:

>> df1 = pd.DataFrame([['a',1,2,3],['b',2,3,4],['a',3,4,5],['c',4,5,6]],index=list('defg'),columns=list('higj'))
>> df1
   h  i  g  j
d  a  1  2  3
e  b  2  3  4
f  a  3  4  5
g  c  4  5  6

>> df1.h=='a'
d     True
e    False
f     True
g    False
Name: h, dtype: bool
>> df1.loc[ df1.h=='a',:]
   h  i  g  j
d  a  1  2  3
f  a  3  4  5
>> df1.loc[ df1.h=='a','i']
d    1
f    3
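
Applied to the data in the question, the same boolean selection could feed the regression directly. The following is only a sketch: the small DataFrame is made-up data standing in for the Excel sheet (only the column names Produto, Preço and Volume come from the question), and regress_product is a hypothetical helper name.

```python
import pandas as pd
from scipy.stats import linregress

# Made-up data standing in for the 'Tela' sheet; only the column names
# come from the question.
dados = pd.DataFrame({
    'Produto': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Preço':   [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    'Volume':  [10.0, 21.0, 29.0, 40.0, 9.0, 7.0, 6.0, 4.0],
})


def regress_product(df, produto):
    """Hypothetical helper: regress Volume on Preço for a single product."""
    # The boolean mask keeps only the rows of the chosen product
    subset = df.loc[df['Produto'] == produto]
    return linregress(subset['Preço'], subset['Volume'])


resultado_B = regress_product(dados, 'B')
print(resultado_B.slope, resultado_B.intercept)
```

Because x and y are taken from the same filtered subset, the two series always have matching rows.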
  • Guto, when using .loc, does the search only happen on the index? If (a, b, c, d, e, f) were in a column (as if they were products), could I, for example, plot a graph with product A only? Or, as in my example, could I run a regression using the volume and price of product A?

  • Guto, thank you so much for your help. With your tip I got it working.

  • Right, so I just click the flag?

  • By the way, Guto, I need a way to avoid repeating a block for each product. Is this possible?

  • @Danielmelo, I’ll take a look when I can. I think I can do it in a simple way, but I have to test it. BTW, note that I have removed comments that are no longer relevant, so the question is cleaner for future reference.

  • Okay... I’m new around here, so I still can’t find my way around very well... thanks for the help.

0

With Guto's help, I solved it as follows:

import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress

Pasta1 = pd.ExcelFile('Pasta2.xlsx')
Daniel = pd.read_excel(Pasta1, 'Tela')

# Filter each product once, so that Preço and Volume come from the same rows
# (filtering x and y separately could leave them with different lengths)
dados_A = Daniel.loc[(Daniel['Preço'] > 0) & (Daniel['Volume'] > 0) & (Daniel['Produto'] == 'A')]
Produto_A = linregress(dados_A['Preço'], dados_A['Volume'])

dados_B = Daniel.loc[(Daniel['Preço'] > 0) & (Daniel['Volume'] > 0) & (Daniel['Produto'] == 'B')]
Produto_B = linregress(dados_B['Preço'], dados_B['Volume'])

# One row of regression results per product
pd.DataFrame([Produto_A, Produto_B], index=['Valores', 'Valores2'])

Now I just need to find a way to run this for more products without having to create a block for each one.
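
One possible way to avoid a block per product is to let pandas split the data with groupby and run linregress once per group. This is a sketch under assumptions, not a tested solution for the real spreadsheet: the DataFrame below is made-up data with the same column names as the question, and regress_all is a hypothetical helper name.

```python
import pandas as pd
from scipy.stats import linregress

# Made-up data standing in for the 'Tela' sheet; only the column names
# come from the question.
dados = pd.DataFrame({
    'Produto': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Preço':   [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    'Volume':  [10.0, 21.0, 29.0, 40.0, 9.0, 7.0, 6.0, 4.0],
})


def regress_all(df):
    """Hypothetical helper: one regression per product, collected in one table."""
    linhas = {}
    # groupby yields (product name, rows of that product) pairs
    for produto, grupo in df.groupby('Produto'):
        r = linregress(grupo['Preço'], grupo['Volume'])
        linhas[produto] = {
            'declive': r.slope,
            'ordenada_na_origem': r.intercept,
            'coeficiente_de_correlação': r.rvalue,
            'p-value': r.pvalue,
            'erro_padrão': r.stderr,
        }
    # Products become columns and statistics become rows,
    # so the products can be compared side by side
    return pd.DataFrame(linhas)


tabela = regress_all(dados)
print(tabela)
```

With the real data, the filters on Preço > 0 and Volume > 0 could be applied to the whole DataFrame once, before calling the helper.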
