Select python - Pandas

Question

Asked 3 years, 7 months ago

I am developing a Python program that imports two xlsx files with the pandas library. The import itself works fine, but I need to create a report that combines the two files.

How can I do this?

CSS, good morning! Can you make the xlsx available? Hugs!

– lmonferrari

2020/11/17 at 10:46

@lmonferrari Here are the example files: https://gofile.io/d/5TUFTS

– CSS

2020/11/17 at 13:13

1 answer

by lmonferrari • **3,550** points · Answer 1 · 2020-11-17T13:31:39+00:00

Importing the package

import pandas as pd

Loading the files:

lojas = pd.read_excel('./lojas.xlsx')
produtos = pd.read_excel('./produtos.xlsx')

Using the pandas join:

novo_df = produtos.set_index('Código da loja').join(lojas.set_index('Código da loja')).reset_index()

Reordering the columns:

novo_df = novo_df[['Código do produto','Nome do produto','preço','Código da loja','Nome da loja']]

Output:

novo_df

Código do produto   Nome do produto                                  preço      Código da loja  Nome da loja
          1A        Corretivo líquido 18ml água 930761 Bic            5.70               100        Penha
          2A        Caneta esferográfica 1.0mm cristal                1.40               100        Penha
          3A        Lápis plástico preto evolution Pijama 1106666 ... 4.00               100        Penha
          2A        Caneta esferográfica 1.0mm cristal                1.38               101        Lins de Vasconcelos
          1A        Corretivo líquido 18ml água 930761 Bic            5.75               101        Lins de Vasconcelos
...
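The join and reorder steps above can be tried without the spreadsheets. This is a minimal sketch in which a few made-up rows stand in for lojas.xlsx and produtos.xlsx; the shortened product names and the row selection are assumptions for illustration, not the contents of the real files.

```python
import pandas as pd

# Made-up rows standing in for the two spreadsheets (assumption).
lojas = pd.DataFrame({
    'Código da loja': [100, 101],
    'Nome da loja': ['Penha', 'Lins de Vasconcelos'],
})
produtos = pd.DataFrame({
    'Código do produto': ['1A', '2A', '2A'],
    'Nome do produto': ['Corretivo', 'Caneta', 'Caneta'],
    'preço': [5.70, 1.40, 1.38],
    'Código da loja': [100, 100, 101],
})

# Index both frames by store code, join them (join is a left join by
# default), then turn the store code back into a regular column.
novo_df = (
    produtos.set_index('Código da loja')
    .join(lojas.set_index('Código da loja'))
    .reset_index()
)

# Put the columns in the order used for the report.
novo_df = novo_df[['Código do produto', 'Nome do produto', 'preço',
                   'Código da loja', 'Nome da loja']]
print(novo_df)
```

Every product row picks up the name of the store whose code it carries.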

About the Join:

Join columns with another DataFrame either on the index or on a key column. Efficiently join multiple DataFrame objects by index at once by passing a list.
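The "key column" case can be sketched as follows. Passing `on` to `join` matches a column of the calling frame against the index of the other frame, so only the store frame needs `set_index`. The rows are made up for illustration, and the `merge` call is shown only as an equivalent spelling, not as part of the original answer.

```python
import pandas as pd

# Made-up rows standing in for the two spreadsheets (assumption).
lojas = pd.DataFrame({
    'Código da loja': [100, 101],
    'Nome da loja': ['Penha', 'Lins de Vasconcelos'],
})
produtos = pd.DataFrame({
    'Código do produto': ['1A', '2A', '2A'],
    'Nome do produto': ['Corretivo', 'Caneta', 'Caneta'],
    'preço': [5.70, 1.40, 1.38],
    'Código da loja': [100, 100, 101],
})

# `on` names the key column in produtos; lojas is matched on its index.
via_on = produtos.join(lojas.set_index('Código da loja'), on='Código da loja')

# The same left join written with merge on the shared column.
via_merge = produtos.merge(lojas, on='Código da loja', how='left')

print(via_on)
```

Both variants keep the product rows in their original order and leave 'Código da loja' as a regular column, so no `reset_index` is needed afterwards.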

Documentation

Update

Since the requirements of the question changed, I am adding this update here.

Importing the package

import pandas as pd

Loading the files

lojas = pd.read_excel('./lojas.xlsx')
produtos = pd.read_excel('./produtos.xlsx')

Creating the new data frame and reordering the columns

novo_df = produtos.set_index('Código da loja').join(lojas.set_index('Código da loja')).reset_index()
novo_df = novo_df[['Código do produto','Nome do produto','preço','Código da loja','Nome da loja']]

Creating a "filter" with the lowest price of each product, grouped by product code and product name, then building a new data frame with the rows whose price matches

filtro = novo_df.groupby(['Código do produto','Nome do produto'])['preço'].min()
menor_preco_df = novo_df[novo_df['preço'].isin(filtro)].sort_values(by = ['Código da loja']).reset_index(drop = True)
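One thing to watch with `isin`: it checks each price against the minimum prices of all products, not only its own product. If one product's non-minimum price happens to equal another product's minimum, that row is kept as well. A possible alternative, sketched here with made-up rows (assumption), compares each row with the minimum of its own group using `transform('min')`.

```python
import pandas as pd

# Made-up rows (assumption): 4.00 is the lowest price of product 1A,
# but it is NOT the lowest price of product 2A.
novo_df = pd.DataFrame({
    'Código do produto': ['1A', '1A', '2A', '2A'],
    'Nome do produto': ['Corretivo', 'Corretivo', 'Caneta', 'Caneta'],
    'preço': [4.00, 5.50, 4.00, 1.20],
    'Código da loja': [100, 101, 100, 101],
})

grupos = novo_df.groupby(['Código do produto', 'Nome do produto'])['preço']

# isin compares against every product's minimum at once.
via_isin = novo_df[novo_df['preço'].isin(grupos.min())]

# transform('min') returns, for each row, the minimum of that row's own group.
via_transform = novo_df[novo_df['preço'] == grupos.transform('min')]

print(len(via_isin), len(via_transform))
```

In this sketch the `isin` version also keeps the 4.00 row of product 2A, while the `transform` version keeps only one row per product.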

Here we create two data frames, each holding a different piece of information, and combine them into a dictionary. The goal is to get columns whose values are lists

df1 = menor_preco_df.groupby(['Código da loja','Nome da loja'])['Nome do produto'].apply(list).reset_index()
df2 = menor_preco_df.groupby(['Código da loja','Nome da loja'])['preço'].apply(list).reset_index()
df2.drop(columns = ['Código da loja','Nome da loja'], inplace = True)
dicionario1 = pd.concat([df1,df2], axis = 1).to_dict('index')
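The same per-store dictionary can probably be built in a single groupby by aggregating both columns into lists at once, which avoids the second data frame and the `concat`. This is a sketch with made-up rows standing in for `menor_preco_df` (assumption), not a replacement the original answer prescribes.

```python
import pandas as pd

# Made-up rows standing in for menor_preco_df (assumption).
menor_preco_df = pd.DataFrame({
    'Código da loja': [102, 102, 103],
    'Nome da loja': ['Curuça', 'Curuça', 'Faria Lima'],
    'Nome do produto': ['Corretivo', 'Lápis', 'Corretivo'],
    'preço': [5.5, 4.0, 5.5],
})

# Collect product names and prices into one list each per store.
agrupado = (
    menor_preco_df
    .groupby(['Código da loja', 'Nome da loja'])
    .agg({'Nome do produto': list, 'preço': list})
    .reset_index()
)

# One entry per store, keyed by row position.
dicionario1 = agrupado.to_dict(orient='index')

for valor in dicionario1.values():
    print(f"{valor['Código da loja']} - {valor['Nome da loja']}")
    for v, p in zip(valor['Nome do produto'], valor['preço']):
        print(f"   Produto: {v} - R${p}")
```

Because both lists come from the same groupby, the product names and prices stay aligned position by position.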

Here we print the first part of the problem

for chave, valor in dicionario1.items():
    print(f"{valor['Código da loja']} - {valor['Nome da loja']}")
    for v,p in zip(valor['Nome do produto'],valor['preço']):
        print(f"   Produto: {v} - R${p}")
    print('')

Output:

100 - Penha
   Produto: Lápis plástico preto evolution Pijama 1106666 Bic BT 3 - R$4.0

101 - Lins de Vasconcelos
   Produto: Lápis plástico preto evolution Pijama 1106666 Bic BT 3 - R$4.0

102 - Curuça
   Produto: Corretivo líquido 18ml água 930761 Bic - R$5.5
   Produto: Lápis plástico preto evolution Pijama 1106666 Bic BT 3 - R$4.0

103 - Faria Lima
   Produto: Corretivo líquido 18ml água 930761 Bic - R$5.5

104 - Jardim Brasil
   Produto: Corretivo líquido 18ml água 930761 Bic - R$5.5
   Produto: Caneta esferográfica 1.0mm cristal - R$1.2

Here we create the second dictionary grouped by product name and price

dicionario2 = menor_preco_df.groupby(['Nome do produto','preço'])['Nome da loja'].apply(list).reset_index().to_dict('index')

And here we print out the second part of the problem

for chave, valor in dicionario2.items():
    print(f"{valor['Nome do produto']}")
    print(f"  - Produto encontrado por R${valor['preço']} nas lojas",', '.join(valor['Nome da loja']),'\n')

Output:

Caneta esferográfica 1.0mm cristal
  - Produto encontrado por R$1.2 nas lojas Jardim Brasil 

Corretivo líquido 18ml água 930761 Bic
  - Produto encontrado por R$5.5 nas lojas Curuça, Faria Lima, Jardim Brasil 

Lápis plástico preto evolution Pijama 1106666 Bic BT 3
  - Produto encontrado por R$4.0 nas lojas Penha, Lins de Vasconcelos, Curuça

Complete code

import pandas as pd

lojas = pd.read_excel('./lojas.xlsx')
produtos = pd.read_excel('./produtos.xlsx')

novo_df = produtos.set_index('Código da loja').join(lojas.set_index('Código da loja')).reset_index()
novo_df = novo_df[['Código do produto','Nome do produto','preço','Código da loja','Nome da loja']]

filtro = novo_df.groupby(['Código do produto','Nome do produto'])['preço'].min()
menor_preco_df = novo_df[novo_df['preço'].isin(filtro)].sort_values(by = ['Código da loja']).reset_index(drop = True)

df1 = menor_preco_df.groupby(['Código da loja','Nome da loja'])['Nome do produto'].apply(list).reset_index()
df2 = menor_preco_df.groupby(['Código da loja','Nome da loja'])['preço'].apply(list).reset_index()
df2.drop(columns = ['Código da loja','Nome da loja'], inplace = True)

dicionario1 = pd.concat([df1,df2], axis = 1).to_dict('index')
for chave, valor in dicionario1.items():
    print(f"{valor['Código da loja']} - {valor['Nome da loja']}")
    for v,p in zip(valor['Nome do produto'],valor['preço']):
        print(f"   Produto: {v} - R${p}")
    print('')

dicionario2 = menor_preco_df.groupby(['Nome do produto','preço'])['Nome da loja'].apply(list).reset_index().to_dict('index')
for chave, valor in dicionario2.items():
    print(f"{valor['Nome do produto']}")
    print(f"  - Produto encontrado por R${valor['preço']} nas lojas",', '.join(valor['Nome da loja']),'\n')