How to compare two Python data frames

Asked

Viewed 962 times

-1

Hello guys I am trying to automate a loading process with python bringing data from sql and sending to a mysql table and I am facing some problems. Problem: I need that when I have new data in my sql table my program check if this data is present in my mysql table, if it is present print on the screen that the data already exists and if there are no data to be loaded in my sql table. I thought as follows to play the data of the two tables in data frame and then to make decision structure in those two data frames, but I’m having difficulties. Follows the code.

Error-Exception has occurred: Keyerror 'idContract'

import pymysql.cursors
import pyodbc
import pandas as pd
from sqlalchemy import create_engine

# Função para trazer os dados do Sql e jogar em um dataframe
def get_vendas_sqlserver():
    #Conexão com SQL Server
    connection = pyodbc.connect("DSN=SQLServer")  #autocommit=True
    try:
        # O metodo do pandas retorna o resultado do select
        df = pd.read_sql_query("SELECT * FROM dw.dbo.vW_Vendateste123",connection,index_col=None,coerce_float=True, parse_dates= 'DataBaseContrato') 
        return df      
    finally:
        connection.close()

# jogando a função do sql dentro de uma variavel
sqlvendas = get_vendas_sqlserver()
print(sqlvendas)

# Função para trazer os dados do mysql e jogar em um dataframe
def get_vendas_mysql():
    #Conexão Mysql
    cnxmysql = pymysql.connect(host='reveka',
                                 user='carga',
                                 password='carga',
                                 db='dw')
    try:
        # O metodo do pandas retorna o resultado do select
        df = pd.read_sql_query("SELECT * FROM ft_venda_copy", cnxmysql, index_col=None, coerce_float=True,
                               parse_dates='DataBaseContrato')
        return df
    finally:
        cnxmysql.close()

# jogando a função do mysql dentro de uma variavel
"""mysqlvendas = get_vendas_mysql()
print(mysqlvendas)"""

# Comparando os dois dataframes

df1 = get_vendas_sqlserver()
df2 = get_vendas_mysql()
#criando um dataframe vazio para guardar o resultado
df_result = pd.DataFrame()

cnx = create_engine('mysql+pymysql://teste:teste@teste/dw')

# I tried using the iterrows method to return a tuple

for index, row in df1.iterrows():
    # verifica se este registro existe no DF do MySQL

Solution - Fixed: idContract to Idcontract

    if row["IdContrato"] in df2["IdContrato"]:
        print("Contrato {0} encontrada no Mysql".format(row["IdContrato"]))       
    else:
        print("Contrato {0} nao encontrada no Mysql".format(row["IdContrato"]))
        # adiciona o registro num novo DF que vai ser usado para gravar no banco adiante
        df_result.append(row)
        #carregar_resultados(df_result)
        df_result.to_sql(con=cnx, name='ft_venda_copy', if_exists='append', index=False, )
        print('dados Carregados')
  • 1

    You have a typo. Ora uses Idcontract, capitalized, and sometimes uses idContract. Correct that mistake and then come back here if you need anything else.

  • Opa Rafael thanks, I will correct and perform the test!

  • Thanks Rafael was this very problem

1 answer

1


You have a typo.

Ora usa IdContrato, with i capital, and now uses idContrato, with i minuscule.

for Rafael Barros

Browser other questions tagged

You are not signed in. Login or sign up in order to post.