Comparing two Python data frames

Good afternoon gentlemen.

I am new to Python and to programming. I am trying to compare two data frames, each holding car data for a specific year. So far I’ve done the following:

year_08 = np.repeat('2008', 1061) # Variable with one entry per row of the 2008 dataframe
year_18 = np.repeat('2018', 832) # Variable with one entry per row of the 2018 dataframe

df_08['year'] = year_08 # Creating a new column with the year in the 2008 dataframe
df_18['year'] = year_18 # Creating a new column with the year in the 2018 dataframe

df = df_08.append(df_18, ignore_index=True) # Joining the two dataframes
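
One way to avoid hard-coding the row counts (1061 and 832) is to let pandas broadcast a single value into the new column. The sketch below uses small stand-in frames with made-up rows, since the real df_08 and df_18 are not shown; the `model` column and the non-gasoline fuel names are assumptions for illustration only. It also uses pd.concat, because DataFrame.append was removed in pandas 2.0.

```python
import pandas as pd

# Stand-in data (hypothetical rows); the real frames come from the asker's files.
df_08 = pd.DataFrame({"model": ["A", "B", "C"],
                      "fuel": ["Gasoline", "Gasoline", "ethanol"]})
df_18 = pd.DataFrame({"model": ["D", "E"],
                      "fuel": ["Gasoline", "Electricity"]})

# assign() returns a copy with the scalar repeated on every row,
# so the code keeps working if the number of rows changes.
df = pd.concat(
    [df_08.assign(year="2008"), df_18.assign(year="2018")],
    ignore_index=True,  # renumber the rows 0..n-1 in the combined frame
)
print(df)
```

Because assign() works on a copy, the original df_08 and df_18 are left without a `year` column, which may or may not be what you want.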

I could also see the increase from one year to the next:

fuel01 = df
fuel01 = fuel01[fuel01.fuel != 'Gasoline'] # Drops the rows that have 'Gasoline' as the fuel
fuel01['year'].value_counts()
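
If the goal is a tidier result than the raw value_counts() output, one possible approach is to turn the counts into a small labelled table and add the year-over-year change. This is only a sketch on made-up rows; the column names `alt_fuel_models` and `change` are hypothetical, as are the fuel names other than 'Gasoline'.

```python
import pandas as pd

# Stand-in for the combined frame (hypothetical rows).
df = pd.DataFrame({
    "fuel": ["Gasoline", "ethanol", "Gasoline",
             "Gasoline", "Electricity", "ethanol"],
    "year": ["2008", "2008", "2008", "2018", "2018", "2018"],
})

# Keep only the rows whose fuel is not gasoline.
alt = df[df["fuel"] != "Gasoline"]

# Count rows per year, order by year, and give the columns readable names.
summary = (
    alt["year"]
    .value_counts()
    .sort_index()
    .rename_axis("year")
    .reset_index(name="alt_fuel_models")
)

# Difference between each year and the previous one (NaN for the first year).
summary["change"] = summary["alt_fuel_models"].diff()

print(summary.to_string(index=False))
```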

I just didn’t like the way it looked. I wanted something, let’s say, more professional. Any idea how to do it?

Thank you.

  • Could you explain your data better and how you want to compare them? About “I tried to use np.repeat but it didn’t work because I have values that are strings (...)”: if it is possible to drop the values that are strings, you could do something like: df = df[~df.column.map(lambda v: isinstance(v, str))]

  • Good morning. So I’m comparing two dataframes, one with car information for 2008 and one with car information for 2018. In this question I need to check whether the use of alternative fuel sources increased from one year to the next. I did the following: I created a repeated variable for each year, added a column named 'year' to each dataframe, and then merged the two dataframes together. Now I have a single dataframe with a column that shows which year each car model is from. I’ll edit my question so you can understand better.
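
Since the two dataframes have different numbers of rows (1061 for 2008 and 832 for 2018), raw counts alone can be misleading when judging whether alternative fuel use increased. A hedged sketch of comparing the share of non-gasoline models per year instead, again on made-up rows and with the hypothetical name `alt_fuel_share`:

```python
import pandas as pd

# Stand-in for the combined frame (hypothetical rows).
df = pd.DataFrame({
    "fuel": ["Gasoline", "Gasoline", "Gasoline", "ethanol",
             "Gasoline", "Electricity"],
    "year": ["2008", "2008", "2008", "2008", "2018", "2018"],
})

# True where the fuel is not gasoline; the mean of a boolean
# column within each year is the fraction of True values.
share = (
    df["fuel"].ne("Gasoline")
    .groupby(df["year"])
    .mean()
    .rename("alt_fuel_share")
)
print(share)
```

In this toy data both years have exactly one non-gasoline model, yet the share doubles because the 2018 frame is smaller.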

No answers
