pandas read_html merges the content of the row below into the row above in merged cells


I'm new to Python, so the answer may be obvious, but here goes:

I need to work with a file downloaded from a website. It has an .xls extension, but its content is actually HTML, so I have to use read_html to load it and print it on the screen. The problem is that the table contains merged cells, and in those cells pandas joins the information from the row below into the row above. Maybe pandas does this merge automatically by design, but I need those cells to come out empty first, so that I can then use ffill() to carry the value from the row above down to the row below.

Below is the table I currently get when I print the result of pandas.read_html, followed by the table I would like to have as the final result.

[image: table as currently returned by pandas.read_html]

And this is the final result I need so that I can continue processing the data:

[image: desired final table]
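To make the ffill() step concrete, here is a minimal sketch of what I am trying to reach. It does not call read_html at all: the DataFrame is built by hand as a stand-in for the parsed table, and the column names ("group", "item") and the values are made up for illustration, not taken from my real file. It only assumes that the cells covered by a merged cell arrive empty (None).

```python
import pandas as pd

# Hypothetical stand-in for the parsed table. None marks the cells that
# were covered by a merged cell in the original HTML (assumed, not real data).
df = pd.DataFrame(
    {
        "group": ["A", None, "B", None],
        "item": ["x", "y", "z", "w"],
    }
)

# Forward-fill copies the last non-empty value downwards, so each empty
# cell takes the value from the row above it.
df["group"] = df["group"].ffill()

print(df)
```

If the empty cells came through as empty strings rather than missing values, they would first have to be converted to missing values (for example with replace("", pd.NA)) before ffill() has any effect.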

I would really appreciate any help.

  • Please clarify your problem or provide additional details to highlight exactly what you need. As currently written, it's hard to tell exactly what you're asking.

No answers
