Most voted "pandas" questions
Pandas is an open-source library that provides high-performance data structures and data analysis tools for the Python programming language.
646 questions
-
0 votes, 1 answer, 486 views
How to open a .sql file in pandas?
I want to build a dataframe from a database that I imported from pgAdmin 4 as a 'vialactea.sql' file, but when I try to run the pandas command in Jupyter I only get an error message. The database is saved…
-
0 votes, 0 answers, 130 views
float64 in pandas
I have a table 'f0219' with a column 'Valor'. Originally, the column has dtype object. I use the following command to convert it to float: f0219['Valor'] = pd.to_numeric(f0219['Valor'], errors='coerce')…
-
0 votes, 1 answer, 33 views
Remove text using Python in pandas
I am manipulating a dataframe where a field has contents such as 123.456[7]. I need to get the leading character sequence of this field without the substring [7], and then remove the . and…
-
0 votes, 2 answers, 415 views
Rename a graph label in Python Matplotlib
I have a filtered dataframe: F 3257703 M 2256044. With the code below I was able to display a pie chart: porcentagemSexo = sexo.value_counts(normalize=True) rotulo = sexo.unique()…
-
0 votes, 1 answer, 14334 views
DataFrame - pandas. Creating a new column after a comparison between columns
I have the following df. I need to copy the code from the CODIGO column into the cod_city column in cases where the value of the ds_city column equals the LOCALITY column. I tried it this way, but it…
-
0 votes, 1 answer, 251 views
Additional filter for groupby
The following code groups my DF by some columns: f0219.groupby(['Matricula', 'Nome', 'Rubrica', 'Valor', 'CodigoRendimentoDesconto', 'Tiporubrica']).Rubrica.count() and counts how many times the…
-
0 votes, 1 answer, 126 views
Values are not converted to np.nan
I have the function: def zero_to_null(x): if x == 0: x = np.nan return x. I use Series.apply: Series.apply(zero_to_null). However, the 0 values are not converted to np.nan: Series.value_counts() 0 35 2…
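A minimal sketch of one possible cause, assuming the result of `apply` was never assigned back (the excerpt is truncated, so this is a guess). `apply` returns a new Series and leaves the original untouched. The sample data and the names `series`, `converted` and `replaced` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the Series from the question.
series = pd.Series([0, 2, 0, 5])


def zero_to_null(x):
    # Return NaN for zeros and the original value otherwise.
    return np.nan if x == 0 else x


# apply returns a new Series, so the result has to be kept.
converted = series.apply(zero_to_null)

# Vectorized alternative that avoids a Python-level function call per value.
replaced = series.replace(0, np.nan)
```

After either call, `value_counts()` on the new Series no longer lists 0, because NaN is excluded by default.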
-
0 votes, 1 answer, 292 views
How to get the minimum, average, and maximum time of a time series with pandas, based on a column whose value is boolean?
I’m analyzing a time series returned with pandas, e.g. (index, valor, timestamp): 0, 0, 2019-04-23 16:14:34.142540+00:00; 1, 0, 2019-04-23…
-
0 votes, 2 answers, 2625 views
Inner join on two keys with Python
I have two tables, one for consultations and one for examinations, both with the date of execution and the beneficiary code. I want to make a third table with people who have had exams and…
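A sketch of an inner join on two keys with `pd.merge`, under the assumption that both tables share a beneficiary column and a date column. The table contents and the column names (`beneficiario`, `data`, `consulta`, `exame`) are hypothetical, not taken from the question.

```python
import pandas as pd

# Hypothetical tables; the column names are assumptions for illustration.
consultas = pd.DataFrame({
    "beneficiario": [1, 2, 3],
    "data": ["2020-01-10", "2020-01-11", "2020-01-12"],
    "consulta": ["A", "B", "C"],
})
exames = pd.DataFrame({
    "beneficiario": [1, 2, 4],
    "data": ["2020-01-10", "2020-02-01", "2020-01-12"],
    "exame": ["X", "Y", "Z"],
})

# Keep only rows where BOTH keys match in the two tables.
ambos = pd.merge(consultas, exames, on=["beneficiario", "data"], how="inner")
```

Only beneficiary 1 appears in the result, since it is the only one with the same date in both tables.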
-
0 votes, 2 answers, 3879 views
How to separate the year from a date with Python and Pandas?
I have a student database with a Ticket Date column in the dd/mm/yy format. I need to generate an Ano_ticket column containing only the year of each record's date. import pandas as pd df =…
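One way this could be done, sketched under the assumption that the dates are strings with a two-digit year. The source column name `Data_ticket` and the sample values are hypothetical; only `Ano_ticket` comes from the question.

```python
import pandas as pd

# Hypothetical sample; "Data_ticket" is an assumed column name.
df = pd.DataFrame({"Data_ticket": ["05/03/19", "17/11/20"]})

# Parse day/month/two-digit-year strings into datetimes.
datas = pd.to_datetime(df["Data_ticket"], format="%d/%m/%y")

# The .dt accessor exposes the year of each parsed date.
df["Ano_ticket"] = datas.dt.year
```

Passing an explicit `format` avoids day and month being swapped during parsing.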
-
0 votes, 1 answer, 115 views
I can’t read a column in pandas
I have this Dataframe: Nome CPF ... Senha Cargo 0 Silvio José 10575674451 ... 12345 Administrador 1 Carlos Alberto 10767764330 ... 12345 Administrador 2 Maria Madalena 23323234343 ... 12345…
-
0 votes, 1 answer, 83 views
Update data from two worksheets without overwriting the source file
I read the file's sheets as follows: sheet_Pessoas = pd.read_excel("meus_dados.xlsx", sheet_name=0) sheet_Clientes = pd.read_excel("meus_dados.xlsx", sheet_name=1) Then I add a row in the…
-
0 votes, 1 answer, 61 views
Question about model predict
I think it’s a simple question, but in all the courses I’m taking, the instructor teaches you to separate training and test data from a CSV or some dataset. But I want to test with the user input…
-
0 votes, 1 answer, 34 views
Plotting a MultiIndex in pandas
Hello, I have the following dataframe, covering a period of 3 years. I would like to plot a chart of VALUE grouped by month and year. I did it as follows, but it did not work.…
-
0 votes, 2 answers, 76 views
How do I remove alphabetic characters from a column of a pd.Series? (Python)
My question is quite simple. Given a pd.Series as described below, how do I remove the "YEARS" and "MONTHS" characters from it? I looked at the pandas documentation but unfortunately I could not find…
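A possible approach using `Series.str.replace` with a regular expression. The Series from the question is not shown in the excerpt, so the values below are hypothetical.

```python
import pandas as pd

# Hypothetical values; the real Series is not visible in the excerpt.
idades = pd.Series(["3 YEARS", "10 MONTHS", "1 YEARS 2 MONTHS"])

# Drop the words (and the whitespace before them), then trim the ends.
limpo = idades.str.replace(r"\s*(YEARS|MONTHS)", "", regex=True).str.strip()
```

The result is still a Series of strings; `pd.to_numeric` could be applied afterwards if each entry holds a single number.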
-
0 votes, 1 answer, 2511 views
How to loop over a dataframe, conditionally removing rows and restarting the loop on a recalculated dataframe after each removal?
Hello, I am performing some data processing and I need my algorithm to compare two rows and remove the worse one, according to some conditions. Each time the algorithm…
-
0 votes, 1 answer, 79 views
When loading my xlsx file into pandas, a row became the column index. How do I set a new index for the columns?
I would like to know how to set new index for columns. data = pd.read_excel('numero_automoveis_vendidos.xlsx') data.columns Index([7, 5, 9, 11, 10, 8, '9.1', 6, '8.1', '10.1'], dtype='object')…
-
0 votes, 2 answers, 63 views
Edit date value within a dataframe
I have this dataframe and the date is in the following format: '2020-08-01'. I would like to write a loop that iterates over each row and converts it to the following format: '01/08'. I’m using…
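A sketch of a loop-free alternative, assuming the dates are stored as 'YYYY-MM-DD' strings. The column names `data` and `data_curta` and the second sample value are hypothetical.

```python
import pandas as pd

# Hypothetical frame; "data" is an assumed column name.
df = pd.DataFrame({"data": ["2020-08-01", "2020-12-25"]})

# Parse the strings once, then format the whole column as day/month.
df["data_curta"] = (
    pd.to_datetime(df["data"], format="%Y-%m-%d").dt.strftime("%d/%m")
)
```

The vectorized `.dt.strftime` call handles every row at once, so no explicit loop is needed.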
-
0 votes, 0 answers, 52 views
Problem when writing data
I’m performing a query on a table whose structure is: Documento Produto 123 Camisa Posh rosa 127 Calça Handara 36 127 Meia Barby 158 Calça Handara 38 129 Blusa Yoll M 129 Blusa Yoll azul P 129…
-
0 votes, 2 answers, 106 views
How to check if a certain column value is equal and not save this value to the file?
Good afternoon! I’ve been stuck for days because I can’t seem to make sense of the problem I’m having. I’m running a query to store data in a str in Python. This query goes through the query…
-
0 votes, 0 answers, 383 views
DataFrame: Empty 'DataFrame': no numeric data to plot
I’m trying to get the highest values in the Valor Pago column, in Jupyter: top_10_empenho = portal[['Descrição', 'Valor Pago']].head(10).set_index('Descrição').sort_values('Valor Pago',…
-
0 votes, 1 answer, 531 views
Python graph does not show all values on the x-axis
I used the pandas library to read a csv file and create a graph using matplotlib: import pandas as pd import matplotlib.pyplot as plt brazil_dataset = pd.read_csv('/content/states.csv') fig, ax =…
-
0 votes, 1 answer, 492 views
AttributeError: 'NoneType' object has no attribute 'insert' - Trying to insert a column
Good morning! I’m new to Python and I’m trying to build a script that deletes some rows and adds a column to a dataframe, but for some reason I can’t understand, it returns this error.…
-
0 votes, 1 answer, 112 views
How do I delete rows by specific cell content in pandas
I am preprocessing my data using the Python pandas library. This is a project to train an algorithm to predict "roles". This is the result I get when I run: print(vagas.role_name.value_counts())…
-
0 votes, 3 answers, 1079 views
Transform list items into separate columns or extend dataframe to the end
I have a class with an attribute that is a list. I’m trying to display this list in a pandas dataframe in a single row, to represent the character’s inventory. Assignment of items to the list: if…
-
0 votes, 2 answers, 166 views
AND/OR conditions in pandas
I have a question about setting up "and" and "or" filters in pandas. Example: a column col1 with the values (yes, no, maybe). Getting the yes rows works: df[df['col1']=='sim']. Now, to get: yes or…
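A short sketch of combining boolean masks, assuming the column holds plain strings. The values other than 'sim', the `col2` column and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical data; only col1 and the value 'sim' come from the question.
df = pd.DataFrame({
    "col1": ["sim", "nao", "talvez", "sim"],
    "col2": [1, 2, 3, 4],
})

# OR: use | and wrap each comparison in parentheses.
sim_ou_talvez = df[(df["col1"] == "sim") | (df["col1"] == "talvez")]

# isin is a compact equivalent for several values of the same column.
mesmo_com_isin = df[df["col1"].isin(["sim", "talvez"])]

# AND: use & with the same parenthesization.
sim_e_maior = df[(df["col1"] == "sim") & (df["col2"] > 1)]
```

The parentheses matter because `|` and `&` bind more tightly than `==` in Python; the keywords `or` and `and` do not work element-wise on a Series.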
Tagged pandas; asked 4 years, 2 months ago by user201087
-
0 votes, 1 answer, 322 views
Extract data from all rows of a file and create a dataframe
I have a .txt file with 2000 lines (a WhatsApp chat) from which I need to extract the date, time, and sender of each message into a pandas dataframe. I can do this with the function below: def…
-
0 votes, 2 answers, 434 views
Get the maximum value of each row in a grouped pandas dataframe
I have a pandas dataframe with UF, Municipio, Classe_acidente, Total. In this dataframe each municipality appears three times, one for each accident class (there are 3 classes) and I need to obtain…
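The excerpt is cut off before stating what must be obtained, so the following is only a sketch of one reading: keep, for each municipality, the row whose Total is largest. The sample values are hypothetical; the column names come from the question.

```python
import pandas as pd

# Hypothetical values; each municipality appears once per accident class.
df = pd.DataFrame({
    "UF": ["SP"] * 3 + ["RJ"] * 3,
    "Municipio": ["Campinas"] * 3 + ["Niteroi"] * 3,
    "Classe_acidente": ["A", "B", "C"] * 2,
    "Total": [10, 30, 20, 7, 5, 9],
})

# idxmax gives the index label of the largest Total inside each group.
idx = df.groupby(["UF", "Municipio"])["Total"].idxmax()

# Selecting those labels keeps the full row, including the accident class.
maiores = df.loc[idx].reset_index(drop=True)
```

If only the number is needed, `df.groupby(["UF", "Municipio"])["Total"].max()` would be enough.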
-
0 votes, 1 answer, 83 views
How to insert into PostgreSQL based on data from a DataFrame
I am trying to insert the data I have in a DataFrame (retrieved via an API GET request) into a table I created in PostgreSQL. I searched a lot but could not find an answer. Can someone help me?…
-
0 votes, 0 answers, 190 views
Insert data into PostgreSQL using a pandas DataFrame
Dear all, I am working on a personal project in which I want to do an ETL with Python and PostgreSQL. I intended to consume a GET API using request, persist this data in a DataFrame and after…
-
0 votes, 1 answer, 1645 views
Cumulative sum per row
Good afternoon, colleagues. I would like some help. In the code below I generate a new column (accumulated) using cumsum. The result is a cumulative sum for each row. However, I need to bring the accumulated…
-
0 votes, 2 answers, 341 views
Problem with pandas in PyCharm
I am using pandas to import an Excel file; however, the following error appears: Cannot find reference 'read_excel' in 'pandas.py' import pandas as pd dados =…
-
0 votes, 1 answer, 88 views
How to handle columns with the same name in a CSV file
I have the following problem: I received a database as a CSV file and I need to analyze this data, but it has a lot of columns with the same name, and I wanted to know if there is any way to handle…
-
0 votes, 2 answers, 43 views
Question about creating a pandas DataFrame
I have a question about creating a dataframe. For example: dicionario = {'pais': 'brasil', 'capital': 'Brasilia', 'climate': 'tropical'} pd.DataFrame(dicionario) returns an index error, but if I put some value in…
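A sketch of why a dict of plain scalars fails and two common ways around it. The dictionary is the one from the question; the variable names `erro`, `df_lista` and `df_indice` are hypothetical.

```python
import pandas as pd

dicionario = {"pais": "brasil", "capital": "Brasilia", "climate": "tropical"}

# With only scalar values pandas cannot infer how many rows to build.
try:
    pd.DataFrame(dicionario)
    erro = None
except ValueError as exc:
    erro = str(exc)

# Option 1: wrap the dict in a list, so it becomes a single record (row).
df_lista = pd.DataFrame([dicionario])

# Option 2: keep the dict and state the row index explicitly.
df_indice = pd.DataFrame(dicionario, index=[0])
```

Both options produce one row with three columns.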
-
0 votes, 2 answers, 78 views
Question about replace in pandas
Question about replace in a pandas dataframe: texto = 'Vírus de computadores são uma lenda urbana.' dado = {'texto': [texto]} df = pd.DataFrame(dado) df['nova_coluna'] = df.texto.str.replace('urbana', '') By…
-
0 votes, 1 answer, 69 views
Return value based on date criteria within start and end date range and ID in another data frame? (pandas and python)
I have two dataframes. df1 contains the columns Data and Praca_ID, and I need to look them up in df2, which has DataInicial, DataFinal, Praca_ID, and Tarifa (the column to be returned). The example…
-
0 votes, 0 answers, 275 views
Split dataframe according to number of rows
I wonder if it is possible to split a dataframe of roughly 300,000 rows into parts with a preset number of rows and save each part as xlsx, with the headers. I thought of using loc, but I would have to use…
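A minimal sketch of fixed-size chunking with `iloc`, using a tiny frame in place of the roughly 300,000 rows. The names `tamanho` and `partes`, the sample data, and the file names in the comment are hypothetical.

```python
import pandas as pd

# Small stand-in for the large frame described in the question.
df = pd.DataFrame({"a": range(10), "b": range(10, 20)})

tamanho = 4  # assumed number of rows per part

# Slice by position in steps of `tamanho`; the last part may be shorter.
partes = [
    df.iloc[inicio:inicio + tamanho]
    for inicio in range(0, len(df), tamanho)
]

# Each part keeps the column headers. Saving would look like this
# (requires an Excel engine such as openpyxl; file names are made up):
# for numero, parte in enumerate(partes, start=1):
#     parte.to_excel(f"parte_{numero}.xlsx", index=False)
```

Slicing with `iloc` works regardless of the index labels, which is why it may be simpler here than `loc`.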
-
0 votes, 1 answer, 219 views
How to extract only the time from a Timestamp
Using the command below: pd.to_datetime(1490195805, unit='s') The return is: Timestamp('2017-03-22 15:16:45') How do I extract only the time: 15:16:45?…
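Two ways this could be done, sketched with the same call as in the question: `Timestamp.time()` returns a `datetime.time` object, while `strftime` returns a string. The variable names are hypothetical.

```python
import pandas as pd

ts = pd.to_datetime(1490195805, unit="s")

# As a datetime.time object, useful for comparisons.
hora = ts.time()

# As a formatted string, useful for display.
texto = ts.strftime("%H:%M:%S")
```

For a whole column of datetimes, the equivalent accessors are `.dt.time` and `.dt.strftime("%H:%M:%S")`.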
-
0 votes, 1 answer, 351 views
Compression and reading of a CSV file with large-scale rows x columns via pandas
I am stuck looking for a precise and intuitive way to read a 70,000 KB file formed by concatenating several files of varying sizes. Initially, having several files in the…
-
0 votes, 1 answer, 222 views
Pandas package not recognized in Jupyter notebook
Hello, can anyone help me understand why Jupyter Notebook does not recognize the pandas package? I created a virtual environment and installed the packages to analyze an Enem dataset, but when…
-
0 votes, 1 answer, 243 views
Error importing pandas into Jupyter Notebook
I can’t import pandas in Jupyter Notebook. The error shows this message: RuntimeError: The current Numpy installation ('c:\users\Augus\analise_data\lib\site-packages\numpy\__init__.py') fails to pass a…
-
0 votes, 1 answer, 107 views
Processing complexity in a pandas DataFrame
I need to deal with a dataset-merging problem in Python. I have three levels of folders that I need to traverse to find the file and merge it into a dataframe. The levels are: year, month and…
-
0 votes, 1 answer, 635 views
How to get the nth working day of the month?
I’m creating a data model for Machine Learning to get a transaction amount forecast per year/month/day/hour. But I need to map the 5th and 10th working days of the month so that my algorithm…
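A sketch of one way to find the nth working day, assuming "working day" means Monday to Friday with no holiday calendar (holidays would need a custom business-day offset). The helper name `nth_business_day` and the example month are hypothetical.

```python
import pandas as pd


def nth_business_day(year, month, n):
    """Return the n-th Monday-to-Friday day of a month (holidays ignored)."""
    inicio = pd.Timestamp(year=year, month=month, day=1)
    # MonthEnd(0) rolls forward to the last day of the same month.
    fim = inicio + pd.offsets.MonthEnd(0)
    # bdate_range lists only business days between the two dates.
    dias_uteis = pd.bdate_range(inicio, fim)
    return dias_uteis[n - 1]


quinto = nth_business_day(2021, 3, 5)
decimo = nth_business_day(2021, 3, 10)
```

March 2021 starts on a Monday, so the 5th and 10th working days fall on the 5th and the 12th. A boolean feature could then be built by comparing each transaction date with these values.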
-
0 votes, 1 answer, 51 views
How to name multiple dataframes automatically in Python?
I need help automating a dataframe naming process that I’m applying. I imported 3 csv files and applied the code below, which worked smoothly: Loja_1, Loja_2, Loja_3 = (pd.read_csv(cont) for cont in…
-
0 votes, 1 answer, 1058 views
Rename numeric columns in pandas
I imported an Excel file into a dataframe, f0719. However, pandas added numbers, which did not exist in the original file, as identification of the columns, instead of the name, which was just…
Tagged pandas; asked 5 years, 3 months ago by roger (124)
-
0 votes, 1 answer, 420 views
How to use pandas.read_sql_query within a class
I created a class to simplify using the psycopg2 library, which handles the connection to the PostgreSQL database. import psycopg2 as pg class Postsql: def __init__(self, Phst, Pusr, Ppwd,…
-
0 votes, 1 answer, 132 views
How to handle "NaN" values returned from a DataFrame
I wrote the code below to select 1 column of each CSV file, but it returns all values as NaN. How do I fix it so it returns the right values? import pandas as pd df1 = pd.read_csv("CSSS.csv",…
-
0 votes, 1 answer, 815 views
Save an array to CSV in Python
I wrote the code below to store some data in a dataframe and then converted it into an array, but when I save it to a CSV file, the saved file skips lines. import pandas as pd import…
-
0 votes, 1 answer, 81 views
Generate a Python array of multiple variables
Hello! I am trying to write a function that reads imported data from a "csv" file and converts it to an array. The data in one column is integer, and the data in the other column is an interleaving of…
-
0 votes, 0 answers, 150 views
Performing operations inside a DataFrame
I’m trying to do the following with this code: import pandas as pd from datetime import date # Creating a pandas series that represents the row index as time values # Por…