Most voted "pandas" questions
pandas is an open-source library that provides high-performance data structures and data analysis tools for the Python programming language.
646 questions
-
2 votes, 1 answer, 2323 views
Building a table from a CSV, grouped by week, with Python and pandas
I am using pandas and opening the following table with the code tst = pd.read_csv('Iteracao.csv',delimiter=","). I’m trying to group it as follows, where week 1 is the week of the date…
-
2 votes, 1 answer, 428 views
Python pandas very slow
Can anyone help me? I am reading a file, making some changes, and then saving it in another folder, but that takes 2 hours. The file has 15 million lines. Would there be some different and more effective…
-
2 votes, 2 answers, 1120 views
How to keep leading zeros on import in Python?
Hello, I have several TXT files with CPF numbers. A CPF has 11 digits, so it can have leading zeros to fill out that length. I opened these files in Excel and merged them into one. In the CPF column,…
-
2 votes, 1 answer, 1399 views
Creating a bar graph to compare data
I’m taking the Data Science course from Udacity. It’s my first contact with programming, so don’t judge the silly mistakes haha. I’m comparing two data frames with information from cars of the year…
-
2 votes, 1 answer, 397 views
Size of tuple lists in a df
I have the following df n_words Words . 220 [('trabalho', 17), ('monitor', 17), ('via', 16... 3114 [('atend', 863), ('ortopedico', 863), ('proced... 5 [('anomalos', 2), ('feixes', 1),…
-
2 votes, 1 answer, 129 views
Return in a DataFrame - Python
Good afternoon. I have a question about Python. I have an if with a condition and an else. The else processes more than one file, and I need to save all the information it reads inside a…
-
2 votes, 1 answer, 95 views
Problems with incorrect data in a dataset (using pandas)
I have a dataset called Auto.csv, which has the form: mpg cylinders displacement horsepower weight acceleration year origin name 18 8 307 130 3504 12 70 1 chevrolet chevelle malibu 15 8 350 165 3693…
-
2 votes, 1 answer, 806 views
pandas: DataFrame information comparison
I have 2 Dataframes imported from CSV CSV1 4616; CCIVIL_03/decreto/2003/D4616.htm 4617; CCIVIL_03/decreto/2003/D4617.htm 4618; CCIVIL_03/decreto/2003/D4618.htm 4619; CCIVIL_03/decreto/2003/D4619.htm…
-
2 votes, 1 answer, 176 views
String handling in pandas
Good evening, gentlemen. I have the following DataFrame, imported from an xlsx. NUM_LEGISLACAO DSC_URL COD_SITUACAO ... DSC_TIPO num ano 264 89.272/1984 NaN 2.0 ... NORMATIVO 89.272 1984 265…
-
2 votes, 1 answer, 7298 views
How to save in CSV or Excel a table generated from another table with pandas or pivot table?
I have a table in CSV format with data for the years 2000 to 2015. In my code I ask the user to enter the year they want to see and display only the years they requested. E.g.: 2000.…
-
2 votes, 1 answer, 899 views
Convert Days & Time (Hours x Minutes x Seconds) to Time only
I have a DataFrame in which I subtract two dates to get the difference in hours and minutes, for example: data_inicial = '2018-07-03 16:03:00' data_final =…
-
2 votes, 1 answer, 7854 views
How to insert a row into a pandas DataFrame in the middle of other rows?
I have an output of sensor data that has the following desired structure: --- Beginning --- $LAGM,Colar03,Yellow,32262,-31226,-5120,-104,40,190,1662.00,1670.00,236.00,MGAL…
-
2 votes, 1 answer, 3034 views
How to customize the x-axis of a graph with two y-axes to show text?
I would like to know how to change the x-axis of a graph with two y-axes, because I want to label the x-axis with the names of Brazilian states. numpy_matrix = df.as_matrix() x = numpy_matrix[0:,0] y1 =…
-
2 votes, 1 answer, 3253 views
(pandas) - Group and summarise by date
Hello, I’m a beginner in pandas and I ran into a problem that I couldn’t find or understand how to solve in the documentation or other topics. Briefly, I need to group the days of the observations from…
-
2 votes, 1 answer, 487 views
Concatenating equal data from a column into rows in Python
What I need is a Python function that builds my file this way. I already searched pandas and did not find one.…
-
2 votes, 1 answer, 540 views
Splitting a dataframe by some criterion in Python pandas
I have a database with 789 reviews of a particular product; it has the columns reviews and stars. I normalized the data to positive (star >= 3) 1 and negative 0. outputs =…
-
2 votes, 1 answer, 842 views
pandas SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
I want to copy an element of a dataframe and insert it into another dataframe. In essence, there is a dataframe with name x area and another that I need to fill with the area data, from the…
-
2 votes, 1 answer, 1375 views
Scalar product
Objective: build a neuron, loading the weights and inputs from an xlsx file. I have calculated the scalar product in several ways as an exercise, but when I try to use dot it raises an error. The…
-
2 votes, 2 answers, 3831 views
pandas DataFrame - Calculate a column based on another
I have a dataframe in the following format: colunas = [ 'COMEDY', 'CRIME', 'Classe Prevista' ] precisao_df = pd.DataFrame(columns=colunas) precisao_df['COMEDY'] = y_pred_proba[:,0]…
-
2 votes, 1 answer, 348 views
pandas: Subtract dates within grouped indexes in a dataframe
I am conducting research using pandas and need to infer the time between two buses based on their start times (start_time). For this I have grouped in my dataframe a field for itinerary and…
-
2 votes, 1 answer, 338 views
Sign change of values in a pandas DataFrame
Hello, good evening. I have a set of vectors whose components (px, py and Pz) are stored in a pandas DataFrame. I wrote a function whose goal is to change the sign of the components of the vectors…
-
2 votes, 1 answer, 166 views
How to use pandas styling properly?
Python code: the following code was created from the examples in the documentation: import pandas as pd import os import webbrowser import io def highlight_max(s): ''' highlight the maximum in a…
-
2 votes, 2 answers, 199 views
How to shorten a .replace expression in Python
I have a dataframe in which I would like to replace the 0/1 encoding with yes and no. Some columns of the df have this encoding, so I wrote the following command: dados_trabalho =…
-
2 votes, 2 answers, 1020 views
How to balance classes in a machine learning regression problem with Python?
Problem using the house prices dataset from the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow" https://github.com/ageron/handson-ml. Objective: to create a model of house…
-
2 votes, 2 answers, 1707 views
Brazilian calendars using pandas in Python
How do I import Brazil’s holiday calendar using the pandas library in Python? For example, if I use the following code: from pandas.tseries.holiday import * feriados=…
-
2 votes, 1 answer, 149 views
Case-insensitive regular expression in a dataframe
I have a dataframe with some tweets that were collected by keyword. How can I, for example, extract in one go only the rows with #flamengo and all its variations, such as #Flamengo,…
-
2 votes, 1 answer, 517 views
Extract phone number with API in Python pandas
I have an API that extracts the phone number. It works as follows: when I pass it a number, it returns three string variables containing the phone number with country code, the type, whether it is mobile or landline…
-
2 votes, 1 answer, 154 views
Validate date as holiday or not
Hello, good afternoon! I am creating a DataFrame in which I need to validate whether a day is a holiday or not, so I created a time series by hand and made two loops to validate whether the…
-
2 votes, 1 answer, 1184 views
Manipulating a 3 GB dataset with pandas using chunks
Hello! I’m trying to work with a *.csv file using pandas in Python 3 on a Google Cloud VM. This VM has 16 GB of memory and even so raised a MemoryError. To solve this problem I used the attribute…
-
2 votes, 1 answer, 1066 views
pandas ExcelWriter
I have a problem trying to write an Excel file with pandas. When I try to write it, the following message appears: ZIP does not support timestamps before 1980. Thinking that it could be some problem…
-
2 votes, 1 answer, 467 views
How to go through all the data in a DF, performing a calculation and returning the value for it?
I have a DF after some calculations, but I am not able to pass the data into a calculation; every time I get this error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(),…
-
2 votes, 2 answers, 35 views
How to apply a filter to a dataframe based on the last characters of each label?
I need to apply a row filter to a dataframe based on the last characters of a label, keeping it if it contains BRL after the hyphen. E.g.: BTC-BRL. Can someone help me? Current data: Data after applying…
-
2 votes, 1 answer, 433 views
Python pandas: DataFrame convert Timestamp column to Datetime
How can I convert a Dataframe column from Timestamp to Datetime? btc_df = pd.DataFrame ( bars_day , columns=['date' , 'volume' , 'open' , 'high' , 'low' , 'close'] ) Summary date int64 volume object…
-
2 votes, 0 answers, 281 views
Need to change a dataframe column according to the value found in another column in Python. How do I do it?
I need the value of the column PERTMINIST to be changed based on a set of words found in the column SIGLA. For example, the column NOME_ORGAO can contain the words UFJF, UFTM, and the same row should be updated in…
-
2 votes, 2 answers, 263 views
In Python, how to search for a string based on the information in a column?
I’m working with Python in Jupyter Notebook. I have a dataframe with the fields name and text. In text, which is a loaded txt file, I want to check whether there is a string that is exactly the value in…
-
2 votes, 1 answer, 23 views
How can I add a row to a pandas DataFrame from a list that contains variables?
So guys, I’m having a problem that, in simplified form, works like this: # I created a DataFrame with pandas import pandas as pd Colunas = ["A","B","C","D"] df = pd.DataFrame(columns…
-
2 votes, 3 answers, 2524 views
Split a date (day, month, year) into new columns - pandas DataFrame
I have a DataFrame and need to split its date field so I can later add month and day columns. The problem is that the DataFrame’s date field is not of type str, so I can’t use the split method.…
-
2 votes, 1 answer, 185 views
Search within a CSV and return the row’s other columns - Python
I have a Python application that generates a CSV file with 7000 rows and 4 columns, for example: Mesa,Entrada,Saida,Conta "P",21:00,22:00,95.00 "A",14:00,18:00,195.00 "C",18:00,21:00,75.00…
-
2 votes, 2 answers, 343 views
Getting the maximum value of each group with pandas groupby
Hello, I have the DF below, which I would like to group by 'country' to get the maximum population value: df = pd.DataFrame({'pais': ['Brasil', 'Brasil' , 'EUA', 'EUA'], 'cidade': ['Santos', 'São…
-
2 votes, 1 answer, 938 views
Take values from a column of a dataframe and create a column in another with the corresponding values
I have two dataframes, df1 and df2. Both have a team column, but only df2 has the numeral column, which changes every day. I want the values of the numeral column to become a column in df1, but always in…
-
2 votes, 1 answer, 141 views
Count data based on grouping 2 or more columns in a pandas.DataFrame
In this DataFrame, I want to find the 3 best user_id values per Nome prova, that is, those with the most 1 values in the correct column (this column consists of the values 0 and 1): Nome…
-
2 votes, 3 answers, 723 views
Python: conditional sum with variable condition
Good morning. The task is as follows: I am trying to build an 'a,b,c' curve of products per company. I have the following df as an example: df = pd.DataFrame({"empresa":["HST", "HST", "HST", "HSC",…
-
2 votes, 2 answers, 41 views
How to get the X coordinate if the Y condition is met with NumPy?
I created a function that outputs this array, in which each row corresponds to a point: array([[0.57946528, 2. ], [0.35226154, 0. ], [0.26088698, 0. ], [0.56560726, 1. ], [0.41680759, 1.…
-
2 votes, 2 answers, 258 views
How to split the value of a column element by a delimiter (e.g. "|") in pandas?
Let’s say we have the following column: coluna1 ola | 52 hey sou ja da |5 24g The expected output would be: coluna1 ola 52 hey sou ja da 5 24g So far, I’m trying string manipulation with split and…
-
2 votes, 1 answer, 467 views
Turn row into column
Good evening, I’m having trouble combining more than 500 CSV files. I need a very important piece of data that is in a row (A8), but it needs to become a column (as shown in the second image),…
-
2 votes, 1 answer, 145 views
Define an average function with pandas
I have a CSV file that records the temperature for every day of the year and I need to calculate the monthly average. I created a loop and it did work, but I wanted to know if there is a way to…
-
2 votes, 1 answer, 143 views
DataFrame with information from the Central Bank of Brazil
I created a function to collect data from the Central Bank, separated by the variables I want to work with. Now I want to create a data frame containing all this information, but I can’t manage it,…
-
2 votes, 1 answer, 48 views
Incorrectly created chart
Hello, I am practicing creating charts in Python with matplotlib. For this, I am importing an HTML file with WEGE3 stock history. df_history =…
-
2 votes, 1 answer, 107 views
How to load .txt files from a directory, including each file’s name in the list or dataframe, in Python?
I’m working with Python in Jupyter Notebook. I have a directory of txt files. I can iterate over the directory and load these files; however, I also need to take the name of this file as…
-
2 votes, 2 answers, 116 views
Optimization through dataframe (pandas)
I need to compare two .csv files for inconsistencies. The boleto.txt file contains information about the boletos issued by a company. This file has 500,000 lines. The .txt file contains information…