Posts by Terry • 889 points

42 posts

1
votes

1
answer

48
views

A: How to Capture Data with Pandas?

I believe that what you seek can be achieved with reindex and repeat: df.reindex(df.index.repeat(df.FREQUENCIA)) Unfortunately I have no way to demonstrate the output of the data because you…

python database pandas powerbi
answered 4 years, 2 months ago Terry 889
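A minimal sketch of the reindex/repeat idea from the excerpt above. The sample data and the VALOR column are made up for illustration; only the FREQUENCIA column name comes from the answer.

```python
import pandas as pd

# Hypothetical sample data: each row carries its repetition count in FREQUENCIA.
df = pd.DataFrame({"VALOR": ["a", "b", "c"], "FREQUENCIA": [2, 1, 3]})

# Repeat every index label FREQUENCIA times, then pull the rows for those labels.
expanded = df.reindex(df.index.repeat(df["FREQUENCIA"]))

# Optional step (not in the answer): renumber the rows after the expansion.
expanded = expanded.reset_index(drop=True)
print(expanded)
```

This relies on the original index being unique, since reindex cannot look up rows in an index with duplicate labels.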
0
votes

2
answers

41
views

A: Compare rows of a pandas DataFrame

I am not sure I understood the expected output correctly, but I believe that to calculate the difference between the smallest date for each number and the other dates, you will need to use the…

pandas
answered 4 years, 2 months ago Terry 889
3
votes

1
answer

40
views

A: How to turn every two records of a DataFrame into a single one in Python based on two columns

One way would be to split the DF in two, home teams and visiting teams, rename the columns of each DF as you wish, and join them again with merge, in this way: mask = df['mandante'] ==…

python pandas
answered 4 years, 2 months ago Terry 889
0
votes

1
answer

70
views

A: How to join three or more CSV files with something like VLOOKUP (PROCV), concatenating certain columns

My proposed solution uses the functions concat, groupby, map and drop_duplicates. First combine all your files with concat, then use groupby on Code to concatenate the strings of the columns Year and…

python pandas
answered 4 years, 3 months ago Terry 889
2
votes

1
answer

97
views

A: How to compare two string values with pandas?

Try using the function isin, in this way: mask = cursos['codigo_unidade_ensino'].isin(unidade_ensino['cod_unidade_ensino']) cursos = cursos[mask].copy()…

python excel pandas
answered 4 years, 3 months ago Terry 889
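A short sketch of the isin filter quoted in the excerpt above. The two column names come from the answer; the sample rows and the curso column are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample data.
cursos = pd.DataFrame({
    "codigo_unidade_ensino": ["A1", "B2", "C3"],
    "curso": ["Math", "Physics", "History"],
})
unidade_ensino = pd.DataFrame({"cod_unidade_ensino": ["A1", "C3"]})

# True where the course's unit code also appears in unidade_ensino.
mask = cursos["codigo_unidade_ensino"].isin(unidade_ensino["cod_unidade_ensino"])

# Keep only the matching rows; .copy() avoids chained-assignment warnings later.
cursos = cursos[mask].copy()
print(cursos)
```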
3
votes

2
answers

88
views

A: Replacing NaN values with the next non-NaN value of another column

It is possible to create a temporary Series with the values of the column groundwork only where values is not null, using the commands .mask, .isna and .bfill. With this Series in a variable it is possible to pass it…

python pandas
answered 4 years, 5 months ago Terry 889
1
votes

2
answers

82
views

A: Group closest values in PostgreSQL

I believe that what you seek is possible using the function merge_asof() in pandas. It lets you join DataFrames on approximate matches in the data (in your case, approximate dates).…

python sql postgresql pandas
answered 4 years, 8 months ago Terry 889
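A minimal sketch of an approximate join on dates with pd.merge_asof, the function named in the excerpt above. The frames, column names and the 10-minute tolerance are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: events and readings recorded at slightly different times.
events = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01 10:00", "2021-01-01 12:05"]),
    "event": ["e1", "e2"],
})
readings = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01 09:58", "2021-01-01 12:00", "2021-01-01 15:00"]),
    "reading": [1.0, 2.0, 3.0],
})

# merge_asof requires both frames to be sorted by the join key.
merged = pd.merge_asof(
    events.sort_values("date"),
    readings.sort_values("date"),
    on="date",
    direction="nearest",              # match the closest date in either direction
    tolerance=pd.Timedelta("10min"),  # ignore matches farther than 10 minutes away
)
print(merged)
```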
2
votes

2
answers

312
views

A: How to fill a column of a pandas DF using, as a comparison, a specific column shared between it and another DF?

My suggestion is to do a LEFT JOIN using the function merge(). For it to work properly, it is not necessary to create the column codproject filled with zeros in the DataFrame 'DF' # here I will drop the…

python pandas
answered 4 years, 8 months ago Terry 889
2
votes

2
answers

150
views

A: Cross-reference two different DataFrames with different numbers of rows

You can solve this using the function .map(), passing a Series to it, this way: dataset_original = pd.DataFrame({'Grau de instrucao': ['1','2','3','4','5','6','7','8','9','10','11','-1']}) s =…

python pandas
answered 4 years, 8 months ago Terry 889
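A small sketch of the .map() approach from the excerpt above. The lookup Series s is truncated in the excerpt, so the one below, its labels and the descricao column are hypothetical stand-ins.

```python
import pandas as pd

dataset_original = pd.DataFrame({"Grau de instrucao": ["1", "2", "3", "-1"]})

# Hypothetical lookup table: index holds the code, values hold the label.
s = pd.Series({"1": "Level A", "2": "Level B", "3": "Level C"})

# map() looks each code up in the Series index; unknown codes become NaN.
dataset_original["descricao"] = dataset_original["Grau de instrucao"].map(s)
print(dataset_original)
```

Because the lookup goes through the index of s, the two DataFrames do not need the same number of rows.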
0
votes

2
answers

509
views

A: Filter data from a Dataframe pandas by a specific column and the last four dates of a set of dates

First it will be necessary to use shift() and fillna to fill in the audience of programs that have only one occurrence. To calculate the average for these programs, use the function rolling…

python pandas numpy
answered 4 years, 9 months ago Terry 889
2
votes

2
answers

50
views

A: Deleting rows with repeated labels in a DataFrame

Try using .drop_duplicates(), in this way: df = df.drop_duplicates(subset=['B'])…

python pandas
answered 4 years, 9 months ago Terry 889
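A quick sketch of drop_duplicates with subset, as in the excerpt above. The sample data is invented; only the column name B comes from the answer.

```python
import pandas as pd

# Hypothetical data: the label "x" appears twice in column B.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["x", "y", "x", "z"]})

# Keep the first row for each distinct value of B (keep="first" is the default).
df = df.drop_duplicates(subset=["B"])
print(df)
```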
0
votes

1
answer

381
views

A: Frequency Table with two variables

To do this, use groupby on the columns Region of Origin and Instruction Degree, then use size to get the size of each of these groups. After this, it is possible to remove…

pandas
answered 4 years, 11 months ago Terry 889
0
votes

1
answer

83
views

A: Complete values in a table, with values of the table itself?

It is possible to make this substitution by grouping by active with the function groupby together with the ffill, in this way: df['data'] = pd.to_datetime(df['data'], dayfirst=True)…

python sql postgresql pandas
answered 4 years, 11 months ago Terry 889
0
votes

2
answers

280
views

A: Replace use of dataframe with pandas .apply

One way to do this would be to use the pandas function .isin(). It returns a boolean Series marking which values are found in the array cargos_to_display_photo df['display_foto'] =…

python python-3.x django pandas
answered 4 years, 11 months ago Terry 889
0
votes

1
answer

282
views

A: How to make a cumulative sum in Bigquery?

I think you can split your problem into 2 parts: first use group by to find the ids in each month, and then use over to compute the cumulative sum. Here is an example: with t1 as ( select…

sql query count summing-up bigquery
answered 5 years ago Terry 889
0
votes

2
answers

76
views

A: How do I remove alphabetic characters from a column of a pd.Series? (Python)

You can do this by using the function .split() on whitespace and selecting the first element of the resulting array; after that, use the function value_counts(). data = {'Idade': ['80 ANOS', '80 ANOS', '80 ANOS',…

python python-3.x pandas
answered 5 years ago Terry 889
0
votes

1
answer

531
views

A: How to delete rows from a DataFrame based on a Python list?

You can do this using .isin(), in this way: df_saida = df.loc[~df['cod'].isin(lista)] #output: cod letra 0 101 a 2 303 c 3 404 d The .isin() here returns a Boolean list showing which values are found in the…

condition
answered 5 years, 1 month ago Terry 889
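A sketch of the negated isin filter from the excerpt above. The input frame and the contents of lista are not shown in the excerpt, so the values below are assumptions chosen to be consistent with the output it prints.

```python
import pandas as pd

# Assumed input: four rows, one of which should be removed.
df = pd.DataFrame({"cod": [101, 202, 303, 404], "letra": ["a", "b", "c", "d"]})
lista = [202]  # hypothetical list of codes to delete

# isin() marks rows whose cod is in the list; ~ inverts the mask to keep the rest.
df_saida = df.loc[~df["cod"].isin(lista)]
print(df_saida)
```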
0
votes

1
answer

136
views

A: Remove the least frequent rows from a pandas DataFrame

Combine value_counts with head(3).index to create a mask with the most frequent elements in the DataFrame. Then select them with isin. mask = df['variedade'].value_counts().head(3).index…

python pandas
answered 5 years, 3 months ago Terry 889
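A minimal sketch of keeping only the most frequent values, following the value_counts/head/isin idea in the excerpt above. The sample data is invented, and the final filtering line is an assumption about how the truncated answer continues.

```python
import pandas as pd

# Hypothetical data: "a" x4, "b" x3, "c" x2, "d" x1.
df = pd.DataFrame({"variedade": ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"]})

# value_counts() sorts by frequency, so head(3).index holds the 3 most common values.
mask = df["variedade"].value_counts().head(3).index

# Keep only the rows whose value is among those 3.
df_top = df[df["variedade"].isin(mask)]
print(df_top["variedade"].unique())
```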
0
votes

2
answers

301
views

A: python np.where with two conditions

You could also have used the command .any(axis=1), which checks, for each DataFrame row, whether any of the values is True, this way: x['D'] = (x[['A','B']] > 0).any(axis=1) A B D 0 1 5 True 1 2 0 True 2 3 0…

python pandas numpy
answered 5 years, 5 months ago Terry 889
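A small sketch of the row-wise any() check described in the excerpt above. The sample values are made up. The axis is passed by keyword here because, as far as I know, newer pandas releases no longer accept it positionally.

```python
import pandas as pd

# Hypothetical data; the last row has no positive value in A or B.
x = pd.DataFrame({"A": [1, 2, 3, 0], "B": [5, 0, 0, -1]})

# Compare both columns against 0, then ask per row whether any comparison is True.
x["D"] = (x[["A", "B"]] > 0).any(axis=1)
print(x)
```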
2
votes

2
answers

343
views

A: Getting maximum value of each grouping with groupby pandas

You can do this using groupby with idxmax. The idea is to select the indices of the rows holding the largest population in each country. df.loc[df.groupby('pais')['populacao'].idxmax()] #output pais cidade…

python python-3.x pandas
answered 5 years, 6 months ago Terry 889
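A sketch of the groupby plus idxmax pattern from the excerpt above. The column names come from the answer; the rows and population figures are invented placeholders. The lookup uses .loc because idxmax returns index labels, which only coincide with positions on a default integer index.

```python
import pandas as pd

# Hypothetical data with placeholder city names and made-up populations.
df = pd.DataFrame({
    "pais": ["X", "X", "Y", "Y"],
    "cidade": ["X1", "X2", "Y1", "Y2"],
    "populacao": [100, 300, 50, 20],
})

# idxmax() gives, per country, the index label of the row with the largest population.
idx = df.groupby("pais")["populacao"].idxmax()

# Select those rows by label.
maiores = df.loc[idx]
print(maiores)
```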
0
votes

2
answers

93
views

A: How to create multiple columns using the values of one in pandas?

You can use numpy's reshape together with transpose, without needing to write a function by hand or call it several times. pd.DataFrame(df.values.reshape(-1, 10).T, columns=['A','B', 'C',…

python pandas
answered 5 years, 6 months ago Terry 889
0
votes

2
answers

665
views

A: Load multiple concatenated CSV at once in Python

You can do it using glob import glob arquivos = glob.glob('arquivo*.csv') # 'arquivos' is now an array with the names of all .csv files that start with 'arquivo' array_df = [] for x in arquivos: temp_df…

python pandas csv
answered 5 years, 7 months ago Terry 889
9
votes

6
answers

19845
views

A: How to change the name of the pandas dataframe column?

According to the documentation of the rename command, the return value of this function is a new DataFrame with the renamed column(s). To get to the desired answer, (1) simply assign the function return to a…

python pandas csv
answered 5 years, 7 months ago Terry 889
1
votes

1
answer

370
views

A: Group by with Python [NumPy or pandas] - Get the first and last row by date

The function first belongs to pandas, not to numpy. Just replace your np.first with "first" (as a string) and it will work :) df2 = df[["DATA","MAXIMA","MINIMA"]] df2['maxDia'] =…

python pandas numpy group-by
answered 5 years, 8 months ago Terry 889
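A minimal sketch of passing "first" and "last" as strings to a groupby aggregation, as the excerpt above suggests. The DATA and MAXIMA column names come from the answer; the rows and the grouping by calendar day are assumptions.

```python
import pandas as pd

# Hypothetical intraday data: two readings per day.
df = pd.DataFrame({
    "DATA": pd.to_datetime([
        "2020-01-01 09:00", "2020-01-01 17:00",
        "2020-01-02 09:00", "2020-01-02 17:00",
    ]),
    "MAXIMA": [10.0, 12.0, 11.0, 15.0],
})

# Sort by timestamp so "first" and "last" mean earliest and latest within each day.
df = df.sort_values("DATA")

# Aggregation names are given as strings; numpy has no np.first.
resumo = df.groupby(df["DATA"].dt.date)["MAXIMA"].agg(["first", "last"])
print(resumo)
```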
1
votes

2
answers

351
views

A: Compare Dataframes and show different information between them

If you need to select different values between columns at the same index position, it can be done using (1) .loc and (2) taking the values 'not equal' with the command .ne…

python pandas
answered 5 years, 9 months ago Terry 889
1
votes

1
answer

138
views

A: Pandas: Acceptance and rejection percentage

You can do this by creating 2 temporary DataFrames using functions such as groupby, value_counts and pivot. One of them will have the total requests per city, and the other will have the amount of…

python pandas csv
answered 5 years, 11 months ago Terry 889
0
votes

1
answer

234
views

A: How does "parse" work for handling dates in Python?

I like (personal opinion) to parse after loading the data for two reasons: (1) Make the code more readable and (2) with the command pd.to_datetime it is possible to handle errors that may occur…

python python-3.x pandas
answered 6 years ago Terry 889
0
votes

2
answers

434
views

A: Get the maximum value of each row in a grouped pandas dataframe

With groupby and transform it is possible to select the highest value of each class per state df = dfAcidentesPorMunicipiosPorUF.copy() df.loc[df['Total'] ==…

python pandas
answered 6 years ago Terry 889
0
votes

1
answer

746
views

A: Add and subtract according to a criterion in another column

If I understand you correctly, you can do it by summing the whole column Valor and subtracting twice the sum of the rows where Código Rubrica is equal to 352. sub352 = f0519_grouped.loc[f0519_grouped['Código Rubrica']…

python pandas
answered 6 years ago Terry 889
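A sketch of the "total minus twice the 352 rows" arithmetic described in the excerpt above. The frame name, column names and the code 352 come from the answer; the values are invented. Subtracting twice works because the full sum already added those rows once, so removing them twice leaves them counted as negative.

```python
import pandas as pd

# Hypothetical data: the row with code 352 should be subtracted, the rest added.
f0519_grouped = pd.DataFrame({
    "Código Rubrica": [100, 352, 200],
    "Valor": [10.0, 3.0, 5.0],
})

# Sum of the rows that must be subtracted.
sub352 = f0519_grouped.loc[f0519_grouped["Código Rubrica"] == 352, "Valor"].sum()

# 10 + 3 + 5 - 2 * 3 equals 10 - 3 + 5.
resultado = f0519_grouped["Valor"].sum() - 2 * sub352
print(resultado)
```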
1
votes

3
answers

1079
views

A: transform list items into separate columns or extend dataframe to the end

A better (faster) alternative would be to create a new DataFrame by converting the column Inventory to a numpy array with values, thus: df = pd.DataFrame(idf["Inventory"].values.tolist()) df.index…

python pandas
answered 6 years ago Terry 889
0
votes

1
answer

112
views

A: How do I delete lines by cell specific content in pandas

You can do it using groupby with count df = df.loc[df.groupby('role_name')['role_name'].transform('count') >= 100]

python pandas machine-learning
answered 6 years, 1 month ago Terry 889
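A small sketch of the groupby/transform('count') filter from the excerpt above. The answer uses a threshold of 100; the tiny sample data and the min_count value of 2 are assumptions so the effect is visible.

```python
import pandas as pd

# Hypothetical data: "dev" appears 3 times, "ops" only once.
df = pd.DataFrame({"role_name": ["dev", "dev", "dev", "ops"]})

min_count = 2  # the answer uses 100; lowered here for the toy data

# transform('count') returns, for every row, the size of that row's group.
counts = df.groupby("role_name")["role_name"].transform("count")

# Keep only rows whose group is large enough.
df = df.loc[counts >= min_count]
print(df)
```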
2
votes

1
answer

1184
views

A: Manipulating 3 GB Dataset with Pandas using Chunks

The idea of chunksize is that you can work on the data in 'blocks', using one of the existing loop constructs. My tip is to define your goals before reading the data in chunks, since it…

python-3.x pandas
answered 6 years, 1 month ago Terry 889
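A minimal sketch of processing a CSV in blocks with chunksize, in the spirit of the excerpt above. An in-memory buffer stands in for the 3 GB file, and the valor column and the running-sum goal are assumptions.

```python
import io
import pandas as pd

# Stand-in for a large file on disk (hypothetical content).
csv_data = io.StringIO("valor\n1\n2\n3\n4\n5\n")

total = 0
n_chunks = 0

# With chunksize set, read_csv yields DataFrames of at most 2 rows each.
for chunk in pd.read_csv(csv_data, chunksize=2):
    total += chunk["valor"].sum()  # reduce each block to what you actually need
    n_chunks += 1

print(total, n_chunks)
```

Deciding the goal first (here, a single sum) is what keeps memory use low, since only the reduced result survives each iteration.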
2
votes

1
answer

154
views

A: Validate date as holiday or not

You can do it with the following sequence: (1) merge the DataFrames with merge, passing indicator=True; (2) check which rows were in both DataFrames with np.where; (3) delete the extra column created by…

python pandas
answered 6 years, 1 month ago Terry 889
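The sequence in the excerpt above can be sketched as follows. The frame names, the data column and the feriado flag are assumptions; merge with indicator=True and np.where are the pieces named in the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: dates to validate and a table of holidays.
datas = pd.DataFrame({"data": pd.to_datetime(["2020-12-24", "2020-12-25", "2020-12-26"])})
feriados = pd.DataFrame({"data": pd.to_datetime(["2020-12-25"])})

# Left join; indicator=True adds a _merge column saying where each row was found.
merged = datas.merge(feriados, on="data", how="left", indicator=True)

# "both" means the date exists in the holiday table too.
merged["feriado"] = np.where(merged["_merge"] == "both", True, False)

# Drop the helper column created by indicator=True.
merged = merged.drop(columns="_merge")
print(merged)
```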
1
votes

2
answers

173
views

A: Reading of multiple datasets

This can be done using the library glob import glob arquivos = glob.glob('dataset/*.csv') # 'arquivos' is now an array with the names of all .csv files in the folder 'dataset' array_df = [] for x…

python pandas
answered 6 years, 2 months ago Terry 889
1
votes

3
answers

2125
views

A: Format Time in a Python data frame

You can convert the column Original Time to datetime using the function (1) to_datetime and extract only the hour, minute and second with (2) strftime df['somente Horas'] = pd.to_datetime(df['Hora…

python pandas
answered 6 years, 2 months ago Terry 889
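A short sketch of the to_datetime plus strftime combination from the excerpt above. The source column name is truncated in the excerpt, so hora_original below is a hypothetical name, and the sample values are invented.

```python
import pandas as pd

# Hypothetical input column holding full timestamps as text.
df = pd.DataFrame({"hora_original": ["2019-05-01 13:45:10", "2019-05-02 08:05:00"]})

# Parse to datetime, then format back to text keeping only hour, minute and second.
df["somente Horas"] = pd.to_datetime(df["hora_original"]).dt.strftime("%H:%M:%S")
print(df)
```

Note that the result is a string column, so it is meant for display rather than further time arithmetic.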
5
votes

3
answers

16188
views

A: Removing lines from a dataframe that meet a certain condition

To contribute to the thread, I suggest a solution that uses a mask to select the desired data. Here are the performance tests: Using Loc and drop %%timeit df_remove =…

python-3.x pandas
answered 6 years, 2 months ago Terry 889
1
votes

2
answers

6875
views

A: Dataframe - Pandas. Assigning values in columns from comparing another column

You can solve this using numpy's select function, passing an array of conditions, an array of results and a default value condicao = [df['return_percentagem'] < 50, df['return_percentagem']…

python pandas
answered 6 years, 2 months ago Terry 889
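A minimal sketch of np.select as described in the excerpt above. Only the column name and the first condition (< 50) come from the answer; the second threshold, the result labels and the new column name are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"return_percentagem": [10, 60, 95]})  # hypothetical values

# Conditions are checked in order; the first one that is True wins.
condicao = [
    df["return_percentagem"] < 50,
    df["return_percentagem"] < 80,  # assumed second threshold
]
resultado = ["low", "medium"]       # assumed labels, one per condition

# Rows matching no condition receive the default value.
df["classe"] = np.select(condicao, resultado, default="high")
print(df)
```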
1
votes

1
answer

42
views

A: Mathematical operation with CSV files

It is complicated to suggest a solution when only an image of an excel is available and not a real sample of the data. But if what you’re looking for is multiplication between columns and saving it…

csv pandas numpy
answered 6 years, 2 months ago Terry 889
0
votes

2
answers

577
views

A: Group three commands into one

You can reduce the 3 commands to a single line by joining the first and third conditions within a (1) loc using (2) groupby with (3) transform, thus: f0219.loc[(f0219.Tiporubrica == 2) &…

python pandas group-by
answered 6 years, 3 months ago Terry 889
0
votes

2
answers

3465
views

A: How to invert the order of columns of a Dataframe with Python

A very generic way of reversing the order of the columns is to select the columns back to front inside loc: df.loc[:,::-1]

python date pandas
answered 6 years, 3 months ago Terry 889
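A tiny sketch of the reversed column slice from the excerpt above, on invented data.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})  # hypothetical data

# ":" keeps every row; "::-1" walks the columns from last to first.
invertido = df.loc[:, ::-1]
print(invertido)
```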
0
votes

2
answers

105
views

Q: Sequential transitions in CSS

I’m developing a quiz where the transitions between divs are done via CSS. The problem is that the transition that adds the next div runs in parallel with the removal of the current one. I would like only at…

javascript jquery css
asked 9 years ago Terry 889
1
votes

1
answer

886
views

Q: Create a list of python Dict

I have the following python function def playersID(self, listDetals): listPlayersID = [] tempDict = {} for x in listDetals: for y in x['result']['players']: tempDict.clear() tempDict['match_id'] =…

python list dictionary append
asked 9 years, 11 months ago Terry 889