Posts by lmonferrari • 3,550 points
179 posts
-
0 votes
1 answer
44 views
A: PANDAS - PYTHON - FIND DIFFERENT VALUES
You can use pandas' isin. Importing pandas: import pandas as pd Creating the dataframes: Novo_Mailing_df = pd.read_csv('../DADOS/Novo_Mailing.csv', sep = ';', names=['Coluna1'])…
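A minimal sketch of how isin can find values present in one data frame but not in another. The two small frames are hypothetical stand-ins for the CSV files in the answer; only the column name Coluna1 comes from the snippet.

```python
import pandas as pd

# Hypothetical data standing in for the CSV files read in the answer
novo = pd.DataFrame({"Coluna1": ["a", "b", "c", "d"]})
antigo = pd.DataFrame({"Coluna1": ["b", "d"]})

# isin builds a boolean mask; ~ inverts it, keeping values absent from `antigo`
diferentes = novo[~novo["Coluna1"].isin(antigo["Coluna1"])]
print(diferentes["Coluna1"].tolist())  # ['a', 'c']
```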
-
0 votes
1 answer
25 views
A: Python - append in the same Dataframe
You can use pandas' append; in this case a new data frame is created and the results are added to it: import pandas as pd import requests header = { "User-Agent": "Mozilla/5.0 (X11; Linux…
-
0 votes
1 answer
31 views
A: Difficulty Merging Sequential Columns in a Dataframe with Pandas
You can use pandas' concat, passing the data frames and the column axis: parte3 = pd.concat([parte1, parte2], axis=1)
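A short sketch of the column-wise concat, assuming two small hypothetical frames for parte1 and parte2 (their real contents are not shown in the snippet).

```python
import pandas as pd

# Hypothetical frames; the originals are not shown in the answer
parte1 = pd.DataFrame({"a": [1, 2]})
parte2 = pd.DataFrame({"b": [3, 4]})

# axis=1 places the frames side by side, aligning rows on the index
parte3 = pd.concat([parte1, parte2], axis=1)
print(parte3)
```

Because rows are aligned on the index, frames with different indexes produce NaN where a label is missing on one side.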
-
4 votes
3 answers
54 views
A: Replacing NA values of a column by the value of the top row of the same column of a dataframe
An alternative would be to use fill from the tidyr package: library(tidyr) DADOS <- data.frame( a = c(1, 2, NA, 3, 4), b = c(5, 6, 7, NA, NA) ) DADOS %>% fill(a,b) Output: a b 1 1 5 2 2 6 3 2 7 4 3 7 5 4…
r
answered lmonferrari 3,550
-
0 votes
1 answer
31 views
A: Create dataframe pandas 1 key and some non-standard values in the dictionary
import pandas as pd dicionario = {0:[['tela1'],['tela2'],['tela3']], 1:[['tela2']], 2:[['tela5'],['tela7']], 4:[['tela1'],['tela3']]} df = pd.DataFrame.from_dict(dicionario, orient='index') df =…
-
3 votes
2 answers
77 views
A: In a Dataframe, modify data from one column conditioned to the value of another column
In addition to @Augusto Vasques's suggestion, you can use loc as you previously tried: df.loc[df['Side'] == 'BUY', 'Amount'] = -df['Amount'] loc + isin: df.loc[df['Side'].isin(['BUY']), 'Amount'] =…
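A minimal sketch of the conditional assignment with loc. The column names Side and Amount come from the snippet; the rows are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the answer
df = pd.DataFrame({"Side": ["BUY", "SELL", "BUY"], "Amount": [10, 5, 3]})

# Negate Amount only where Side is 'BUY'; the right-hand side is aligned on the index
df.loc[df["Side"] == "BUY", "Amount"] = -df["Amount"]
print(df["Amount"].tolist())  # [-10, 5, -3]
```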
-
2 votes
1 answer
66 views
A: How to save the generated graphics files (in png) within a loop
Add the png line with the location where you want to save the file (you need write permission there, so change the path to your image folder as in the example below). At the end of your chart generation add dev.off()…
-
2 votes
2 answers
43 views
A: move data left in pandas
Here is a possible solution: slice, then use shift to move the columns: import pandas as pd import numpy as np tabelas =…
-
1 vote
1 answer
41 views
A: Sum pandas columns by row and selecting comparative by Qgrid row
Importing the libs import pandas as pd import seaborn as srn import statistics as sts Loading the data dataset = pd.read_excel('/content/drive/MyDrive/Data science /BRA 2020.xlsx') Excluding the…
-
0 votes
2 answers
185 views
A: Beautifulsoup: Catch text inside table
As stated in the other answer, the page loads a template that is filled in afterwards, so requests cannot get the correct values. Using requests_html Importing the lib from requests_html import HTMLSession Creating the…
-
0 votes
1 answer
33 views
A: Separate a Dataframe
Maybe pandas' groupby and pct_change will help you: df['pct_change_new_confirmed'] = df.groupby('state')['new_confirmed'].pct_change().fillna(0) df['pct_change_new_deaths'] =…
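A small sketch of the per-group percentage change. The column names state and new_confirmed come from the snippet; the values are hypothetical.

```python
import pandas as pd

# Hypothetical values; only the column names come from the answer
df = pd.DataFrame({
    "state": ["SP", "SP", "SP", "RJ", "RJ"],
    "new_confirmed": [100, 150, 120, 50, 75],
})

# pct_change is computed within each state; the first row of each group has no
# previous value, so it is NaN and fillna(0) replaces it with 0
df["pct_change_new_confirmed"] = (
    df.groupby("state")["new_confirmed"].pct_change().fillna(0)
)
print(df)
```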
-
4 votes
4 answers
1234 views
A: Python - Calculate transposed matrix
Source matrix M =[[1,2],[3,4],[5,6]] Printing for j in M: print(j) Output [1, 2] [3, 4] [5, 6] Transposed Creating the matrix M_t = list(map(list, zip(*M))) Printing for j in M_t: print(j) Output [1, 3,…
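The transposition step from the snippet, as a self-contained sketch: zip(*M) unpacks the rows as separate arguments and pairs up the elements at the same position, which yields the columns.

```python
M = [[1, 2], [3, 4], [5, 6]]

# zip(*M) yields the tuples (1, 3, 5) and (2, 4, 6); map(list, ...) turns each into a list
M_t = list(map(list, zip(*M)))

for linha in M_t:
    print(linha)
# [1, 3, 5]
# [2, 4, 6]
```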
-
1 vote
1 answer
51 views
A: Python Dataframe dynamically
I believe you can create a 'temporary' data frame, make the predictions and save in an xlsx for example: for cod in dados['Codproduto'].unique(): df_temp =…
-
2 votes
2 answers
63 views
A: How to add a new column with the group average in pandas?
Importing the pandas import pandas as pd Reading the dataset dataset = pd.read_excel('./BRA.xlsx') Removing unnecessary columns dataset.drop(columns=['League','Country','Time','Date'], inplace=True)…
-
3 votes
2 answers
35 views
A: How to export [[ ]] from a list in separate excel files using R?
... library(xlsx) producao_per_discente<-split(producao, producao$discente) lapply(seq_along(producao_per_discente), function(i) write.xlsx(producao_per_discente[[i]], file =…
-
0 votes
1 answer
42 views
A: How To Order In R
You can use sort: dados = c(8,6,5,4,1,3,7) dados = sort(dados, decreasing = F) dados Output: 1 3 4 5 6 7 8
r
answered lmonferrari 3,550
-
0 votes
1 answer
55 views
A: Overwrite all data from a column of a data frame in R
Using this data: ae <- c(1,2,3,4,5,6,7,8,9,10) be <- c(10,9,8,7,6,5,4,3,2,1) pnadc1 <- data.frame(ae,be) You can reassign the values this way: pnadc1$ae <- 200 Output: ae be 1 200 10 2 200 9 3…
rstudio
answered lmonferrari 3,550
-
0 votes
1 answer
37 views
A: Join two columns with different dates
... df_sp1 = DataReader('^GSPC', data_source='yahoo', start='2020-1-1', end="2020-02-04") df_sp2 = DataReader('^GSPC', data_source='yahoo', start='2021-1-1') You can use concat: novo_df =…
-
1 vote
2 answers
97 views
A: How do I filter a Dataframe row by knowing a String value from one of its columns?
Creating the test data frame: import pandas as pd codigos = ['cod1','mxrf11','cod2','mxrf11','cod3','mxrf11'] valores = ['teste1','teste2','teste3','teste4','teste5','teste6'] df =…
-
1 vote
1 answer
88 views
A: transform data from a Dataframe column into a single string
You can use to_string: df['Coluna'].to_string() import pandas as pd palavras = ['ola','como','vai','você?'] dados = pd.DataFrame({'Texto': palavras}) dados Data: Texto 0 olá 1 como 2 vai 3 você?…
-
1 vote
2 answers
62 views
A: Perform a previous values calculation in column on R
Maybe this will help you, using dplyr. Test data frame: dados <- data.frame(coluna_1 = c(558.8, 584.3, 603.3)) The logic: library(dplyr) dados <- dados %>% mutate(coluna_2 = case_when(…
-
3 votes
1 answer
52 views
A: How to split columns/data with a specific limit?
You can set a value for chunksize: import pandas as pd # chunk size tamanho = 5000 for fatia in pd.read_csv('./arquivo.csv', chunksize = tamanho): # your code here…
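A runnable sketch of reading a CSV in chunks. To keep it self-contained it writes a small hypothetical file to a temporary folder first; the column name valor and the chunk size of 5 are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as pasta:
    # Hypothetical 12-row file standing in for './arquivo.csv'
    caminho = os.path.join(pasta, "arquivo.csv")
    pd.DataFrame({"valor": range(12)}).to_csv(caminho, index=False)

    tamanhos = []
    total = 0
    # With chunksize set, read_csv returns an iterator of data frames
    for fatia in pd.read_csv(caminho, chunksize=5):
        tamanhos.append(len(fatia))
        total += int(fatia["valor"].sum())

print(tamanhos, total)  # [5, 5, 2] 66
```

Each chunk is an ordinary data frame, so only one slice of the file needs to be in memory at a time.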
-
2 votes
2 answers
53 views
A: How to change abbreviated values in a DF using Pandas in Python
In addition to Lucas's answer (which I actually prefer), you can keep the basis of your code. Creating the test data frame: import pandas as pd dados = ["35,57B", "6,85T"] df =…
-
1 vote
1 answer
68 views
A: Pencil sketch effect with opencv and python
import cv2 # opening the image in grayscale img_gray = cv2.imread('wonder-woman.png', cv2.IMREAD_GRAYSCALE) # computing the inverse (255 is white, 0 is black) and applying the blur img_gray_inv = 255…
-
0 votes
2 answers
81 views
A: format data from a column in a data.frame in R
An example using dplyr: valores_1 <- c('24','25','34','234','0045', '1234') dados <- data.frame(Coluna1 = valores_1, stringsAsFactors = FALSE) library(dplyr) dados %>% mutate( Coluna1 =…
r
answered lmonferrari 3,550
-
2 votes
2 answers
55 views
A: How to calculate the average for groups and identify the maximum value?
One way to return the values in an "ordered" way is to use reset_index: dfseason = df.groupby(by='Month', sort=True)['Billed'].sum().nlargest(1).reset_index() dfseason Output: Month Billed 0 May 918…
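A self-contained sketch of the same chain. The column names Month and Billed come from the snippet; the rows are hypothetical and chosen so that May sums to 918, as in the output shown.

```python
import pandas as pd

# Hypothetical rows; May adds up to 918 to mirror the output in the answer
df = pd.DataFrame({
    "Month": ["May", "May", "June", "July"],
    "Billed": [500, 418, 300, 700],
})

# Sum per month, keep the single largest total, then turn the index back into a column
dfseason = df.groupby(by="Month", sort=True)["Billed"].sum().nlargest(1).reset_index()
print(dfseason)
```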
-
2 votes
1 answer
42 views
A: Python: I need to get the y coordinate given to x coordinate
import matplotlib.pyplot as plt from numpy import polyfit Defining known X and Y X = [0, 5] Y = [2, 4] Calculating the coefficients m, b = polyfit(X, Y, deg=1) New x for calculation x = 2.5 Line…
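A sketch of the calculation without the plotting part. The snippet is cut off after the coefficients, so the last step here assumes the usual line equation y = m*x + b.

```python
from numpy import polyfit

# Known points from the answer
X = [0, 5]
Y = [2, 4]

# A degree-1 fit returns the slope m and the intercept b
m, b = polyfit(X, Y, deg=1)

# Assumed final step: evaluate the line at the new x
x = 2.5
y = m * x + b
print(round(y, 6))  # 3.0
```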
-
1 vote
3 answers
64 views
A: optimize Camelot large pdf files
Maybe this will give you some optimization: import camelot, PyPDF2, tqdm import pandas as pd from tkinter import Tk, filedialog as dlg Tk().withdraw() file_path = dlg.askopenfilename() last_page =…
-
3 votes
2 answers
260 views
A: In Python, how do you remove specific characters from all the records of just one particular column?
You can use apply and slice the string: raw_data['nome_arquivo'] = raw_data['nome_arquivo'].apply(lambda x: x[:-4]) You can also use replace: raw_data['nome_arquivo'] =…
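A sketch of the slicing approach on hypothetical file names, together with the vectorized .str accessor, which gives the same result as the apply call for string columns.

```python
import pandas as pd

# Hypothetical file names; only the column name comes from the answer
raw_data = pd.DataFrame({"nome_arquivo": ["vendas.csv", "relatorio.txt"]})

# apply + slicing, as in the answer: drop the last 4 characters
com_apply = raw_data["nome_arquivo"].apply(lambda x: x[:-4])

# Equivalent vectorized form using the .str accessor
com_str = raw_data["nome_arquivo"].str[:-4]

print(com_apply.tolist())  # ['vendas', 'relatorio']
```

Slicing by position only works when every value ends with a suffix of the same length.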
-
0 votes
1 answer
33 views
A: In Python and Jupyter Notebook, how to present a full screen record?
According to the documentation you can set this using max_colwidth. pd.set_option("max_colwidth", 40) 0 1 2 3 0 foo bar bim uncomfortably long string 1 horse cow banana apple Setting a lower value…
-
1 vote
1 answer
41 views
A: Pivoting in Pandas
See if this way works for you df.pivot_table(index = 'ID_PACIENTE', columns = 'DE_ANALITO', values = 'DE_RESULTADO', aggfunc = ''.join).reset_index().rename_axis(None, axis = 1) or df.pivot(index =…
-
1 vote
2 answers
64 views
A: Select columns from a base without having to read the whole file
dados19 <- read.csv('./SUP_ALUNO_2019.CSV', sep = '|', dec = '.', colClasses = c('NULL','NULL','NULL','NULL','integer', 'NULL','NULL','NULL','NULL','NULL', 'NULL','NULL','NULL','NULL','integer',…
r
answered lmonferrari 3,550
-
0 votes
3 answers
125 views
A: How to add one column of data based on another in excel through Pandas?
Importing the libs import pandas as pd import numpy as np Creating the test df conteudo = ['1 X 40 CONTAINERS 40 BAGS OF FLUTRIAFOL TECNICO SINON FLUTRIAFOL 97% TECH', '1 X 20 CONTAINERS 20 BAGS OF…
-
1 vote
1 answer
75 views
A: NLP Text sorting using Python
You can work with a library that does fuzzy string matching. Fuzzy string matching is used to find similarities between strings even when there are typing errors. Fuzzywuzzy works with Levenshtein distance to…
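For intuition about the Levenshtein distance mentioned above, here is a minimal pure-Python sketch. It is not Fuzzywuzzy's implementation, and the function name levenshtein is hypothetical; it only illustrates the metric, that is, the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed row by row (illustrative sketch)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A small distance relative to the string lengths indicates that two strings are probably the same word with a typo.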
-
3 votes
1 answer
71 views
A: ValueError: 1 columns passed, passed data had 12 columns
The first error in your case occurs here, where you pass a list of lists: conteudo2 = [['Pontos Ganhos','Vitórias','Empates','Derrotas','Saldo de Gols','Gols Pró','Gols Contra','Chance de…
-
0 votes
1 answer
85 views
A: Python/Pandas: Treatment of TXT
A possible solution using pandas: import pandas as pd # loading the data and assigning names to the columns colunas = ['Data','id_m','id_c','Data inicial','Data…
-
0 votes
1 answer
133 views
A: Abstract class 'ExcelWriter' with abstract methods instantiated pylint(abstract-class-instantiated)
As the documentation says, when you want to save more than one sheet in the same file you need to declare the ExcelWriter object: with pd.ExcelWriter('teste.xlsx') as writer: df1.to_excel(writer,…
-
0 votes
2 answers
105 views
A: Join lines from a Python column
An alternative using pandas groupby import pandas as pd import numpy as np Creating the test data frame dados = pd.DataFrame({'Dados':np.random.randint(1,100, 43184)}) Calculating the average with…
python
answered lmonferrari 3,550
-
0 votes
1 answer
26 views
A: return the most recent files in a folder
from pathlib import Path import pandas as pd directory = Path('./') files = list(directory.rglob('*.*')) raw_data = [[item.name,item.stat().st_mtime] for item in files] df = pd.DataFrame(raw_data,…
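A standard-library sketch of the same idea without pandas: collect the files, then sort by modification time. The helper name most_recent_files and its n parameter are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def most_recent_files(directory, n=3):
    """Return the n most recently modified files under directory (hypothetical helper)."""
    # rglob('*') walks subfolders too; keep regular files only
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    # st_mtime is the last modification time; newest first
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)[:n]
```

Calling most_recent_files('./') would list the three newest files under the current folder.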
-
1 vote
1 answer
62 views
A: Removing Symbols in Python dataframe columns
Test data frame import pandas as pd import re titulo = ['[Cobra Kai]', '[Bridgerton]', '[Vikings]'] genero = ['[\nAction, Comedy, Drama]', '[\nDrama, Romance]','[\nAction,Adventure, Drama]'] ano =…
replace
answered lmonferrari 3,550
-
1 vote
1 answer
433 views
A: Python Pandas: Dataframe convert Timestamp column to Datetime
import pandas as pd data = [1610323200000,1610409600000,1610409600000,1610496000000,1610582400000] volume = [38150.02,35410.37,34049.15,37371.38,39145.21] abertura =…
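The snippet is cut off before the conversion itself. One common way to do it, shown here as a sketch and not necessarily the approach in the full answer, is pd.to_datetime with unit='ms', since the values look like epoch timestamps in milliseconds. The column name data follows the variable in the snippet.

```python
import pandas as pd

# First two timestamps from the answer, assumed to be milliseconds since the Unix epoch
df = pd.DataFrame({"data": [1610323200000, 1610409600000]})

# unit='ms' tells pandas how to interpret the integers
df["data"] = pd.to_datetime(df["data"], unit="ms")
print(df["data"].dt.strftime("%Y-%m-%d").tolist())  # ['2021-01-11', '2021-01-12']
```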
-
2 votes
1 answer
32 views
A: Creating a new column using for
Data: df_vot = pd.read_csv('./Dados aula04/votacao_partido_munzona_2020_BRASIL.csv', sep = ';', encoding = 'latin1') centro = ['AVANTE', 'MDB', 'PROS', 'PSDB', 'SOLIDARIEDADE'] direita = ['DC',…
-
2 votes
1 answer
31 views
A: Extract only uppercase words with R
You can check for one or more uppercase letters between word boundaries: str_extract_all(teste ,'\\b[A-Z]+\\b') or str_extract_all(teste, "\\b[:upper:]+\\b")…
-
2 votes
1 answer
88 views
A: Error printing HTML with Beautifulsoup
import requests from bs4 import BeautifulSoup as bs headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'} url =…
-
1 vote
1 answer
30 views
A: How to edit values in a column?
Maybe this will help you: library(lubridate) df$dia <- day(df$date) df$ano_mes <- paste0(year(df$date),'-',month(df$date)) Using the lubridate package we are able to extract the day, month and year.…
-
5 votes
1 answer
90 views
A: How to use Beautifulsoup’s "find" to find a script tag with a specific type?
import requests from bs4 import BeautifulSoup as bs def get_cod_produto(url): response = requests.get(url) data = response.text soup = bs(data, 'html.parser') return soup.find('script',…
-
4 votes
1 answer
62 views
A: Make a graph in R (ggplot) similar to Excel bar charts
library(dplyr) library(tidyr) library(ggplot2) df <- read.csv2('./Tabela_areas_referencias_porcent_2.csv') df_pivoted <- pivot_longer( data = df, cols = c("Vegetação_Nativa",…
-
5 votes
2 answers
70 views
A: Error plotting with ggplot
An example creating the sequence of dates with n equal to zero library(ggplot2) library(dplyr) df <-data.frame( ano = c(2007, 2008, 2017, 2018), n = c(1, 2, 2, 1) ) anos <- data.frame(ano =…
-
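3 votes
1 answer
53 views
A: How to delete null lines in a Dataframe?
You can return the rows that have no missing values this way: df[~df.isnull().any(axis=1)] The ~ negates the mask, i.e. it turns "is null" into a kind of "not null"; any(axis=1) reduces the mask to one value per row, so that whole rows are dropped rather than individual cells being masked.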
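As a sketch on hypothetical data, pandas also has a built-in for this: dropna removes every row that contains at least one missing value, which matches filtering with a negated per-row null mask.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each of the last two rows
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", None]})

# Built-in: drop rows containing any missing value
sem_nulos = df.dropna()

# Equivalent boolean-mask form: keep rows where no column is null
com_mascara = df[~df.isnull().any(axis=1)]

print(sem_nulos.index.tolist())  # [0]
```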
-
1 vote
1 answer
69 views
A: Compare the information of two data.frames (tables) to create groups and a third column. in R
You can use between to compare the data frame's row indexes, in this case: library(dplyr) rows <- rownames(CBO2002) CBO2002 <- CBO2002 %>% mutate(grupo = case_when( between(rows,1,28) ~…
r
answered lmonferrari 3,550