Posts by AlexCiuffa • 2,402 points

102 posts

1
votes

3
answers

84
views

A: Dataframe Pandas - How to use a previous value other than NA for calculation

Use diff() to calculate the difference between consecutive rows. Since NaN values should be ignored, just remove them first with df[~df['C'].isnull()]: >>> df[~df['C'].isnull()]['C'].diff() 0 NaN…
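
A minimal sketch of the approach described above, assuming a small hypothetical DataFrame whose column C has gaps (the values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column C has missing values between valid readings.
df = pd.DataFrame({"C": [10.0, np.nan, 13.0, np.nan, np.nan, 20.0]})

# Drop the NaN rows first, then take the difference between the remaining
# consecutive rows. The original index labels are preserved.
diffs = df[~df["C"].isnull()]["C"].diff()
print(diffs)
```

The first remaining row has no predecessor, so its difference is NaN; every other row is compared with the previous non-null value.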

python pandas numpy
answered 4 years, 3 months ago AlexCiuffa 2,402
3
votes

1
answer

51
views

A: How to transform the subtraction field between dates in python into int?

The column Days_Considered has dtype timedelta64[ns]. To access the number of days as an integer, use .dt.days: df2['Days_Considered'] = (df2['End_Date'] - df2['Start_Date']).dt.days…
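
A short sketch of the same conversion, assuming two hypothetical date columns; only the column names come from the answer, the dates are invented:

```python
import pandas as pd

# Hypothetical dates for illustration.
df2 = pd.DataFrame({
    "Start_Date": pd.to_datetime(["2021-01-01", "2021-03-10"]),
    "End_Date": pd.to_datetime(["2021-01-31", "2021-03-15"]),
})

# Subtracting two datetime columns yields a timedelta64[ns] column;
# .dt.days extracts the whole number of days as integers.
df2["Days_Considered"] = (df2["End_Date"] - df2["Start_Date"]).dt.days
print(df2["Days_Considered"].tolist())
```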

python pandas
answered 4 years, 4 months ago AlexCiuffa 2,402
0
votes

1
answer

715
views

A: Error TypeError: __init__() missing 1 required positional argument: 'output_dim' when instantiating convolutional neural network object

The error points to the line: ---> self.embedding = layers.Embedding(vocab_size, emb_dim = 128) Checking the Keras documentation for the Embedding layer, we can see that this layer takes as mandatory…

python classes
answered 4 years, 8 months ago AlexCiuffa 2,402
0
votes

0
answers

22
views

Q: POST request does not pass the `getParams()` parameters with Volley's StringRequest

When I make a POST Request according to my function postRequest() using the StringRequest of Volley I cannot access the parameters passed on getParams(). However, the method getRequest() works as…

android post volley
asked 4 years, 8 months ago AlexCiuffa 2,402
1
votes

1
answer

874
views

A: Select the top 10 values of a dataframe variable in python?

pandas has the method .nlargest(). It takes as parameters: n (int): number of rows to return; columns (list or str): column(s) used in sorting; keep ('first', 'last' or 'all'): decides…
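
A minimal sketch of .nlargest(), assuming a hypothetical five-row DataFrame, so n=3 is used here instead of the 10 asked for in the question:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"name": list("abcde"), "score": [3, 9, 1, 7, 5]})

# Return the 3 rows with the largest values in "score", in descending order.
top3 = df.nlargest(n=3, columns="score")
print(top3)
```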

python
answered 4 years, 8 months ago AlexCiuffa 2,402
3
votes

1
answer

1112
views

A: Count number of unique records in a Data Frame

pandas has the method .nunique(), which returns the number of unique values in a DataFrame or a Series. In your case, just do: df['customer_id'].nunique()…

python pandas
answered 4 years, 8 months ago AlexCiuffa 2,402
0
votes

1
answer

141
views

A: Sklearn - Difference between preprocessing.scale() and preprocessing.StandardScaler()

1) What is the difference when using preprocessing.scale() Vs preprocessing.StandardScaler()? Both the StandardScaler().fit(X_train) and the scale(X_train) perform the same operation, but the first…

machine-learning sklearn
answered 4 years, 10 months ago AlexCiuffa 2,402
2
votes

1
answer

67
views

A: TensorFlow Convolutional Neural Networks

The .fit() method uses batch_size=32 by default. This means that each weight update (backpropagation) uses 32 images at a time. As your dataset has 649 images, there will be 21…
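
The arithmetic behind those numbers can be checked with a short sketch. It assumes the final, smaller batch still counts as one step, which is consistent with the 21 mentioned above:

```python
import math

n_images = 649   # dataset size mentioned in the answer
batch_size = 32  # default batch size mentioned in the answer

# 649 is not a multiple of 32, so the last batch is smaller: round up.
steps_per_epoch = math.ceil(n_images / batch_size)

# Number of images left over for the final batch.
last_batch = n_images - (steps_per_epoch - 1) * batch_size
print(steps_per_epoch, last_batch)
```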

python tensorflow
answered 5 years, 2 months ago AlexCiuffa 2,402
0
votes

2
answers

85
views

A: sklearn library accuracy_score error

By calling accuracy_score(teste_y,teste_x), you are comparing the expected output (teste_y) with the model input (teste_x). The correct approach is to compare the expected output with the model's prediction:…

python machine-learning sklearn
answered 5 years, 3 months ago AlexCiuffa 2,402
1
votes

1
answer

164
views

A: Transform column with Nan and string to integer

pandas won't convert np.NaN to int, because it treats NaN as a float. But it can convert to Int64 (or Int16 and Int32). The NaN is then transformed into <NA> (pd.NA), which is the null value for integers,…
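
A small sketch of that conversion, assuming a hypothetical column that mixes numeric strings and NaN; going through pd.to_numeric first is one possible route, not necessarily the one used in the full answer:

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing numeric strings and NaN.
s = pd.Series(["1", "2", np.nan, "4"])

# pd.to_numeric yields float64 because of the NaN; casting to the nullable
# "Int64" dtype (capital I) turns the NaN into pd.NA.
as_int = pd.to_numeric(s).astype("Int64")
print(as_int)
```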

python python-3.x pandas numpy
answered 5 years, 3 months ago AlexCiuffa 2,402
5
votes

3
answers

289
views

A: How to print a Python operation by adding zero to the left?

If you want print() to output two digits, for example 02, you can use .format(): >>> c = 2 >>> print('{:02}'.format(c)) 02 The problem pointed out by @Jeanextreme002 is…
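
A brief sketch of the zero-padding format spec. The f-string variant and the three-digit case are additions for illustration and are not part of the answer:

```python
c = 2

# Width 2, padded with zeros on the left (the form used in the answer).
padded = "{:02}".format(c)

# Equivalent f-string form.
padded_f = f"{c:02d}"

# The width is a minimum: longer numbers are not truncated.
wide = "{:02}".format(123)

print(padded, padded_f, wide)
```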

python
answered 5 years, 4 months ago AlexCiuffa 2,402
4
votes

1
answer

95
views

A: Get equivalent expression to list(zip(list, Heights)) using the map() function

The mistake is in zip(x,y). What zip() does is, for each element of the list x, join it with the element at the corresponding position of the list y in a tuple and add it to a…
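
One way to sketch the equivalence between zip() and map(), assuming two hypothetical lists x and y of the same length (the values are invented):

```python
# Hypothetical inputs for illustration.
x = ["Ana", "Bruno", "Carla"]
y = [1.60, 1.82, 1.75]

# zip() pairs the elements position by position.
pairs_zip = list(zip(x, y))

# map() with a two-argument function walks both lists in parallel,
# so building a tuple from each pair gives the same result.
pairs_map = list(map(lambda a, b: (a, b), x, y))

print(pairs_map)
```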

python python-3.x functional-programming
answered 5 years, 4 months ago AlexCiuffa 2,402
1
votes

1
answer

364
views

A: Add midline with Seaborn - python

matplotlib has a function that draws a vertical line on the plot: .axvline(x=0, ymin=0, ymax=1, **kwargs), documented here or here. In your case, just calculate…

python histogram graphics seaborn
answered 5 years, 4 months ago AlexCiuffa 2,402
0
votes

1
answer

38
views

A: Is it possible to use K-Means (or another clustering method) with point limits?

Maybe you should change your approach. If the goal is to have "values that are not close enough to achieve a 'vacancy' in the cluster", a density-based clustering approach seems more appropriate. I…

python k-means
answered 5 years, 4 months ago AlexCiuffa 2,402
0
votes

1
answer

111
views

A: Is there a relationship between data science and data analysis?

Although the necessary knowledge and tools used by both are very similar (statistics, mathematics, programming and business knowledge), they are not the same thing. Data analysis is to process the…

terminology
answered 5 years, 5 months ago AlexCiuffa 2,402
2
votes

1
answer

197
views

A: I have a neural network, now what?

Once your model is trained, you can save it with the pickle module. To save an already trained model, just do: import pickle filename = 'modelo_final.pkl' with open(filename, 'wb') as file:…
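
A minimal save-and-load round trip with pickle. As an assumption for illustration, a plain dictionary stands in for the trained model, and the file is written to a temporary directory:

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: any picklable Python object is handled the same way.
modelo = {"weights": [0.1, 0.2, 0.3], "bias": 0.5}

# Write to a temporary directory so the sketch leaves no file in the working directory.
filename = os.path.join(tempfile.mkdtemp(), "modelo_final.pkl")

# Save the object in binary mode.
with open(filename, "wb") as file:
    pickle.dump(modelo, file)

# Load it back later, also in binary mode.
with open(filename, "rb") as file:
    modelo_carregado = pickle.load(file)

print(modelo_carregado)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.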

python python-3.x
answered 5 years, 6 months ago AlexCiuffa 2,402
2
votes

2
answers

53
views

A: Regression curve with x-axis equal to 0

sklearn's LinearRegression() regressor has the attribute intercept_, which returns the y value where the regression line crosses the Y axis, that is, at x = 0. In fact, in the example from the documentation…

python artificial-intelligence regression
answered 5 years, 6 months ago AlexCiuffa 2,402
6
votes

1
answer

572
views

A: Why should we scale/standardize values of variables and how to reverse this transformation?

Should I scale my inputs? The answer is: it depends. The truth is that scaling your data will not worsen the result, so when in doubt, scale. Cases where scaling applies: if the model is based on the…

r
answered 5 years, 6 months ago AlexCiuffa 2,402
2
votes

3
answers

2843
views

A: How does train_test_split work in Scikit Learn?

Why does the data need to be split? An ML algorithm is expected to learn from the training set, but then how do we know if the model is working? Whether it works with new data? How do we compare it with other…

python machine-learning
answered 5 years, 6 months ago AlexCiuffa 2,402
3
votes

3
answers

2524
views

A: Divide date (day, month, year) into new columns - Dataframe Pandas

This error happens because your column data is not of type str, but of type datetime64. To see the column types of your DataFrame, just do >>> df.dtypes data datetime64[ns]…

python pandas
answered 5 years, 6 months ago AlexCiuffa 2,402
3
votes

1
answer

101
views

A: Data similarity with various pandas values

One approach is to compute the distance between a new data point and the rows of the DataFrame by calculating the dissimilarity. For this, I suggest using Gower's distance. It works as follows:…

python pandas
answered 5 years, 7 months ago AlexCiuffa 2,402
5
votes

1
answer

1046
views

A: How to filter lines that have a certain string?

The solution is to use the method .str.contains. dados[dados.Value.str.contains("disease", regex=False)] It is worth noting that this method assumes that the string passed is a regular expression,…
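
A sketch of why regex=False matters, assuming a hypothetical dados DataFrame; only the column name Value comes from the answer:

```python
import pandas as pd

# Hypothetical data for illustration.
dados = pd.DataFrame({"Value": ["heart disease", "healthy", "n.a.", "nba"]})

# Plain substring search, as in the answer.
com_disease = dados[dados.Value.str.contains("disease", regex=False)]

# With the default regex=True, "." matches any character, so "n.a" also matches "nba".
como_regex = dados[dados.Value.str.contains("n.a")]

# With regex=False, "." is a literal dot.
como_literal = dados[dados.Value.str.contains("n.a", regex=False)]

print(como_regex.Value.tolist(), como_literal.Value.tolist())
```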

python pandas
answered 5 years, 7 months ago AlexCiuffa 2,402
2
votes

1
answer

213
views

A: What is validation_data for in the Keras fit() function

The validation_data is used only as test samples; it is not used to train the model. That is, no backpropagation is done with this data. Its main purpose is to help find the point where…

python keras
answered 5 years, 7 months ago AlexCiuffa 2,402
1
votes

2
answers

1413
views

A: How to turn list into array

Given a Data Frame, for example: import pandas as pd import numpy as np df = pd.DataFrame(data = {'listas':[[1,2],[3,4],[5,6]],'nomes':['nome 1','nome 2','nome 3']}) >>> df listas nomes 0…

python numpy
answered 5 years, 8 months ago AlexCiuffa 2,402
0
votes

1
answer

37
views

A: Passing a list like loss_weights, it should have one input per model output. Keras tells me that the model has 1 output, but I thought I had more

The confusion is between the loss_weights parameter of compile and the class_weight parameter of fit. The error says that the model has only one output because only one loss function was passed:…

python python-3.x keras
answered 5 years, 8 months ago AlexCiuffa 2,402
2
votes

1
answer

401
views

A: How to use a quadratic regression model?

Using only statsmodels: with statsmodels it is possible to write the desired formula, for example: target ~ np.power(X1, 2) + X2 In this example, it means that we are searching for the…

python regression sklearn
answered 5 years, 8 months ago AlexCiuffa 2,402
2
votes

1
answer

221
views

A: Set steps_per_epoch is dramatically increasing training time

The defaults of Keras's .fit are: batch_size: if not specified, assumes 32; steps_per_epoch: number of samples divided by the batch size. In the first case, it does 78800 samples /…

python tensorflow keras
answered 6 years ago AlexCiuffa 2,402
0
votes

1
answer

638
views

A: How to change only one color in Pillow (Python)

Based on this answer from the English Stack Overflow, you basically need to open the image, convert it to a numpy.array, take the RGB channels, set a condition (e.g., where the color is white) and replace…

python image
answered 6 years, 1 month ago AlexCiuffa 2,402
1
votes

1
answer

540
views

A: Separating a dataframe by some python pandas criterion

One way is to take the index of the positive rows, select only 51 values, join them with the index of the negative rows and keep only the selected rows: # Get the ids of the rows with positive stars…

python-3.x pandas ipython-notebook
answered 6 years, 1 month ago AlexCiuffa 2,402
0
votes

1
answer

399
views

A: Adding column with filter in Pandas

The first step is to filter the Dataframe where vlrLiquido is "CASCOL COMBUSTIVEIS PARA VEICULOS LTDA": df[df['txtDescricaoEspecificacao'] == 'CASCOL COMBUSTIVEIS PARA VEICULOS LTDA'] Then we take…

pandas
answered 6 years, 1 month ago AlexCiuffa 2,402
0
votes

1
answer

194
views

A: How to activate 2 if’s at the same time in Python

To press 2 buttons together, just do pyautogui.hotkey('1', '2'). This presses the keys in order and releases them in reverse order, as described here. Now, just take what colors are active and move on to…

python-3.x if pyautogui
answered 6 years, 1 month ago AlexCiuffa 2,402
4
votes

1
answer

2027
views

A: Python - Run two scripts at once

A Python library for running tasks in parallel is threading. Basically, it is possible to declare and execute a process like this: # Define a process from a `função_a_ser_executada(arg1,…
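
A minimal sketch of running two functions concurrently with threading. The function name funcao_a_ser_executada, its arguments and the sleep times are hypothetical placeholders:

```python
import threading
import time

results = []
lock = threading.Lock()

def funcao_a_ser_executada(nome, espera):
    """Hypothetical task: wait a little, then record that it finished."""
    time.sleep(espera)
    with lock:  # protect the shared list while appending
        results.append(nome)

# Declare one thread per task, passing the arguments via args.
t1 = threading.Thread(target=funcao_a_ser_executada, args=("tarefa 1", 0.2))
t2 = threading.Thread(target=funcao_a_ser_executada, args=("tarefa 2", 0.1))

# Start both, then wait for both to finish.
t1.start()
t2.start()
t1.join()
t2.join()

print(results)
```

Note that threads share a single process, and in standard CPython they do not speed up CPU-bound work. For that case the multiprocessing module is the usual alternative.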

python python-3.x
answered 6 years, 1 month ago AlexCiuffa 2,402
2
votes

1
answer

487
views

A: Concatenating equal data from a column into rows in Python

One way is: df.groupby('Categoria').agg({'Descricao':lambda col: ', '.join(col)}).reset_index() This way you are grouping the equal values of the column Categoria and using, for the column…
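
The same expression applied to a small hypothetical DataFrame; the column names come from the answer, the values are invented:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Categoria": ["A", "B", "A", "B", "C"],
    "Descricao": ["x", "y", "z", "w", "v"],
})

# Group equal categories and join their descriptions into one string per group.
agrupado = (
    df.groupby("Categoria")
      .agg({"Descricao": lambda col: ", ".join(col)})
      .reset_index()
)
print(agrupado)
```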

python pandas
answered 6 years, 1 month ago AlexCiuffa 2,402
1
votes

1
answer

61
views

A: Question about the model's predict

First, the error occurs in treino, teste, classe_treino, classe_teste = train_test_split(vetor_tfidf2, resenha["classificacao"], random_state = 42) This is because vetor_tfidf2 has only 1 item, and…

python pandas machine-learning natural-language
answered 6 years, 1 month ago AlexCiuffa 2,402
1
votes

3
answers

69
views

A: Transform a list into several within the same

With list comprehension just do: nova_lista = [[i] for i in X] Using numpy, we can change the shape of array: import numpy X = [1,2,3,4,5] nova_lista = numpy.array(X).reshape(len(X),1)…
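
Both variants side by side, as a quick check that they produce the same nesting. Using -1 in reshape is a small variation on the answer's len(X) and lets numpy infer the number of rows:

```python
import numpy as np

X = [1, 2, 3, 4, 5]

# List comprehension: wrap each element in its own list.
nova_lista = [[i] for i in X]

# numpy: reshape to one column; -1 lets numpy infer the number of rows.
nova_array = np.array(X).reshape(-1, 1)

print(nova_lista)
print(nova_array.shape)
```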

python list
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

1
answer

41
views

A: LinearRegression score

From sklearn.linear_model.LogisticRegression's own example: >>> from sklearn.datasets import load_iris >>> from sklearn.linear_model import LogisticRegression >>> X,…

python
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

1
answer

665
views

A: NameError: name 'mostrar_urls' is not defined

One way is to specify the function you want to use from the file arquivo_exemplo thus: from arquivo_exemplo import função_que_quero_chamar. In this case, when calling the function, you do not need…

python python-3.x function tkinter
answered 6 years, 2 months ago AlexCiuffa 2,402
1
votes

2
answers

6875
views

A: Dataframe - Pandas. Assigning values in columns from comparing another column

In pandas, when we do df['coluna'] = 'valor', all rows of 'coluna' are filled with 'valor'. So when you do df['classification_roi'] = '', all rows in the column classification_roi…

python pandas
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

3
answers

46
views

A: error in creating subplots

The subplots function takes as parameters the number of rows and columns used to draw the grid of plots. For your code, fig_2, axes = plt.subplots(2,2, figsize=(20, 5)), the grid will have 4 plots…

pandas matplotlib
answered 6 years, 2 months ago AlexCiuffa 2,402
3
votes

1
answer

73
views

A: Single values average filter in dictionary list

Using pandas, a Python library that works with Dataframes, it is possible to solve this problem easily. import pandas as pd a = [ {'linha': 0, 'porcentagem': 1.0, 'id': 3, 'nome': 'bruno'},…

python python-3.x list filter dictionary
answered 6 years, 2 months ago AlexCiuffa 2,402
1
votes

2
answers

377
views

A: Multiply elements of the first matrix by 3 and decrease elements of the second matrix by 3 in Python

Using numpy, operations can be performed directly between arrays and numbers: import numpy as np mt1 = input().split() # Matrix 1 mt1 = np.array(mt1, dtype='int') # convert the input to a numpy.array mt1…

python
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

1
answer

466
views

A: A = A.astype('double') AttributeError: 'list' object has no attribute 'astype'

As the error says, list has no attribute astype. I think the variable A should be a numpy.ndarray, not a list: A = np.array(A) # convert the list to a numpy.ndarray A =…

python attributes
answered 6 years, 2 months ago AlexCiuffa 2,402
1
votes

2
answers

3879
views

A: How to separate the year from a date with Python and Pandas?

First, convert the column DT_INGRESSO to type datetime (instead of string). Note that the dates are in the format dd/mm/yyyy, so the format string is %d/%m/%Y: df['DT_INGRESSO'] =…
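
A sketch of the two steps, assuming the conversion is done with pd.to_datetime (the answer's line is truncated, so this is a guess at a common approach). The sample dates and the new column name ANO are hypothetical:

```python
import pandas as pd

# Hypothetical data in dd/mm/yyyy format.
df = pd.DataFrame({"DT_INGRESSO": ["15/03/2019", "01/12/2020"]})

# Step 1: parse the strings as datetime using the matching format string.
df["DT_INGRESSO"] = pd.to_datetime(df["DT_INGRESSO"], format="%d/%m/%Y")

# Step 2: extract the year from the datetime column.
df["ANO"] = df["DT_INGRESSO"].dt.year
print(df)
```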

python date date pandas split
answered 6 years, 2 months ago AlexCiuffa 2,402
2
votes

1
answer

1848
views

A: Recognize the color of a python image

One of the Python libraries for working with images is Pillow. With it, you can get the colors that appear in an image. from PIL import Image # Open the image img =…

python colors
answered 6 years, 2 months ago AlexCiuffa 2,402
2
votes

1
answer

41
views

A: It’s not stacking graphic

I think the kind of chart you want is a histogram, where the number of ages per band will be shown. idades = [19, 20, 44, 55, 77, 88, 39] #faixaEtaria=["0-20","21-30","30-45","45-90"]…

python
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

3
answers

2910
views

A: How to delete the first line in a python CSV file

With Python alone, it is possible to read it like this: import csv # Read a CSV with the header with open("arquivo.csv") as f: reader = csv.reader(f) #next(reader) # skip header data = [r for r in…
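
A self-contained sketch of skipping the header with next(). As an assumption for illustration, an in-memory string stands in for arquivo.csv and its contents are invented:

```python
import csv
import io

# In-memory stand-in for "arquivo.csv" (hypothetical contents).
arquivo = io.StringIO("nome,idade\nAna,30\nBruno,25\n")

reader = csv.reader(arquivo)

# next() consumes the first row, so the loop below starts at the second line.
header = next(reader)
data = [r for r in reader]

print(header)
print(data)
```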

python pandas
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

1
answer

292
views

A: How to get the minimum, average and maximum time of a time series with pandas, based on a column where the value is boolean?

Assuming that your timestamp column contains dates. If it does not, you can convert it like this: df['timestamp'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%d %H:%M:%S') One solution is to create…

python-3.x pandas timestamp
answered 6 years, 2 months ago AlexCiuffa 2,402
0
votes

0
answers

52
views

Q: Calculate the weight of each class of an unbalanced multi-label dataset

I would like to calculate the weight of each class of a multi-label dataset to pass to Keras's fit_generator as the parameter class_weight. In the case of a single-label dataset, as my output is…

python python-3.x machine-learning
asked 6 years, 3 months ago AlexCiuffa 2,402
3
votes

1
answer

6569
views

A: How to join the lines of two Dataframes with Python?

There is the function .concat() of pandas that concatenates two dataframes, but to use it, the columns must have the same name. So we can rename them with the .rename(), another function of pandas.…

python date pandas
answered 6 years, 3 months ago AlexCiuffa 2,402
2
votes

1
answer

169
views

A: Take the probability of belonging to each class

To know the probability of belonging to each class, use the method .predict_proba(). It is similar to .predict(), but returns the probabilities of belonging to each class in the form of an array.…

python machine-learning sklearn
answered 6 years, 3 months ago AlexCiuffa 2,402