Posts by Daniel Falbel • 12,504 points
268 posts
-
2 votes
3 answers
955 views
A: How to use multiple shapes for the points of my PCA in ggplot2
If you fit the PCA as follows: ir.pca <- prcomp(iris[,1:4]) you can get the values of each component with: ir.pca$x So, to plot with ggplot, I would do this:…
-
4 votes
4 answers
1143 views
A: Add rownames as column using dplyr
The tibble package has a function called rownames_to_column that lets you do this: mtcars %>% rownames_to_column()
-
2 votes
2 answers
133 views
A: Direct (and elegant) solution to fix a dataset using dplyr
I couldn’t think of a way to do it in a single expression, but I don’t think the following arrangement is bad: motivo <- df %>% select(starts_with("motivo")) %>% gather(key, motivo) %>%…
-
5 votes
1 answer
125 views
A: Take the previous value with a condition in R
This can be done using dplyr with group_by and lag. For example: > base # A tibble: 17 × 4 CNPJ data chave x <dbl> <dbl> <dbl> <dbl> 1 77777 201602 77777 201602 2…
r • answered Daniel Falbel 12,504
-
4 votes
1 answer
511 views
A: I need to create a column with the ln of a variable in the same data frame
Just do: bd <- data.frame(PIB = PIB) bd$lnPIB <- log(bd$PIB) Done! Your data.frame called bd now has a column with the logarithm of GDP.…
-
1 votes
1 answer
255 views
A: I cannot run a command with the "rDEA" package
The problem in your code is probably in the data import. The variable Y that you create needs to be a data.frame with numeric columns or a numeric matrix. In this case, as far as I can see from the image,…
-
1 votes
1 answer
142 views
A: Naming a vector inside a for loop
The object models should be a list. You can check by running typeof(models). Even if it is a data.frame, it will also be a list (example: typeof(mtcars)). If the object is a list, you can access…
r • answered Daniel Falbel 12,504
-
1 votes
2 answers
251 views
A: Normality test for a sample of 12 to 15 thousand observations - R
You can use the Anderson-Darling test. Install the nortest package and run: library(nortest) ad.test(rnorm(5001)) # Anderson-Darling normality test # # data: rnorm(5001) # A = 0.2826, p-value = 0.6359…
r • answered Daniel Falbel 12,504
-
3 votes
1 answer
307 views
A: Barplot (bar graph) of numerical versus categorical variable in R
I would do it like this: library(tidyr) library(dplyr) library(ggplot2) dados %>% gather(intensidade, qtd, -Y) %>% group_by(intensidade) %>% summarise(media = weighted.mean(Y, qtd)) %>%…
r • answered Daniel Falbel 12,504
-
2 votes
1 answer
70 views
A: How do I replace an apply inside a for loop with a double apply?
One possible way is to generate all possible combinations and then use a regular apply. The combinations can be generated with something like expand.grid or with the purrr package. Consider…
-
1 votes
1 answer
78 views
A: Function to read file via ODBC
You are not returning the file from the function. Add the line return(arquivo) at the end of the function. When calling the function, use dados <- importa.sql()…
-
2 votes
1 answer
116 views
A: Save changes from an app uploaded to shinyapps.io
One possible way is to save all your data somewhere on every click of the salvar button. That place could be Amazon S3, a database, etc. But it cannot be the local disk. I…
-
2 votes
1 answer
83 views
A: Change output in Shiny only when I change value in numericInput()
You can generate this numericInput on the server side, so you can determine what the initial value should be. Example: library('shiny') ui <- fluidPage( numericInput('line', 'Line Choice:',…
-
8 votes
1 answer
469 views
Q: How to calculate the median when the data is in chunks?
Suppose my data is divided into 2 vectors: x <- c(100, 400, 120) y <- c(500, 112) I could calculate the median by joining the two vectors and then using the function median. median(c(x,y)) [1]…
-
3 votes
1 answer
806 views
A: Error: Object of type 'closure' is not subsettable
Functions in R work a little differently from what you may be used to. In your case, just change the first two lines of the Gauss function to: B <- GeraMatriz(n,a,b,q) v_h <-…
r • answered Daniel Falbel 12,504
-
21 votes
1 answer
1658 views
Q: How to document SQL code?
When I write R code, the correct way to document it is in the code itself, in comments that start with the special marker #'. #' Add together two numbers. #' #' @param x A number. #' @param…
-
3 votes
1 answer
650 views
A: Interaction graph in ggplot2
Writing programs where the variables vary can be quite tedious with dplyr and ggplot. Here’s a function that works for what you want: library(dplyr) library(ggplot2) library(lazyeval)…
-
5 votes
2 answers
1947 views
A: Join multiple files from a folder in R
I was able to reproduce your error using a spreadsheet with an empty column name. That’s probably your problem. To solve it, I would do this: arquivos <- lapply(larquivos, function(x) { df <-…
r • answered Daniel Falbel 12,504
-
3 votes
2 answers
256 views
A: Sort the k highest results using dplyr
Other ways to do the same thing: library(dplyr) mpg %>% arrange(desc(displ)) %>% slice(1:5) mpg %>% filter(row_number(desc(displ)) <= 5)
-
1 votes
1 answer
463 views
A: hnp function - R hnp package
In the help for the hnp function (pages 11 and 12 of this document), you can see the list of models that can be used. Models fitted with glmer are not on that list. From the help, you can also…
-
1 votes
1 answer
780 views
A: How to remove columns from a data frame?
One way to do this is with the select_if function from the dplyr package. First define a function that counts the number of zeros: contar_zeros <- function(x){ sum(x == 0) } Now consider this…
r • answered Daniel Falbel 12,504
-
4 votes
1 answer
302 views
A: How to increase the limit of ifelse in R?
One option is to use case_when from dplyr. I ran some tests and did not hit the limit on the number of cases that ifelse has. In your case, it would look like this: case_when( v0211==1 & v0212==2 ~…
-
1 votes
1 answer
252 views
A: How to make the graph start on the y-axis by ggplot?
To start from 0, just add the command ylim(0,NA). For example: ggplot(mtcars, aes(x = disp, y = mpg)) + geom_point() produces this graph: Whereas, with this term added: ggplot(mtcars, aes(x = disp, y…
-
2 votes
1 answer
171 views
A: Subset problem in R
To remove only the rows that are "Unclassified" and "Canceled", the easiest way is as follows. Consider this data.frame: df <- data.frame( x = c("Cancelada", "Outro", "Outro", "Cancelada"), y =…
r • answered Daniel Falbel 12,504
-
3 votes
1 answer
210 views
A: Plot the mean in R
With ggplot2, you can do this: library(dplyr) library(ggplot2) df <- data_frame(x = rnorm(100), y = x + rnorm(100)) ggplot(df, aes(x = x, y = y)) + geom_point() + geom_hline(aes(yintercept =…
r • answered Daniel Falbel 12,504
-
3 votes
1 answer
66 views
A: Why does rvest break when processing an empty file?
I will answer only this part: why does the error happen? When you read an empty file with the read_html function from the xml2 package using the code below: tf <- tempfile() file.create(tf) html_erro <-…
-
1 votes
2 answers
3494 views
A: Second-degree Polynomial Regression in R: How to Obtain X given Y?
I don’t know of any ready-made function for this, but the problem can be treated as an optimization problem. We want to find the x in a given range that minimizes a function. The function I intend to…
r • answered Daniel Falbel 12,504
-
1 votes
1 answer
64 views
A: Is it possible to train a model when I only have one of the mapped classes?
You can turn the problem into a binary one if your assumption holds that most of the unclassified cases are false, as you say in your question. (Ideally, there would be no positives among the…
machine-learning • answered Daniel Falbel 12,504
-
7 votes
2 answers
272 views
A: In R, what is the best way to select sets of internal lists within a list of lists?
I find the following a more concise way to do what you need: library(purrr) # for the map function library(tidyr) # for the unnest function library(dplyr) # for the as_data_frame function map(lista, ~map(.x,…
-
2 votes
2 answers
57 views
A: Use of ifelse with a matrix
The following code can be adapted to get the result you want: x <- matrix(rep(c(1, NA), 6), ncol = 2) y <- matrix(1:12, ncol = 2, nrow = 6) t(sapply(1:6, function(i, x, y) {…
r • answered Daniel Falbel 12,504
-
1 votes
1 answer
287 views
A: How to create a variable by averaging another variable from the same dataset?
You can use dplyr as follows: library(dplyr) df %>% group_by(country, period) %>% mutate(media = mean(pib)) Source: local data frame [12 x 5] Groups: country, period [6] country year…
r • answered Daniel Falbel 12,504
-
5 votes
1 answer
111 views
Q: Concatenate a new line
How do I add a new line to separate entries in a Perl script? Example: I’m using the following script: #!/usr/bin/perl print 'oi' . '\n' . 'oi,de novo' And I call it from bash as follows: perl…
perl • asked Daniel Falbel 12,504
-
1 votes
2 answers
228 views
A: Filtering data from a vector
I know the question has already been resolved, but as I understand it, you needed to remove some products from your dataset before drawing the boxplot, and the ids of those products were saved in another vector.…
r • answered Daniel Falbel 12,504
-
0 votes
2 answers
239 views
A: How to save an error message to a string in R?
One approach that can be useful is to wrap your function in the safely function from the purrr package. A function wrapped in safely always returns a list with two elements: the result and the…
r • answered Daniel Falbel 12,504
-
6 votes
2 answers
2782 views
A: Is there a difference between assigning value using '<-' or '=' in R?
Complementing Marcus' answer: an interesting point is the precedence of these operators. <- has higher precedence than =, which means that: > a <- b = 1 Error in a <- b = 1 : could not find…
-
3 votes
3 answers
34320 views
A: How to import Excel data into R?
Currently, the best way to import from Excel is to use the readxl package. Unlike the xlsx package, it does not depend on rJava, which makes it more portable and easier to install. You can…
-
1 votes
1 answer
42 views
A: Changing list by reference in R
When you write a for loop of the form for(i in x), R makes a copy of the object x and does not modify it on each iteration. A simple example: > x <- list(a = 1, b =1) > for(i in x){ + i <-…
-
2 votes
1 answer
1261 views
A: Color gradient in R
You can add a call to the function scale_colour_continuous. For example: library(ggplot2) dados <- data.frame(x = runif(100), y = runif(100), ano = rep(2010:2014, each = 20)) ggplot(dados, aes(x =…
-
1 votes
2 answers
696 views
A: Inconsistent numerical format
Because of the way your data was imported, some columns ended up with quotes in their names. This prevents the $ operator from working the way you expected. The best way to fix it is to re-import the dataset. But it is also…
r • answered Daniel Falbel 12,504
-
2 votes
1 answer
279 views
A: Build Accumulated Density Probability using R
The cumulative distribution function, in your case, is a function of x. Therefore, there is not just one cumulative density, but one for each possible x. With the following code you can view…
-
1 votes
1 answer
1347 views
A: How to check if a column (variable) exists within a 'data.frame'?
Your problem is related to non-standard evaluation. A minimal example of how to do this is: verificar_coluna <- function(data, coluna){ coluna_texto <- deparse(substitute(coluna))…
-
1 votes
1 answer
940 views
Q: What is the correct way to import functions from other packages in Python?
If I create a Python package, I will define a setup.py file that has more or less this format: from setuptools import setup setup(name='funniest', version='0.1', description='The funniest joke in…
python • asked Daniel Falbel 12,504
-
2 votes
1 answer
154 views
A: Integer overflow in R
From what I understand, what rmultinom does is this: you have length(prob) different types of balls, where prob is the prob parameter of the function. Then you draw size balls…
r • answered Daniel Falbel 12,504
-
3 votes
1 answer
168 views
A: Error in the optim function
In logistic regression, the function to be maximized is as follows: If you were to implement it in R, it would look like this (optim minimizes, hence the - in front). logL <- function(par,…
r • answered Daniel Falbel 12,504
-
3 votes
1 answer
647 views
A: Change facet_wrap order in ggplot2
Here is some code for this. ggplot orders the panels according to the order of the factor levels of the variable Subject; if it is not a factor, it is converted to one first. So what I did here was recreate the…
-
2 votes
1 answer
1316 views
A: Join 3 graphs using Plot
I was able to do it with ggplot2, based on the code I wrote for your other question: Changing charts of a regression. The code wasn’t pretty, but it worked. The ugly part is where I put those red lines, which…
-
2 votes
1 answer
84 views
A: Message at the end of na.omit
This message is not a problem. These are just attributes that the function attaches to the resulting matrix. A matrix with these attributes will continue to behave like a matrix. Just ignore them. That…
-
4 votes
2 answers
667 views
A: How to save a series of xls spreadsheets in csv using R?
If all these worksheets are in a single folder, you can use something like this. I’m guessing that all worksheets have the same format. plans <- list.files("caminho/da/pasta", full.names = T)…
r • answered Daniel Falbel 12,504
-
1 votes
1 answer
82 views
A: How to ignore links that do not fit the established conditions and continue with scraping?
In your case, I think an if is enough, for example by replacing the line where you add to the dataset with: if (length(titulo) == 1 & length(data_hora) == 1 & length(texto) == 1){ dados…
-
7 votes
5 answers
1065 views
A: Remove duplicated names with regular expression
It’s not pretty but it worked: library(stringr) rex <- ".*, [:alpha:]{1,}[A-Z]{1}" nomes_invertidos <- str_extract_all(presidentes, rex) %>% unlist() %>% str_sub(end = -2)…