Posts by Marcus Nunes • 17,915 points
372 posts
-
4 votes, 1 answer, 70 views
A: Generate multiple samples from the same vector
This problem can be solved with the function combn. Just pass the vector that holds your population (x) and the size of the samples to be generated (2): x <- c(1,2,3,4) t(combn(x, m=2)) [,1] [,2] [1,] 1…
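A minimal sketch of the combn call described above, using the same population; the object name amostras is only illustrative:

```r
# Population, as in the answer above
x <- c(1, 2, 3, 4)

# combn() returns one combination per column; t() puts one per row
amostras <- t(combn(x, m = 2))
print(amostras)

# choose(4, 2) samples of size 2, without replacement and ignoring order
nrow(amostras) == choose(4, 2)
```

Each row of the result is one sample of size 2.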
Tag: r, answered by Marcus Nunes (17,915)
-
8 votes, 1 answer, 183 views
A: What is a Shiny app?
What is Shiny? shiny is an R package that provides a web framework for the programmer. A web framework is a set of pre-established instructions that facilitate the creation of web pages. This…
-
4 votes, 2 answers, 49 views
A: Extract the vectors named in a vector of names and merge them into a single vector
First I will create the data to be used in this problem: teste <- c("vetor_arte", "vetor_rio", "vetor_parque") vetor_arte <- c(1, 3, 4, 11, 12, 13, 14, 16, 29, 30, 41) vetor_rio <- c(6, 7,…
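One possible way to merge vectors referenced by name is mget, sketched below. This is an assumption about the approach, not necessarily the one used in the answer, and the short vectors are hypothetical stand-ins for the truncated data:

```r
# Names of the vectors to merge
teste <- c("vetor_arte", "vetor_rio", "vetor_parque")

# Hypothetical, shortened values (the original data is truncated above)
vetor_arte   <- c(1, 3, 4)
vetor_rio    <- c(6, 7)
vetor_parque <- c(2, 5)

# mget() looks the objects up by name and returns them as a list;
# unlist() flattens that list into a single vector
todos <- unlist(mget(teste, envir = environment()), use.names = FALSE)
print(todos)
```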
-
5 votes, 1 answer, 3753 views
A: How to change the y-axis scale on a line chart
One way to do this is by creating a vector with the function seq. As the name suggests, the function seq creates a sequence of numbers. Simply enter the initial value, the final value and the…
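A short sketch of seq, with hypothetical limits and step. The resulting vector could then be handed to a plotting function, for example as the breaks argument of ggplot2's scale_y_continuous() or the at argument of base axis():

```r
# Sequence from 0 to 100 in steps of 20 (values chosen only for illustration)
quebras <- seq(from = 0, to = 100, by = 20)
print(quebras)

# Alternatively, fix the number of points instead of the step
cinco_pontos <- seq(from = 0, to = 1, length.out = 5)
print(cinco_pontos)
```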
-
23 votes, 1 answer, 510 views
Q: Why learn different algorithms that solve the same problem?
I don’t have training in computer science. For example, whenever I want to sort a numeric vector x in one of the programming languages I use, I just run sort(x) and everything is solved. However, the…
-
4 votes, 2 answers, 1524 views
A: Change <Chr> to number in R
The problem is that R understands that dadosarrumados[, c(4, 5)] is a list: is.list(dadosarrumados[, c(4, 5)]) [1] TRUE One way to solve this problem is to undo the list and then convert to…
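A minimal sketch of the situation, using a small hypothetical data frame (dados and its columns are invented for illustration). It shows the unlist route mentioned above and, as an alternative, a column-by-column conversion with lapply:

```r
# Hypothetical data: two numeric-looking columns stored as character
dados <- data.frame(id = 1:3,
                    peso = c("1.5", "2.0", "3.25"),
                    altura = c("10", "20", "30"),
                    stringsAsFactors = FALSE)

# A selection of several columns is still a data frame, hence a list
is.list(dados[, c(2, 3)])

# Undoing the list gives one flat vector that as.numeric() accepts
plano <- as.numeric(unlist(dados[, c(2, 3)]))

# Alternative that keeps the columns separate: convert each one in place
dados[, c(2, 3)] <- lapply(dados[, c(2, 3)], as.numeric)
str(dados)
```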
Tag: r, answered by Marcus Nunes (17,915)
-
7 votes, 2 answers, 164 views
A: Remove all Environment elements containing numbers and uppercase letters
You can do that in R as well, with the following code: ls() [1] "vector" "vector1" "vector86" "vectorA" "vectorU" The function ls() lists all objects in the environment, and its result is a…
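A sketch of one way to finish the job, assuming the selection is done with a regular expression over the output of ls(). To avoid touching the real workspace, the example works in a scratch environment (amb, a hypothetical name):

```r
# Scratch environment holding the object names from the example above
amb <- new.env()
for (nome in c("vector", "vector1", "vector86", "vectorA", "vectorU")) {
  assign(nome, 0, envir = amb)
}

# Names containing at least one digit or one uppercase letter
alvo <- grep("[[:digit:][:upper:]]", ls(amb), value = TRUE)

# rm() accepts a character vector of names through its list argument
rm(list = alvo, envir = amb)
ls(amb)
```

In the global environment the same idea would use ls() and rm(list = ...) without the envir argument.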
Tag: r, answered by Marcus Nunes (17,915)
-
6 votes, 3 answers, 158 views
A: Earliest date in a dataset
This problem is very easy to solve using the package dplyr. The first thing to do is convert the column DATA to a date, so that R can establish an order relation for it. I’ll just copy your original…
Tag: r, answered by Marcus Nunes (17,915)
-
4 votes, 1 answer, 115 views
A: Customize x-axis values (abscissas) in the geom_smooth or geom_ribbon functions of the ggplot2 package
I didn’t understand why you turned TEMPO, which is a continuous variable, into a categorical variable with the function factor. Note that by leaving TEMPO as it should be, I can use scale_x_continuous…
-
3 votes, 2 answers, 87 views
A: Separate data from a variable
The function str_sub from the package stringr allows us to split a string in R according to the number of characters and their respective positions. For example, for Açucena - MG, we have 12…
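A small sketch of position-based splitting with stringr's str_sub, using the Açucena - MG example. The object names are illustrative, and the positions assume the "city - UF" layout with a two-letter state code at the end:

```r
library(stringr)

# "Açucena - MG", written with an escape so the encoding is unambiguous
cidade_uf <- "A\u00e7ucena - MG"
str_length(cidade_uf)

# Negative positions count from the end of the string
uf <- str_sub(cidade_uf, start = -2, end = -1)

# Drop the trailing " - MG" (5 characters) to keep only the city
cidade <- str_sub(cidade_uf, start = 1, end = -6)

print(c(cidade, uf))
```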
Tag: r, answered by Marcus Nunes (17,915)
-
7 votes, 1 answer, 1149 views
A: How to calculate percentage change with 3 variables in R
One way to do this with a few lines of code is through the package dplyr. In addition, I recommend studying it and the tidyverse if you want to learn to manipulate data effectively in R. I…
-
8 votes, 1 answer, 307 views
A: How to fill column charts with hatching using ggplot2
Use the package ggtextures, available at this link. devtools::install_github("clauswilke/ggtextures") library(ggplot2) library(ggtextures) images = c( compact =…
-
1 vote, 1 answer, 99 views
A: Question about creating a machine learning model in Python
X_train must contain all the variables needed to predict the value of Y_train. If Y_train has only the column PSS_Stress, then X_train will have all other columns of your dataset, except for…
-
4 votes, 1 answer, 118 views
A: Cluster analysis by groups
This function is not suitable for this action, at least not the way it is being used here. The trick is to use the function nest from the package tidyr: library(dplyr) library(tidyr) cluster <- dataset…
Tag: r, answered by Marcus Nunes (17,915)
-
8 votes, 1 answer, 91 views
A: How to solve the 53 categories limit of R randomForest?
First, ask yourself if you really need a categorical variable with this many levels. When splitting a factor with n levels, the random forest evaluates 2^n - 2 possible splits of this…
-
4 votes, 1 answer, 359 views
A: What is rank-deficient and how to get around that?
The general linear regression formula is given by [equation]. It can be represented in matrix form through the relation [equation], where Y and ε are vectors of n elements and X is a matrix given by [matrix]. The least…
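A design matrix X is rank-deficient when one of its columns is a linear combination of the others. The sketch below illustrates this in base R with simulated data; all names (x1, x2, y, ajuste) are hypothetical:

```r
set.seed(1)

x1 <- rnorm(20)
x2 <- 2 * x1                      # exact linear combination of x1
y  <- 1 + 3 * x1 + rnorm(20)

# Design matrix with an intercept column
X <- cbind(1, x1, x2)

# The rank is smaller than the number of columns
qr(X)$rank
ncol(X)

# lm() cannot estimate the redundant coefficient and reports it as NA
ajuste <- lm(y ~ x1 + x2)
coef(ajuste)
```

Dropping the redundant column (here x2) restores a full-rank design.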
-
5 votes, 1 answer, 320 views
A: Recursion in R error
Your function has a logic problem. If v is null, it’s all right: it stops. But if v is not null, it will keep recursing indefinitely. Suppose, as in your example, that v <- c(1,2,3,5): v <-…
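For comparison, a recursive function terminates only if each call moves towards a stopping condition that is actually reached. The function below is a hypothetical example (soma_rec is not the function from the question) that sums a vector recursively:

```r
# Hypothetical recursive sum: each call drops the first element,
# so the vector eventually becomes empty and the recursion stops
soma_rec <- function(v) {
  # Base case. Note that v[-1] on a one-element vector yields a
  # zero-length vector, not NULL, so the test is on length()
  if (length(v) == 0) {
    return(0)
  }
  v[1] + soma_rec(v[-1])
}

soma_rec(c(1, 2, 3, 5))
```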
-
5 votes, 1 answer, 1124 views
A: What is the difference between numeric and integer vectors?
Why does this happen? R has two classes of numbers: integer and numeric. The class integer is only used to store whole numbers, while numeric is used to store real numbers (although, if I…
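A quick base R illustration of the two classes; the L suffix is how an integer literal is written:

```r
a <- 1L   # integer
b <- 1    # numeric (stored as double)

class(a)
class(b)
typeof(b)

# Equal in value, but not identical objects because the types differ
a == b
identical(a, b)

# Mixing the two promotes the result to numeric
class(a + b)
```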
Tag: r, answered by Marcus Nunes (17,915)
-
4 votes, 1 answer, 369 views
A: How to run str_detect (stringr) for more than one variable and at once?
There is an error in the code. The first parenthesis is closing at the end of the second filter. I use asterisks to highlight this in the code below: library(dplyr) library(stringr)…
-
4 votes, 1 answer, 147 views
A: Error importing decimal numbers into R
The function read_delim cannot handle numbers written with a decimal comma. Use the function read.csv with the argument dec=",": read.csv(file="arquivo.csv", dec=",", sep=";") Note that I also used the argument sep=";"…
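A self-contained sketch of the read.csv call above. Instead of arquivo.csv, it reads a small inline text (hypothetical content) through the text argument, so it runs without any file on disk:

```r
# Hypothetical file content: semicolon-separated, comma as decimal mark
texto <- "nome;valor\na;1,5\nb;2,25"

dados <- read.csv(text = texto, dec = ",", sep = ";")

# The column valor is read as numeric, not as character
str(dados)
```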
Tag: r, answered by Marcus Nunes (17,915)
-
5 votes, 1 answer, 80 views
A: Filter rows without knowing the column name in R
Use the function which. See the data set below: library(ggplot2) mpg # A tibble: 234 x 11 manufacturer model displ year cyl trans drv cty hwy fl class <chr> <chr> <dbl> <int>…
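A sketch of how which can locate a value without naming the column, assuming the goal is to keep every row in which the value appears in any column. The data frame and the searched value are hypothetical, and this may differ from the exact call used in the answer:

```r
# Hypothetical data: the value "x" may appear in any column
dados <- data.frame(a = c("x", "y", "z"),
                    b = c("y", "w", "x"),
                    stringsAsFactors = FALSE)

# Comparing the whole data frame gives a logical matrix;
# arr.ind = TRUE returns the row and column of every match
posicoes <- which(dados == "x", arr.ind = TRUE)

# Keep each matching row once
linhas <- sort(unique(posicoes[, "row"]))
dados[linhas, ]
```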
Tag: r, answered by Marcus Nunes (17,915)
-
2 votes, 1 answer, 143 views
A: Optimization of R code
From what I’ve analyzed of your code, your suspicion is correct: the function perm is what is slowing down the code. I came to this conclusion by timing the execution of the function topswops:…
Tag: r, answered by Marcus Nunes (17,915)
-
7 votes, 1 answer, 377 views
Q: Size of panels with facet_wrap
I’m making some panel charts with ggplot2. See the example below: library(ggplot2) ggplot(mpg, aes(x=displ, y=hwy)) + geom_point() + geom_smooth(method="lm", se=FALSE, colour="black") + facet_wrap(~…
-
5 votes, 1 answer, 369 views
A: Website with R Markdown
Apparently your code is correct. I ran it on my PC (it’s a Mac, actually) and, except for the special characters (as in music, for example), everything went well: This caught my attention about a…
-
4 votes, 1 answer, 58 views
A: Lapply does not return the desired result for some functions
As Rui’s comment rightly points out, the function shapiro.test is only defined for vectors. But nothing prevents us from creating a version of it that can be applied to columns of data frames:…
-
1 vote, 1 answer, 39 views
A: Use lapply (replacement for) to leave only columns in common for multiple dataframes in a new list
Use the package plyr to merge the original list into a data frame: library(plyr) res <- ldply(dados, data.frame) res is a data frame with 3 columns: a, b and c. As b and c are not present in…
-
6 votes, 1 answer, 146 views
A: How to create a loop that turns columns into variables and returns shapiro.test at the end?
R is not a very good language for loops like for and while. Depending on the number of replications and their complexity, execution may become very slow. However, it has some functions that…
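A minimal sketch of the loop-free style, assuming the functions alluded to are those of the apply family. It runs shapiro.test on every column of a simulated data frame (dados and its columns are hypothetical) and collects the p-values:

```r
set.seed(42)

# Hypothetical data: three numeric columns with different distributions
dados <- data.frame(a = rnorm(30), b = runif(30), c = rexp(30))

# sapply() applies the function to each column and simplifies the
# result to a named numeric vector, with no explicit for loop
p_valores <- sapply(dados, function(coluna) shapiro.test(coluna)$p.value)
print(p_valores)
```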
Tag: r, answered by Marcus Nunes (17,915)
-
6 votes, 2 answers, 632 views
A: ks.test and p-value < 2.2e-16
What I write here probably won’t completely answer the question, but the comment space is too small for what I have to say. It does not seem correct to raise the hypothesis that these data are…
Tag: r, answered by Marcus Nunes (17,915)
-
3 votes, 1 answer, 415 views
A: How to change dates from the American format y/m/d to d/m/y in an R dataframe?
The function dmy only parses dates in day/month/year format. Note that the Date column has a time in addition to the date. Use the function dmy_hms, in which hms means hour, minute and second…
Tag: r, answered by Marcus Nunes (17,915)
-
2 votes, 2 answers, 53 views
A: Generate multiple graphics in a loop using X11() and two different indices in R
Each x11() is a new graphical window. Therefore, just open a new window whenever the previous graphics are all plotted. The same goes for par(mfrow=c(3,2)). This command only serves to define the…
Tag: r, answered by Marcus Nunes (17,915)
-
1 vote, 1 answer, 49 views
A: Complete Separation in Hurdle Model
Short answer: No. The likelihood function cannot be maximized, and this will affect the estimation of the parameters of the logistic part of the model. Not-so-short answer: It depends. It…
Tag: r, answered by Marcus Nunes (17,915)
-
2 votes, 2 answers, 49 views
A: Pulling elements from one list to another under a criterion in R
Use the operator %in%. When used as x %in% y, it checks x against y. It returns a logical vector with the same length as x, indicating which elements of x are present in y. In practice, see what happens in…
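A short base R illustration with hypothetical vectors. %in% answers whether each element of x occurs in y, while match gives the position in y:

```r
x <- c(2, 5, 9)
y <- c(9, 1, 2, 7)

# One logical value per element of x
pertence <- x %in% y
print(pertence)

# Keep only the elements of x that are also in y
comuns <- x[pertence]
print(comuns)

# Positions in y where each element of x is found (NA when absent)
posicoes <- match(x, y)
print(posicoes)
```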
Tag: r, answered by Marcus Nunes (17,915)
-
2 votes, 1 answer, 320 views
A: Warning: "In sqrt(diag(object$vcov)): NaNs produced" in Hurdle Model
Generalized linear models don’t do magic. It’s no use having data, trying to fit a model to them and believing that everything will work out. Also, it is very difficult (perhaps impossible) to…
-
6 votes, 2 answers, 1529 views
A: Remove NA in a Data Frame
Assuming the dataset is called dados, run the following command: write.csv(dados, file="NomeDoArquivo.csv", na=" ", row.names=FALSE, quote=FALSE) in which na=" " says that all NA in dados shall be…
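A self-contained sketch of the same write.csv call, with a small hypothetical dados and a temporary file instead of NomeDoArquivo.csv, so that the written lines can be inspected:

```r
# Hypothetical data with missing values in both columns
dados <- data.frame(a = c(1, NA), b = c("x", NA))

# Temporary file used only for this illustration
arquivo <- tempfile(fileext = ".csv")

write.csv(dados, file = arquivo, na = " ", row.names = FALSE, quote = FALSE)

# Every NA was written as a single blank space
linhas <- readLines(arquivo)
print(linhas)
```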
-
4 votes, 1 answer, 1073 views
A: Space between title and table
The command \vspace adds or removes vertical spacing in LaTeX. If a positive number is used, it adds space. If a negative number is used, it removes space. In the case of your table, I removed 5mm…
Tag: latex, answered by Marcus Nunes (17,915)
-
1 vote, 1 answer, 250 views
A: Error generating graph with package ggplot2 in R
The main problem was referring to the variables to be plotted as strings, by placing them in quotes. Note that in my code below I refer to them directly. ggplot(gpv, aes(x=INT, y=PV, fill=INT)) +…
-
3 votes, 1 answer, 1242 views
A: Removing decimal places in regulartable() from the flextable package {R}
Apparently flextable does not allow you to format the number of decimal places in numeric output. But nothing prevents us from using the function formatC for that: library(tidyverse)…
-
1 vote, 1 answer, 1088 views
Q: Copy batch files by renaming them with the original directory name
Suppose I have a directory structure organized as follows on my PC: Diretorio 01 Arquivo 01.jpg Arquivo 02.jpg Arquivo 03.jpg Arquivo 04.jpg Diretorio 02 Arquivo 01.jpg Arquivo 02.jpg Diretorio 03…
-
5 votes, 1 answer, 231 views
A: Alternative to MiKTeX
Unlike on Linux and macOS, LaTeX distributions for Windows are rubbish. There is almost always some conflict, and it is very difficult to fix without access to a good terminal, something that Windows does not…
-
3 votes, 1 answer, 248 views
A: Breaking lines in colnames or rownames
I recommend using the package kableExtra to format tables in knitr or Sweave. It works in conjunction with the function kable from the package knitr and the results are very nice. See below: Follow the code…
Tag: r, answered by Marcus Nunes (17,915)
-
2 votes, 1 answer, 35 views
A: Take previous values of a variable if the current value is 0 with a condition using dplyr in R
Your intuition was correct. Yes, you can use dplyr: dados %>% filter(alto=="s") %>% group_by(CNPJ) %>% mutate(dataquebra2 = max(dataquebra)) # A tibble: 16 x 5 # Groups: CNPJ [5] CNPJ…
-
4 votes, 2 answers, 1338 views
A: r - sum of a variable relative to the values of another variable in a data frame
Another way to do this is with the package dplyr: library(dplyr) dados %>% group_by(campanha, especie) %>% summarise(sum(frequencia)) # A tibble: 6 x 3 # Groups: campanha [?] campanha especie…
-
4 votes, 1 answer, 159 views
A: Use of summarySE function for graph construction in ggplot2
This problem has no solution with the provided dataset. The help for the function summarySE says the following (emphasis mine): Gives count, mean, standard deviation, standard error of the mean, and confidence…
-
5 votes, 1 answer, 724 views
A: Barplot in RStudio - 2 Columns
The easiest way to solve this is with the package ggplot2. But first we need to put the data in the so-called long format, using the function melt from the package reshape2: library(reshape2)…
-
3 votes, 1 answer, 65 views
A: Graph does not appear in the output
The problem is on the line curve(poder.media(n,x),40,60,xlab=expression(mu),ylab = "Poder",add=TRUE,col="red") By setting the argument add=TRUE, R is told that it needs to add the result of…
-
5 votes, 1 answer, 323 views
A: What do %% and %any% do in R?
The operator %% is the modulo operator, in the sense of modular arithmetic. Note that the result of 17 %% 3 is 2, because 17 = 3*5 + 2. Also, see that the result of 17 %/% 3 is 5, complementing the…
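The arithmetic above can be checked directly in base R. The last lines also sketch a user-defined infix operator, which is what the %any% notation in R's documentation refers to; the operator %+% is a hypothetical example, not a built-in:

```r
resto     <- 17 %% 3    # remainder of the division
quociente <- 17 %/% 3   # integer quotient

# The two results reconstruct the original number
3 * quociente + resto == 17

# Any function assigned to a name of the form %name% becomes an infix operator
"%+%" <- function(a, b) paste0(a, b)
junto <- "ab" %+% "cd"
print(junto)
```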
-
4 votes, 2 answers, 1017 views
A: How to add a legend to a chart with two y axes in R?
I think the code below solves the problem satisfactorily, at least according to my criteria. ggplot(obs, aes(x = Mes)) + geom_bar(aes(y = Acid, color="Bruto"), fill="darkblue", stat = "identity",…
-
11 votes, 2 answers, 1974 views
A: Modify gradient colors in graphs in ggplot2
Use the function scale_fill_gradient for this: ggplot(data=dados, aes(x=Freq, y=Tipo, fill=Freq)) + geom_label(label=rownames(dados), color="black", size=3) + labs(x = "Frequência", y = "Tipo") +…
-
4 votes, 1 answer, 248 views
A: OSMAR library (OpenStreetMap) error
This error is a problem accessing the OpenStreetMap server where the data is stored. Basically, the package is outdated and tries to access it using the http protocol, but currently only accesses…
Tag: r, answered by Marcus Nunes (17,915)
-
3 votes, 1 answer, 60 views
A: How to make two loops for the same code, varying only the column of the courses and the columns of the questions?
I think a better way than using loops is to use packages like dplyr and reshape2: library(dplyr) library(reshape2) df %>% melt(id="cursos") %>% dcast(value ~ variable + cursos,…