Posts by Marcus Nunes • 17,915 points
372 posts
-
3 votes
1 answer
495 views
A: How to adjust the size of an already embedded geom_plot to another ggplot2 chart?
Use the arguments vp.width and vp.height within the aes of geom_plot. Both arguments range from 0 to 1, where 0 is the lowest possible value for the inserted chart and 1 is an…
-
4 votes
1 answer
42 views
A: Create new rows and columns for non-existent values
The first thing I would do is tell R which values anos can assume. In this case, these values go from 1988 to 2014. For me, the best way to do this is by converting the column anos in…
-
1 vote
1 answer
199 views
A: R update on Mac
Depending on the type of update to be made, there is no need to reinstall any package in R. For example, if the update is from version 3.x.0 to 3.x.1 (that is, if the second number is constant),…
-
5 votes
1 answer
121 views
A: How to plot two geom_line (one from each group) in a single plot in ggplot2
ggplot2 gets lost in your graph with two groups because, in reality, you have two grouping variables in your problem: grupo (levels A and B) and variable (levels v1 to v5). One way to…
-
2 votes
1 answer
142 views
A: Create table Stargazer function mq in R
Note that the function mq never returns an object with the values of the Ljung-Box statistics. All results are printed directly to the console via the command printCoefmat: function (x, lag = 24,…
[r] answered Marcus Nunes 17,915
-
4 votes
2 answers
317 views
A: Read error with the fread function of the data.table package
The file is incorrectly filled in at the source. As far as I know, it will be impossible to read it in R as it is. I discovered this when running, in the terminal, the command cat…
[r] answered Marcus Nunes 17,915
-
3 votes
2 answers
140 views
A: Delete lines with a specific string
Every position of a data frame df in R can be accessed via the command df[x, y], in which x is the row of interest and y is the column of interest. However, when running df[x, ], without specifying…
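The df[x, y] indexing described above can be sketched as follows. The data frame, its column names and the string "remove" are invented for illustration, and the use of grepl is an assumption, since the original answer is cut off before it shows its own code.

```r
# Hypothetical data frame, invented for this illustration
df <- data.frame(
  id   = 1:4,
  nota = c("keep", "remove me", "keep too", "please remove"),
  stringsAsFactors = FALSE
)

# df[x, y]: row 2, column "nota"
df[2, "nota"]

# df[x, ]: leaving y empty returns every column of the chosen row
df[2, ]

# Logical vector marking the rows whose "nota" contains the string "remove"
tem_string <- grepl("remove", df$nota, fixed = TRUE)

# Keep only the rows that do NOT contain the string
df_limpo <- df[!tem_string, ]
df_limpo
```

Negating the logical vector, rather than using negative row numbers, also behaves well when no row matches: all rows are kept.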
[r] answered Marcus Nunes 17,915
-
6 votes
1 answer
532 views
A: Problems to generate pdf file and equations in the file via R Markdown
The problem is in the LaTeX notation. Change your formulas to $$\sqrt{\frac{a}{b}}$$ $$\forall x \exists y(F(x,y)) \to Q(y,x))$$ $$s = \sqrt{\frac {\sum_{i=1}^N(x- \bar{x})^2} {N -1} }$$ Note that the…
-
5 votes
2 answers
104 views
A: R straight location of x in y
I don’t see an automated solution for this, for two reasons. You do not have (or at least did not provide) a function that relates x and dado. Therefore, it is impossible to calculate dado from x.…
[r] answered Marcus Nunes 17,915
-
4 votes
3 answers
2072 views
A: How to join two data.frames of different sizes per column in R?
Assuming that the data frames already have a column that serves as identification, you can use dplyr::full_join directly. Taking advantage of the simulation presented in this other answer, we…
[r] answered Marcus Nunes 17,915
-
2 votes
1 answer
545 views
A: Create new column from existing column in R
The code below does what is desired. library(dplyr) tab_mini <- head(tab) tab_mini %>% mutate(Cd_Disciplina_Simples = sub("^([[:alpha:]]*).*", "\\1", Cd_Disciplina)) %>% mutate(Curso =…
-
7 votes
1 answer
260 views
Q: What is Data Science?
The Stack Overflow network has a site called Data Science SE. As an active member of the R tag, I notice that there are few discussions on this subject elsewhere in the Brazilian version of…
-
4 votes
2 answers
290 views
A: Create a bar graph in ggplot2 with juxtaposed columns
The secret is to put the data in long format. One way to do this is by using the package reshape2: library(reshape2) library(ggplot2) Mean_2013 <-…
-
3 votes
1 answer
177 views
A: Error in the application of the Boxcox function of the MASS package
In addition to the model fitted to the data, it is necessary to tell the function MASS::boxcox where this data is stored: boxcox(m0, data = df) Error in boxcox.default(m0, data = df) : response…
[r] answered Marcus Nunes 17,915
-
3 votes
1 answer
114 views
A: Problems in the covariance structure
The first thing I would do, before proceeding with the analysis, is to organize the dataset. There is no need to use attach or to convert variables every time a different model is fitted:…
-
5 votes
2 answers
291 views
A: Convert written numbers with thousands separator to numeric value in R
It is also possible to solve the problem with base R: a <- as.character("353.636.000.000") as.numeric(gsub("\\.", "", a)) ## [1] 3.53636e+11 The function gsub is equivalent to a command of…
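A slightly extended sketch of the gsub approach above, applied to a vector. The extra values "1.234" and "56" are made up for illustration; only the first string comes from the answer.

```r
# Numbers written with "." as thousands separator (last two are invented)
a <- c("353.636.000.000", "1.234", "56")

# "\\." matches a literal dot; an unescaped "." would match any character
limpo <- gsub("\\.", "", a)
valores <- as.numeric(limpo)

# fixed = TRUE treats the pattern as plain text, so no escaping is needed
valores_fixed <- as.numeric(gsub(".", "", a, fixed = TRUE))

# Print without scientific notation to check the result visually
format(valores, scientific = FALSE, trim = TRUE)
```

Both calls give the same numbers; the fixed = TRUE form is simply harder to get wrong when the separator is a regex metacharacter.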
[r] answered Marcus Nunes 17,915
-
3 votes
1 answer
465 views
A: Change colors of the scale of a bubble chart in R
I’ll start with the last question: "I would also like to know why the graph is taking the axes out of order". Despite the result below, the axes do not have values out of order. Note that when running…
-
5 votes
1 answer
51 views
A: How to store the password in the R script in the package 'Encryptr'?
It is impossible to do that, at least at the time I am writing this answer. If you go to the package page on GitHub, specifically to the issues section, you will see that one of the programmers replied the…
-
1 vote
1 answer
118 views
A: How to use external functions within the server using Shiny?
Just put source() at the beginning of the code. Let histogramaVermelho.R be a file with a function that makes a red histogram: histogramaVermelho <- function(x, breaks = 10){ hist(x, breaks,…
-
1 vote
1 answer
74 views
A: Generate PDF with pdflatex
The message ! LaTeX Error: File `titlesec.sty' not found. informs that the package titlesec is requested in the document preamble but is not installed. Remove it by deleting the line…
-
1 vote
1 answer
730 views
A: How to calculate amplitude of each class in R?
By definition, the classes of a histogram constructed through the Sturges method will always have the same amplitude (width). Therefore, it is sufficient to take the total amplitude (range) of the sample, given by…
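A minimal sketch of the calculation described above, assuming a simulated sample x (not the asker's data). The number of classes comes from nclass.Sturges, and the common class amplitude is the total range divided by that number.

```r
set.seed(1)
x <- rnorm(100)                    # hypothetical sample, for illustration only

k <- nclass.Sturges(x)             # ceiling(log2(length(x)) + 1)
amplitude_total <- diff(range(x))  # max(x) - min(x)
amplitude_classe <- amplitude_total / k

# Class limits implied by k equal-width classes
limites <- seq(min(x), max(x), length.out = k + 1)

k
amplitude_classe
```

Note that hist(x) will not necessarily use these exact limits: it passes the Sturges class count to pretty(), which rounds the breaks to convenient values, so the width it reports can differ from amplitude_classe.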
-
2 votes
1 answer
27 views
A: Unstructured limit notation when it is in an exponent
Use \limits right after \lim: e^{ \lim\limits_{x \to + \infty} \left( { \ln \left( \dfrac{2x +3 }{2x + 1} \right) \cdot \ x } \right) } Personally, when the exponent is complicated like this, I…
[latex] answered Marcus Nunes 17,915
-
4 votes
2 answers
1868 views
A: How to position the title in ggplot2 with theme_ipsum?
Add the line theme(plot.title = element_text(hjust = 0.5)) at the end of your code: (graf_B <- ggplot(dados, aes(x ="", y=Freq, fill=B)) + geom_bar(width = 1, stat = "identity") +…
-
5 votes
1 answer
44 views
A: How to verify the presence of each element of a vector, line by line in a matrix in R?
The operator %in% does exactly what is sought. Let the matrix a be given by a <- matrix(c(1, 5, 2, 2, 1, 2, 3, 4, 1, 1, 3, 8, 9, 6, 7), nrow = 3, byrow = TRUE) and let v be the vector with elements 1 to…
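One way to apply %in% row by row is sketched below. The matrix a is the one from the answer; the vector v is an assumption (1:3), because the snippet is cut off before its last element, and the use of apply is likewise only one possible approach.

```r
# Matrix taken from the answer above
a <- matrix(c(1, 5, 2, 2, 1,
              2, 3, 4, 1, 1,
              3, 8, 9, 6, 7), nrow = 3, byrow = TRUE)

# Assumed vector: the original snippet does not show where v ends
v <- 1:3

# For each row of a, check which elements of v appear in that row.
# apply() returns one column per row of a, so transpose the result:
# rows of presenca correspond to rows of a, columns to elements of v.
presenca <- t(apply(a, 1, function(linha) v %in% linha))
presenca
```

With these values the first row of a contains 1 and 2 but not 3, the second row contains all three, and the third row contains only 3.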
[r] answered Marcus Nunes 17,915
-
5 votes
1 answer
207 views
A: How to select the last 3 columns in a data frame in R?
The base R function tail selects the last n observations of a vector. For example, tail(letters, 10) ## [1] "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" selects the last 10 letters of the…
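The same idea can be carried over to column names. This sketch uses the built-in mtcars data set as a stand-in for the asker's data frame, which is not shown in the snippet.

```r
# tail() on a vector returns its last n elements
tail(letters, 3)

# Applied to the column names, it gives the names of the last 3 columns
ultimas <- tail(names(mtcars), 3)
ultimas

# Use those names to subset the data frame
mtcars_ultimas <- mtcars[, ultimas]
head(mtcars_ultimas)

# Equivalent selection by position instead of by name
mtcars_pos <- mtcars[, tail(seq_along(mtcars), 3)]
```

Selecting by position is a little more robust when column names are duplicated; selecting by name reads more clearly in most other cases.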
[r] answered Marcus Nunes 17,915
-
1 vote
1 answer
159 views
A: Selecting logistic regression variables
Try fitting a logistic regression model with all predictor variables simultaneously: modelo <- glm(desfechouti ~ imc + as.factor(local) + as.factor(readm), data = data2, family = binomial())…
[r] answered Marcus Nunes 17,915
-
4 votes
1 answer
49 views
A: How to change the cut point (cut-off) in glm function?
After fitting the model, create a vector with the predicted probabilities. For example, predict(my, newdata = dataset, type = "response") ## 1 2 3 4 5 6 7 8 ## 0.5939177 0.5464655 0.6365774…
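A self-contained sketch of the step described above: predicted probabilities first, then a classification with a chosen cut-off. The model (am ~ wt on mtcars) and the cut-off of 0.7 are assumptions made for illustration; they are not the asker's model or data.

```r
# Hypothetical logistic regression on a built-in data set
modelo <- glm(am ~ wt, data = mtcars, family = binomial())

# Vector of predicted probabilities on the response scale
prob <- predict(modelo, newdata = mtcars, type = "response")

# Arbitrary cut-off chosen for this example (the usual default is 0.5)
corte <- 0.7
classe <- ifelse(prob > corte, 1, 0)

# Confusion table: predicted class versus observed class
table(previsto = classe, observado = mtcars$am)
```

glm itself has no cut-off argument: it only estimates probabilities, so the threshold is applied afterwards and can be changed without refitting the model.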
[r] answered Marcus Nunes 17,915
-
1 vote
1 answer
139 views
A: Cross Validation following the dataset
The choice of the data is random. If the argument method is not defined, the resampling method used is the bootstrap. By definition, bootstrap is a random (i.e., non-sequential) method of…
-
4 votes
1 answer
123 views
A: Equivalence to kmeans inside Caret::Train
caret is the acronym for Classification And REgression Training. By definition, it is a package that provides algorithms for data classification and regression. Classification is what we call a…
-
2 votes
1 answer
371 views
A: r Grouping columns and calculating averages in all columns
Create a new column with counts using mutate. Next, use summarise_each to tell which function should be applied to each variable, except the first. library(dplyr) temp %>% mutate(n = n()) %>%…
-
3 votes
1 answer
160 views
A: How to decompose a time series using a frequency of 6 months?
This can be done computationally, but it doesn’t make mathematical sense. Imagine that you have the average monthly temperatures of Porto Alegre, a city with well-defined summer and winter. But you…
[r] answered Marcus Nunes 17,915
-
2 votes
1 answer
198 views
Q: Difference in Principal Component Analysis (PCA) graphs
Today I was analyzing a data set and noticed something I had never noticed before. In order to visualize a multivariate data set, I computed its PCA and projected the observations onto the two main…
-
3 votes
1 answer
49 views
A: Select vector only with dates shared by all matrices in the list
Your code is correct. I tried another way to compute the intersection between the dates and got the same 841 observations that you got: datas.comuns <- as.Date(Reduce(intersect, lapply(lista, `[[`,…
[r] answered Marcus Nunes 17,915
-
3 votes
1 answer
83 views
A: Download books in Portuguese gutenbergr
This is a problem of character encoding. Project Gutenberg uses latin1, but R thinks it’s UTF-8 and then throws this error. The good part is that it is easily solved: just convert from one…
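The conversion mentioned above can be illustrated with base R's iconv. This is only a sketch of a latin1 to UTF-8 conversion on a hand-built string; the snippet is cut off before showing the answer's own code, so the use of iconv here is an assumption, and nothing below touches the gutenbergr API.

```r
# Bytes of the word "ação" encoded as latin1 (one byte per character)
bytes_latin1 <- as.raw(c(0x61, 0xe7, 0xe3, 0x6f))
texto_latin1 <- rawToChar(bytes_latin1)

# Convert from latin1 to UTF-8
texto_utf8 <- iconv(texto_latin1, from = "latin1", to = "UTF-8")

# In UTF-8 the accented characters take two bytes each
charToRaw(texto_utf8)
texto_utf8
```

The same call works on a whole character vector, for example a column of downloaded text, since iconv is vectorised over its first argument.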
-
2 votes
1 answer
37 views
A: Barchart commands for R software
The arguments font.main and col.main are not used within the function lattice::barchart. They are reserved for the base graphics of R. However, it is possible to change the title characteristics…
[r] answered Marcus Nunes 17,915
-
6 votes
2 answers
250 views
A: How to select strings that start with a given word
Use the function grep. It allows you to filter like this, based only on part of a string: ES_1_4 = ES_1_3[grep("REACTOME_", ES_1_3$Pathways), ] In the above command, the new object…
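A runnable sketch of the grep filter above. The object names ES_1_3, ES_1_4 and the column Pathways come from the answer; the rows and the score column are invented for illustration. The pattern in the answer matches "REACTOME_" anywhere in the string, so this sketch adds the anchor "^" to match only at the start, which is what the question title asks for.

```r
# Hypothetical data; only the object and column names come from the answer
ES_1_3 <- data.frame(
  Pathways = c("REACTOME_APOPTOSIS", "KEGG_CELL_CYCLE",
               "REACTOME_TRANSLATION", "BIOCARTA_REACTOME_LIKE"),
  score = c(0.8, 0.1, 0.5, 0.3),
  stringsAsFactors = FALSE
)

# grep() returns the positions of the matching elements;
# "^" anchors the pattern to the beginning of the string
posicoes <- grep("^REACTOME_", ES_1_3$Pathways)
ES_1_4 <- ES_1_3[posicoes, ]
ES_1_4

# Without the anchor, the fourth row would also be selected
grep("REACTOME_", ES_1_3$Pathways)
```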
-
4 votes
2 answers
124 views
A: R - problems converting txt to read in R
You can solve this problem by asking R to skip the first line of the text file, using the argument skip = 1 inside the command read.table: x <- read.table(file = "file.txt", skip = 1) dim(x)…
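To try skip = 1 without the asker's file, the sketch below writes a small temporary file whose first line is free text and then reads it back. The file contents are made up for illustration.

```r
# Create a temporary text file whose first line is not data
arquivo <- tempfile(fileext = ".txt")
writeLines(c("Some free-text title line that is not data",
             "1 2.5 a",
             "2 3.5 b",
             "3 4.5 c"),
           arquivo)

# skip = 1 makes read.table ignore the first line of the file
x <- read.table(file = arquivo, skip = 1)
dim(x)
x

# Remove the temporary file
unlink(arquivo)
```

Without skip = 1, read.table would stop with an error on this file, because the first line has a different number of fields from the rest.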
[r] answered Marcus Nunes 17,915
-
5 votes
1 answer
110 views
A: Corrplot - How to adjust to the center in Rstudio?
Adjust the parameter oma (outer margin area) of your chart. For example, library(corrplot) par(oma=c(0, 15, 15, 0)) corrplot(cor(mtcars)) par(oma=c(0, 0, 0, 0)) corrplot(cor(mtcars))…
-
4 votes
1 answer
1144 views
A: How to train a decision tree in R?
Without going too deep into the theoretical part, a classification tree is a mathematical model that uses the decision tree structure to classify data. Better than explaining this in words is to see the…
-
3 votes
7 answers
1523 views
A: How to calculate the median of a line in a data.frame in R?
Applying functions to rows is very tedious with dplyr. I prefer to transpose the data frame, solve the problem in the columns and then arrange the result in a new object: resultado…
-
5 votes
2 answers
812 views
A: How to count the number of frequencies for each column in a data.frame in R?
One way to solve this problem using base R is through the function apply. Using the data set provided by Tomás, we have the following: txt <- "USUARIO jan fev mar abr mai jun jul ago set out…
[r] answered Marcus Nunes 17,915
-
7 votes
2 answers
292 views
Q: Cumulative count of group occurrences on dates
I have a data set similar to the one below. It has a column with dates and another with occurrences of groups on these dates. data grupo 1 2019-01-01 a 2 2019-01-01 a 3 2019-01-01 a 4 2019-01-01 a 5…
-
4 votes
1 answer
123 views
A: 'dot Plot' relative to mean with standard deviation
I believe the code below satisfies all that has been requested: Dataset %>% ggplot(aes(x = media, y = specie)) + geom_point(aes(fill=energetic_level, size=log(bodymass)), alpha = .9, pch=21,…
-
4 votes
2 answers
53 views
A: How to know in which position an unknown is located in a vector?
Another way to solve it is with the function which.max: a=c(10,9,8,7) which.max(a) [1] 1
[r] answered Marcus Nunes 17,915
-
3 votes
1 answer
48 views
A: Doubt with If statements in R
"I’d like to understand why R is misreading the braces and E" When building a conditional structure with if and else, it is not necessary to put the logical test for the else. This makes the test…
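A minimal sketch of the if/else structure discussed above. The function classifica and its messages are hypothetical, written only to show where the logical tests go.

```r
# Hypothetical function illustrating if / else if / else
classifica <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {   # a further test needs "else if"
    "negative"
  } else {              # the final else takes no logical test
    "zero"
  }
}

classifica(3)
classifica(-2)
classifica(0)
```

Keeping each else on the same line as the closing brace of the previous block matters at the top level of a script: if else starts a new line there, R considers the if statement finished and reports an unexpected 'else'.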
-
1 vote
1 answer
407 views
A: Error Reading Bovespa XML Files with R
I was able to read the data without problems using the package XML: library(XML) # first file b3 <- xmlTreeParse(file = "BVBG.086.01_BV000328201901310328000001834044379.xml") # second…
-
3 votes
1 answer
110 views
A: Testing Accuracy of an ARIMA model
Although your function SARIMA prints the results of the fitted time series on the screen, R does not understand that m is, in fact, the final result of the function. Make the following change and…
[r] answered Marcus Nunes 17,915
-
6 votes
3 answers
2296 views
A: How to format a "date" column inside of a data.frame in R?
I’m a fan of using the package lubridate to solve any and all problems with dates. See below how easily I got what you wanted with very intuitively named functions: library(lubridate) data <-…
-
5 votes
2 answers
661 views
A: How to transform the class of a "factor" column into "date" within a data.frame?
I find the function as.Date quite bad. I often have problems with it and don’t know exactly how to solve them. So I suggest using the package lubridate to work with dates. Instead of putting formats like…
-
8 votes
4 answers
4136 views
A: Count equal values in one data frame and store in another in R
I wouldn’t try to reinvent the wheel; I would use a ready-made function in R to do this. library(dplyr) total_amostral %>% group_by(TOTAL) %>% count() # A tibble: 17 x 2 # Groups: TOTAL [17] TOTAL…
[r] answered Marcus Nunes 17,915