grouby of cells of a dataframe returns Error in Eval (substitute (expr), send, Enclos): non-numerical argument for binary operator

Asked

Viewed 68 times

1

I want to calculate the differences between cells in a dataframe of a column.

> head(df_benchmark)
        Date .STOXXR
1 2000-01-03  478.52
2 2000-01-04  459.53
3 2000-01-05  448.19
4 2000-01-06  446.24
5 2000-01-07  455.80
6 2000-01-10  462.97

The data file is here.

So I tried the following:

df_benchmark <- read.xlsx("Data.xlsx", sheet = "Benchmark", startRow = 2,colNames = TRUE, detectDates = TRUE, skipEmptyRows = FALSE)
df_benchmark[2] <- data.matrix(df_benchmark[2])

> library(lubridate)
> percent_change2 <- function(x)last(x)/first(x) - 1
> monthly_return_benchmark <- df_benchmark %>% 
+   group_by(gr = floor_date(Date, unit = "month")) %>%
+   summarize_at(vars(-Date, -gr), percent_change2) %>%
+   ungroup() %>%
+   select(-gr) %>% 
+   as.matrix() 

I still have an error. It seems that my dataframe with a date column and a value column should be an atomic vector or list of 1d. It seems that reason is a variable problem:

Error: Each variable must be a 1d atomic vector or list.
Problem variables: '.STOXXR'

Another attempt

I used df_benchmark[2] <- data.matrix(df_benchmark[2]). This was used because I had a formatting problem. I tried without. So I tried the following code:

library(xts)
df_benchmark <- read.xlsx("Data.xlsx", sheet = "Benchmark", startRow = 2,colNames = TRUE, detectDates = TRUE, skipEmptyRows = FALSE)

percent_change2 <- function(x) last(x)/first(x) - 1
monthly_return_benchmark <- df_benchmark %>% 
  group_by(gr = floor_date(as_date(Date), unit = "month")) %>%
  summarize_at(vars(-Date, -gr), percent_change2) %>%
  ungroup()

# Generamos el xts, indicando la columna con la info de tiempo
monthly_return_benchmark <- xts(monthly_return[,-1], order.by=monthly_return$gr)

But I made the following mistake:

> library(xts)
> df_benchmark <- read.xlsx("Data.xlsx", sheet = "Benchmark", startRow = 2,colNames = TRUE, detectDates = TRUE, skipEmptyRows = FALSE)
> 
> percent_change2 <- function(x) last(x)/first(x) - 1
> monthly_return_benchmark <- df_benchmark %>% 
+   group_by(gr = floor_date(as_date(Date), unit = "month")) %>%
+   summarize_at(vars(-Date, -gr), percent_change2) %>%
+   ungroup()
Error in eval(substitute(expr), envir, enclos) : 
  non-numeric argument to binary operator

Third attempt

Only using the group_by using the months as drives I get the following:

> monthly_return_benchmark <- df_benchmark %>% 
+     group_by(gr = floor_date(Date, unit = "month"))
> monthly_return_benchmark
Source: local data frame [4,604 x 3]
Groups: gr [216]

         Date .STOXXR         gr
       <date>   <chr>     <date>
1  2000-01-03  478.52 2000-01-01
2  2000-01-04  459.53 2000-01-01
3  2000-01-05  448.19 2000-01-01
4  2000-01-06  446.24 2000-01-01
5  2000-01-07   455.8 2000-01-01
6  2000-01-10  462.97 2000-01-01
7  2000-01-11  459.85 2000-01-01
8  2000-01-12  459.84 2000-01-01
9  2000-01-13  462.27 2000-01-01
10 2000-01-14  473.43 2000-01-01
# ... with 4,594 more rows

Now I have to do the percent_change2 but it never worked.

1 answer

1

That seems to solve the problem:

txt <- "Data .STOXXR
2000-01-03  478.52
2000-01-04  459.53
2000-01-05  448.19
2000-01-06  446.24
2000-02-07  455.80
2000-02-10  462.97
"

library(tidyverse)
library(lubridate)

df_benchmark <- read.table(text = txt, header = TRUE, check.names = TRUE, 
                           colClasses = c("Date", "double")) %>% 
  as_tibble()


percent_change2 <- function(x)last(x)/first(x) - 1

df_benchmark %>% 
  group_by(gr = floor_date(Data, unit = "month")) %>% 
  summarise(variacao = percent_change2(.STOXXR))
#> # A tibble: 2 x 2
#>   gr         variacao
#>   <date>        <dbl>
#> 1 2000-01-01  -0.0675
#> 2 2000-02-01   0.0157

Created on 2019-03-29 by the reprex package (v0.2.1)

Sometimes to "debug" a code with pipe (%>%) it is good to rotate progressively each part that precedes a pipe. So it’s easy to see what’s going right and where’s the mistake.

With the code above, the lines that follow the summarise() can be added without problem:

df_benchmark %>% 
  group_by(gr = floor_date(Data, unit = "month")) %>% 
  summarise(variacao = percent_change2(.STOXXR)) %>% 
  ungroup() %>%
  select(-gr) %>% 
  as.matrix() 
#>         variacao
#> [1,] -0.06745800
#> [2,]  0.01573058

Browser other questions tagged

You are not signed in. Login or sign up in order to post.