Since you are using data.table, it is neither necessary nor recommended to split the data into separate data.frames; it is better to perform the operations by group.
library(data.table)
# Summarize the data by group
> Sample_data[, .(sum_price = sum(price)), by = code]
code sum_price
1: AAPL 881.9
2: AMZN 5542.1
3: MSFT 316.3
4: GOOG 4006.6
5: FB 205.2
# Create a new column by group
Sample_data[, sum_price := sum(price), by = code]
> Sample_data
code date price sum_price
1: AAPL 2019-12-01 292.9 881.9
2: AAPL 2020-01-01 295.4 881.9
3: AAPL 2020-02-01 293.6 881.9
4: AMZN 2019-12-01 1847.4 5542.1
5: AMZN 2020-01-01 1849.3 5542.1
6: AMZN 2020-02-01 1845.4 5542.1
7: MSFT 2020-01-01 157.2 316.3
8: MSFT 2020-02-01 159.1 316.3
9: GOOG 2019-12-01 1337.1 4006.6
10: GOOG 2020-01-01 1335.8 4006.6
11: GOOG 2020-02-01 1333.7 4006.6
12: FB 2019-12-01 205.2 205.2
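The same grouped syntax extends to several summaries at once. The following is a minimal sketch, not part of the original answer: the table name prices_dt and the column names n_obs, mean_price and max_price are hypothetical, and the data is an assumed reconstruction of Sample_data from the values printed above.

```r
library(data.table)

# Assumed reconstruction of the sample data (values copied from the output above);
# the name prices_dt is hypothetical and used only for this illustration.
prices_dt <- data.table(
  code  = c("AAPL", "AAPL", "AAPL", "AMZN", "AMZN", "AMZN",
            "MSFT", "MSFT", "GOOG", "GOOG", "GOOG", "FB"),
  date  = c("2019-12-01", "2020-01-01", "2020-02-01",
            "2019-12-01", "2020-01-01", "2020-02-01",
            "2020-01-01", "2020-02-01",
            "2019-12-01", "2020-01-01", "2020-02-01",
            "2019-12-01"),
  price = c(292.9, 295.4, 293.6, 1847.4, 1849.3, 1845.4,
            157.2, 159.1, 1337.1, 1335.8, 1333.7, 205.2)
)

# Several per-group summaries in one call; .N is the number of rows in each group.
stats <- prices_dt[, .(n_obs      = .N,
                       mean_price = mean(price),
                       max_price  = max(price)),
                   by = code]
```

The result has one row per code, in order of first appearance, without ever creating a separate table per group.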
If you really need to split the data into a list of data.tables, you can use the split method that data.table provides (split.data.table, called through split() with the by argument), but operating by group is generally more efficient.
> split(Sample_data, by = 'code')
$AAPL
code date price sum_price
1: AAPL 2019-12-01 292.9 881.9
2: AAPL 2020-01-01 295.4 881.9
3: AAPL 2020-02-01 293.6 881.9
$AMZN
code date price sum_price
1: AMZN 2019-12-01 1847.4 5542.1
2: AMZN 2020-01-01 1849.3 5542.1
3: AMZN 2020-02-01 1845.4 5542.1
$MSFT
code date price sum_price
1: MSFT 2020-01-01 157.2 316.3
2: MSFT 2020-02-01 159.1 316.3
$GOOG
code date price sum_price
1: GOOG 2019-12-01 1337.1 4006.6
2: GOOG 2020-01-01 1335.8 4006.6
3: GOOG 2020-02-01 1333.7 4006.6
$FB
code date price sum_price
1: FB 2019-12-01 205.2 205.2
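To illustrate why the split is usually unnecessary, the sketch below computes the same per-group maximum in two ways: by splitting, processing each piece and binding the pieces back together, and with a single grouped call. The small table prices_dt and the names via_split and via_by are hypothetical and exist only for this comparison.

```r
library(data.table)

# Small hypothetical table, a subset of the sample values
prices_dt <- data.table(
  code  = c("AAPL", "AAPL", "MSFT"),
  price = c(292.9, 295.4, 157.2)
)

# Route 1: split into a list of data.tables, summarise each piece, bind the results
pieces    <- split(prices_dt, by = "code")
via_split <- rbindlist(lapply(pieces, function(d) {
  d[, .(code = code[1], max_price = max(price))]
}))

# Route 2: the same result from one grouped expression
via_by <- prices_dt[, .(max_price = max(price)), by = code]
```

Both routes give one row per code with the same values; the grouped call is shorter and avoids materialising the intermediate list.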
If you prefer tidyverse syntax and functions, it is better to use a tibble as the table format; data.table will not bring any gains in that case. Either way, keeping the data in a single table and operating by group is preferable to splitting the data.
library(dplyr)
Sample_data <- tibble(
  code  = c("AAPL", "AAPL", "AAPL", "AMZN", "AMZN", "AMZN",
            "MSFT", "MSFT", "GOOG", "GOOG", "GOOG", "FB"),
  date  = c("2019-12-01", "2020-01-01", "2020-02-01",
            "2019-12-01", "2020-01-01", "2020-02-01",
            "2020-01-01", "2020-02-01",
            "2019-12-01", "2020-01-01", "2020-02-01",
            "2019-12-01"),
  price = c(292.9, 295.4, 293.6, 1847.4, 1849.3, 1845.4,
            157.2, 159.1, 1337.1, 1335.8, 1333.7, 205.2)
)
> Sample_data %>% group_by(code) %>% summarise(sum_price = sum(price))
# A tibble: 5 x 2
code sum_price
<chr> <dbl>
1 AAPL 882.
2 AMZN 5542.
3 FB 205.
4 GOOG 4007.
5 MSFT 316.
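# Plain assignment is used here because %<>% comes from magrittr and is not exported by dplyr
Sample_data <- Sample_data %>% group_by(code) %>% mutate(sum_price = sum(price))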
> Sample_data
# A tibble: 12 x 4
# Groups: code [5]
code date price sum_price
<chr> <chr> <dbl> <dbl>
1 AAPL 2019-12-01 293. 882.
2 AAPL 2020-01-01 295. 882.
3 AAPL 2020-02-01 294. 882.
4 AMZN 2019-12-01 1847. 5542.
5 AMZN 2020-01-01 1849. 5542.
6 AMZN 2020-02-01 1845. 5542.
7 MSFT 2020-01-01 157. 316.
8 MSFT 2020-02-01 159. 316.
9 GOOG 2019-12-01 1337. 4007.
10 GOOG 2020-01-01 1336. 4007.
11 GOOG 2020-02-01 1334. 4007.
12 FB 2019-12-01 205. 205.
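The "# Groups: code [5]" line in the output above shows that the tibble stays grouped after mutate, so later verbs would keep operating per code. A minimal sketch of dropping the grouping once the new column exists, using a small hypothetical tibble (prices_tbl and with_sum are illustrative names):

```r
library(dplyr)

# Small hypothetical tibble, a subset of the sample values
prices_tbl <- tibble(
  code  = c("AAPL", "AAPL", "MSFT"),
  price = c(292.9, 295.4, 157.2)
)

with_sum <- prices_tbl %>%
  group_by(code) %>%
  mutate(sum_price = sum(price)) %>%
  ungroup()  # remove the grouping so later verbs act on the whole table
```

Whether to ungroup depends on what comes next; it is shown here only because a leftover grouping is easy to overlook.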
> Sample_data %>% group_by(code) %>% group_split()
[[1]]
# A tibble: 3 x 4
code date price sum_price
<chr> <chr> <dbl> <dbl>
1 AAPL 2019-12-01 293. 882.
2 AAPL 2020-01-01 295. 882.
3 AAPL 2020-02-01 294. 882.
[[2]]
# A tibble: 3 x 4
code date price sum_price
<chr> <chr> <dbl> <dbl>
1 AMZN 2019-12-01 1847. 5542.
2 AMZN 2020-01-01 1849. 5542.
3 AMZN 2020-02-01 1845. 5542.
[[3]]
# A tibble: 1 x 4
code date price sum_price
<chr> <chr> <dbl> <dbl>
1 FB 2019-12-01 205. 205.
[[4]]
# A tibble: 3 x 4
code date price sum_price
<chr> <chr> <dbl> <dbl>
1 GOOG 2019-12-01 1337. 4007.
2 GOOG 2020-01-01 1336. 4007.
3 GOOG 2020-02-01 1334. 4007.
[[5]]
# A tibble: 2 x 4
code date price sum_price
<chr> <chr> <dbl> <dbl>
1 MSFT 2020-01-01 157. 316.
2 MSFT 2020-02-01 159. 316.
attr(,"ptype")
# A tibble: 0 x 4
# … with 4 variables: code <chr>, date <chr>, price <dbl>, sum_price <dbl>