I have a small dataset (872 observations of 27 variables).
However, the analysis I need to run on it turns out to be very heavy, because it involves the interactions among many of the variables.
I'm trying to run a Confirmatory Factor Analysis (CFA) using the lavaan package.
However, the function stops at iteration 4,521 after running for a day. When I use Stata, the computer restarts at some point (at about 10,000 iterations, if I'm not mistaken).
When it finishes (in R), I have a 200 MB df and receive the following message on the console (the same one I get when I interrupt the operation manually):
Warning messages:
1: In lav_data_full(data = data, group = group, cluster = cluster, :
lavaan WARNING: some ordered categorical variable(s) have more than 12 levels: idade_coop n_pac membros cs_sobre_cooperados soma_pl_deposito ativocomp pl_sobre_ativos roa
2: In lav_samplestats_step2(UNI = FIT, wt = wt, ov.names = ov.names, :
lavaan WARNING: correlation between variables sul and sudeste is (nearly) 1.0
3: In lav_samplestats_step2(UNI = FIT, wt = wt, ov.names = ov.names, :
lavaan WARNING: correlation between variables ativocomp and soma_pl_deposito is (nearly) 1.0
4: In lav_model_estimate(lavmodel = lavmodel, lavpartable = lavpartable, :
lavaan WARNING: the optimizer warns that a solution has NOT been found!
5: In lav_model_estimate(lavmodel = lavmodel, lavpartable = lavpartable, :
lavaan WARNING: the optimizer warns that a solution has NOT been found!
6: In lav_model_estimate(lavmodel = lavmodel, lavpartable = lavpartable, :
lavaan WARNING: the optimizer warns that a solution has NOT been found!
7: In lav_model_estimate(lavmodel = lavmodel, lavpartable = lavpartable, :
lavaan WARNING: the optimizer warns that a solution has NOT been found!
When I try to run summary, I get: lavaan 0.6-7 did NOT end normally after 4521 iterations
I believe it stops because it runs out of memory, since in Stata the computer simply restarts.
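The first three warnings can be checked directly before refitting. The following is a minimal sketch, not the asker's code: `toy` is a hypothetical data frame built from four columns of the sample rows posted further down, and it should be replaced by the real `base`. It counts distinct values per column (the warning quotes a limit of 12 levels for ordered variables), computes a plain Pearson correlation as a rough stand-in for the correlation lavaan reports, and cross-tabulates the two region dummies.

```r
# Toy data copied from the sample rows in the post; swap in the real `base`.
toy <- data.frame(
  sul = c(0, 0, 1, 0, 0, 1, 0, 0, 0, 0),
  sudeste = c(0, 1, 0, 1, 1, 0, 1, 1, 1, 1),
  soma_pl_deposito = c(27017868, 75570352, 523851488, 1025653.1875, 6256179,
                       46703636, 409542080, 60845500, 10978892, 1100099.625),
  ativocomp = c(27371496, 143889792, 535524864, 1117028.25, 7135122.5,
                63281840, 429233920, 93440432, 11219289, 1256903.25)
)

# Distinct values per column; every column passed to `ordered` is treated
# as categorical with this many levels.
n_levels <- sapply(toy, function(x) length(unique(x)))
print(n_levels)

# Columns above the 12-level limit quoted in the warning
# (none in this 10-row toy sample).
too_many <- names(n_levels)[n_levels > 12]
print(too_many)

# Pearson correlation between the two balance-sheet variables; only a rough
# proxy for the correlation lavaan estimates for ordered variables.
r_cap <- cor(toy$soma_pl_deposito, toy$ativocomp)
print(r_cap)

# Cross-tabulation of the region dummies; an empty cell means the two
# categories never occur together.
region_tab <- table(sul = toy$sul, sudeste = toy$sudeste)
print(region_tab)
```

On the full 872 rows, the same `n_levels` vector shows how many levels each variable contributes when every column is passed to `ordered`.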
Example of the code I’m using:
# Libraries ----
library(tidyverse)
library(haven)
library(semPlot)
library(lavaan)
# Importing the data ----
base <- read_dta("base.dta")
# Running the CFA ----
# Assigning groups
mod_cfa <- 'AIL =~ idade_coop + n_pac + sudeste + sul + centro + norte + nordeste
            CONS_SUP =~ reunioes_ano + estrutura_governanca + membros + comite
            ESTR_CAP =~ cs_sobre_cooperados + soma_pl_deposito + ativocomp + pl_sobre_ativos + roa'
# Running the cfa
cfa_coop <- cfa(mod_cfa,
                data = base,
                missing = "default",
                estimator = "WLSMV",
                orthogonal = FALSE,
                ordered = names(base)
)
# Results
summary(cfa_coop, standardized = TRUE, fit.measures = TRUE, modindices = FALSE)
fitMeasures(cfa_coop, c("chisq","df","pvalue","cfi","tli","rmsea"))
Sample of the data:
structure(list(cnpj = c("554656546", "767867868687", "132131232",
"876768", "786765", "786575", "78678686",
"65767568", "45678", "8675867"), niveis_superv = c("2",
"2", "2", "0", "0", "0", "2", "2", "2", "0"), classe_bc = c("02",
"02", "02", "01", "01", "02", "02", "02", "02", "01"), idade_coop = c(22,
22, 22, 22, 22, 22, 22, 21, 21, 21), n_pac = c(1, 10, 11, 1,
1, 3, 13, 4, 1, 1), sudeste = c(0, 1, 0, 1, 1, 0, 1, 1, 1, 1),
sul = c(0, 0, 1, 0, 0, 1, 0, 0, 0, 0), centro = c(0, 0, 0,
0, 0, 0, 0, 0, 0, 0), nordeste = c(1, 0, 0, 0, 0, 0, 0, 0,
0, 0), norte = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), atuacao_regional = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1), atuacao_estadual = c(0, 0, 0,
0, 0, 0, 0, 0, 0, 0), atuacao_nacional = c(0, 0, 0, 0, 0,
0, 0, 0, 0, 0), qtd_cooperados = c(1288, 3461, 11310, 1203,
4530, 3274, 7954, 3090, 983, 633), auditor = c("0", "1",
"1", "1", "1", "1", "1", "1", "1", "1"), contratar_auditoria_ind = c("2",
"2", "2", "2", "1", "2", "2", "2", "2", "2"), reunioes_ano = c(12,
12, 12, 12, 12, 12, 12, 12, 12, 12), estrutura_governanca = c("3",
"3", "3", "1", "2", "1", "3", "3", "2", "3"), membros = c(9,
7, 16, 6, 7, 9, 15, 7, 3, 3), comite = c("0", "0", "0", "1",
"0", "0", "0", "0", "0", "0"), cs_sobre_cooperados = c(6324.9228515625,
5602.01416015625, 6778.712890625, 790.086608886719, 1236.85620117188,
2393.3583984375, 6248.63232421875, 6032.5859375, 9310.8828125,
1582.30786132812), soma_pl_deposito = c(27017868, 75570352,
523851488, 1025653.1875, 6256179, 46703636, 409542080, 60845500,
10978892, 1100099.625), ativocomp = c(27371496, 143889792,
535524864, 1117028.25, 7135122.5, 63281840, 429233920, 93440432,
11219289, 1256903.25), pl_sobre_ativos = c(0.195353165268898,
0.0269169881939888, 0.0544663555920124, 0.611539125442505,
0.440605372190475, 0.0862450525164604, 0.0495623573660851,
0.0553100071847439, 0.432251751422882, 0.297396898269653),
roa = c(0.0260528121143579, 0.0159006342291832, 0.0089608347043395,
0.027274627238512, 0.0233467519283295, 0.00636459980159998,
0.0053424290381372, -0.0262128747999668, 0.0410496257245541,
0.0629174262285233), deposito_sobre_ativo = c(0.636883497238159,
0.341013759374619, 0.805505573749542, 0, 0, 0.594665348529816,
0.818691551685333, 0.405729651451111, 0.0734363198280334,
0), capital_social = c(8146501, 19388572, 76667240, 950474.1875,
5602958.5, 7835855, 49701624, 18640690, 9152598, 1001600.875
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
PS: I know there are similar questions, but they deal with cases where the dataset is large, not cases where the model itself requires a lot of RAM.
How do I change to "BFGS"? When I put it in the control option, it gives an error. – RxT
About checking the RAM: unfortunately I run this on the server, which is Windows. – RxT
There must be some option in Windows to monitor RAM on the server, but I don't know it. To use BFGS: cfa(model, data = HolzingerSwineford1939, optim.method = "BFGS"). Didn't removing variables solve it? – Guilherme Parreira
Sorry it took so long! Look, it ran here! Thank you! – RxT
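For reference, the optimizer switch from the comments can be tried end to end on the example data bundled with lavaan. This is only a sketch under stated assumptions: `hs_model` is the three-factor model from the lavaan tutorial for `HolzingerSwineford1939`, not the model from the question, and the names `hs_model` and `fit_bfgs` are made up for illustration.

```r
library(lavaan)

# Three-factor tutorial model for the bundled HolzingerSwineford1939 data
# (an assumption for illustration; this is not the asker's model).
hs_model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Pass the optimizer choice as a top-level argument, as in the comment above.
fit_bfgs <- cfa(hs_model, data = HolzingerSwineford1939, optim.method = "BFGS")

# Check whether the optimizer reported convergence and how many iterations it used.
print(lavInspect(fit_bfgs, "converged"))
print(lavInspect(fit_bfgs, "iterations"))
```

Checking `lavInspect(fit, "converged")` before calling `summary()` makes a non-converged fit visible right away.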