Return of the gradient function when using Optim and colSums functions

Asked

Viewed 127 times

3

I can’t understand why when using the function gradlik as a function argument Optim I get the following error:

Error in optim(beta, loglik, gradlik, method = "BFGS", hessian = T, control = list(fnscale = -1)) : 
gradiente em optim retorna um objeto de comprimento 9000 ao invés de 9

However, when calling the function gradlik(beta) the same returns me the gradient vector as expected!

Does anyone have any suggestions for the correction of this code?

loglik <- function(beta) {
  NXS <- dim(model.matrix(~XS))[2]#Número de colunas de XS+1
  NXO <- dim(model.matrix(~XO))[2]#Número de colunas de XO+1
  ## parameter indices
  ibetaS <- 1:NXS
  ibetaO <- seq(tail(ibetaS, 1)+1, length=NXO)
  isigma <- tail(ibetaO, 1) + 1
  irho <- tail(isigma, 1) + 1
  g <- beta[ibetaS]
  b <- beta[ibetaO]
  sigma <- beta[isigma]
  if(sigma < 0) return(NA)
  rho <- beta[irho]
  if( ( rho < -1) || ( rho > 1)) return(NA)

  XS.g <- model.matrix(~XS) %*% g
  XO.b <- model.matrix(~XO) %*% b
  u2 <- YO - XO.b
  r <- sqrt( 1 - rho^2)
  B <- (XS.g + rho/sigma*u2)/r
  ll <- ifelse(YS == 0,
               (pnorm(-XS.g, log.p=TRUE)),
               dnorm(u2/sigma, log = TRUE) - log(sigma) +
                 (pnorm(B, log.p=TRUE))
  )
  sum(ll)
}

gradlik <- function(beta) {
  NXS <- dim(model.matrix(~XS))[2]#Número de colunas de XS+1
  NXO <- dim(model.matrix(~XO))[2]#Número de colunas de XO+1
  nObs <- length(YS)
  NO <- length(YS[YS > 0])
  nParam <- NXS + NXO + 2 #Total of parameters

  XS0 <- XS[YS==0,,drop=FALSE]
  XS1 <- XS[YS==1,,drop=FALSE]
  YO[is.na(YO)] <- 0
  YO1 <- YO[YS==1]
  XO1 <- XO[YS==1,,drop=FALSE]
  N0 <- sum(YS==0)
  N1 <- sum(YS==1)

  w  <- rep(1,N0+N1 )
  w0 <- rep(1,N0)
  w1 <- rep(1,N1)
  NXS <- dim(model.matrix(~XS))[2]#Número de colunas de XS+1
  NXO <- dim(model.matrix(~XO))[2]#Número de colunas de XO+1
  ## parameter indices
  ibetaS <- 1:NXS
  ibetaO <- seq(tail(ibetaS, 1)+1, length=NXO)
  isigma <- tail(ibetaO, 1) + 1
  irho <- tail(isigma, 1) + 1

  g <- beta[ibetaS]
  b <- beta[ibetaO]
  sigma <- beta[isigma]
  if(sigma < 0) return(matrix(NA, nObs, nParam))
  rho <- beta[irho]
  if( ( rho < -1) || ( rho > 1)) return(matrix(NA, nObs, nParam))
  XS0.g <- as.numeric(model.matrix(~XS0) %*% g)
  XS1.g <- as.numeric(model.matrix(~XS1) %*% g)
  XO1.b <- as.numeric(model.matrix(~XO1) %*% b)
  #      u2 <- YO1 - XO1.b
  u2 <- YO1 - XO1.b
  r <- sqrt( 1 - rho^2)
  #      B <- (XS1.g + rho/sigma*u2)/r
  B <- (XS1.g + rho/sigma*u2)/r
  lambdaB <- exp( dnorm( B, log = TRUE ) - pnorm( B, log.p = TRUE ) )
  gradient <- matrix(0, nObs, nParam)
  gradient[YS == 0, ibetaS] <- - w0 * model.matrix(~XS0) *
    exp( dnorm( -XS0.g, log = TRUE ) - pnorm( -XS0.g, log.p = TRUE ) )
  gradient[YS == 1, ibetaS] <- w1 * model.matrix(~XS1) * lambdaB/r
  gradient[YS == 1, ibetaO] <- w1 * model.matrix(~XO1) * (u2/sigma^2 - lambdaB*rho/sigma/r)
  gradient[YS == 1, isigma] <- w1 * ( (u2^2/sigma^3 - lambdaB*rho*u2/sigma^2/r) - 1/sigma )
  gradient[YS == 1, irho] <- w1 * (lambdaB*(u2/sigma + rho*XS1.g))/r^3
  return(colSums(gradient))
}

n=1000
X1 <- runif(n)
X2 <- runif(n)
XO <- cbind(X1,X2)
X3 <- runif(n)
XS <- cbind(X1,X2,X3)
YS <- sample(c(0,1),n,replace = TRUE)
YO <- sample(100:400,n,replace = TRUE)*YS
beta <- c(1,1,1,1,1,1,1,1,0.5)
#Note que a função abaixo compila normalmente:
gradlik(beta)
#Porém a função Optim não compila:
theta <-optim(beta,loglik, gradlik, method = "BFGS",hessian = T,control=list(fnscale=-1)) 
theta$par
  • The function optim does not turn to me. I get the following error: Error in optim(beta, loglik, gradlik, method = "BFGS", hessian = T, control = list(fnscale = -1)) : &#xA; gradient in optim evaluated to length 9000 not 9

  • @Marcusnunes, I added another code in the question that can help understand my problem! The Optim function is not actually compiling with the initial code, it seems that it is having some conflict with the colSums function!

  • @Marcusnunes modified the question in an attempt to make it clearer!

1 answer

1


The gradient function needs to return a vector with the same size as the number of parameters.

In its function, there are two occasions when this does not occur.

When sigma <0 you return the following value in your function:

if(sigma < 0) return(matrix(NA, nObs, nParam))

This will be a 9000 x 9 matrix that error that the optim() gives you. You have to change this part of the code to something that returns only 9 elements and not an array.

In the same way, when ( rho < -1) || ( rho > 1) its function returns:

if( ( rho < -1) || ( rho > 1)) return(matrix(NA, nObs, nParam))

What is a 9000 x 9 matrix in your example, causing error.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.