SVM model - regularisation and cost factors

Question

SVM model - regularisation and cost factors

Asked 6 years, 3 months ago

Viewed 323 times

0

In the SVM model (kernel: linear, regularisation factor: 0.1, cross validation k=10). Would the regularization factor and cost factor be the same? No help of R informs only the argument cost.

library(e1071)
library(caret)
modelo = svm(formula = gravidez ~ ., 
  data = treinamento, 
  type = 'C-classification', 
  kernel = 'linear', cost = 1.0, cross = 10)
previsor = predict(modelo, newdata = teste[-19])

1 answer

Browser other questions tagged r

You are not signed in. Login or sign up in order to post.

by Flavio Barros • **1,717** points · Answer 1 · 2019-04-07T03:05:37+00:00

Parameter C is the penalty associated to instances that are either misclassified or violate the maximal margin. What the SVM algorithm basically does is to find the largest margins for linearly separable data as shown in this figure:

Meanwhile there may be data sets that are not linearly separable. In these cases some classification errors are admitted under the penalty associated with C.

The effect of a high cost is that the final model will tend to obtain as few classification errors as possible. This can result in very specific boundaries to the data set and therefore an over-balancing of the model to the data set. A lower C generates smoother decision boundaries, which despite having more erroneous classifications generates models that can be better generalized.

Now regarding the regularization factor, this term is more used with regard to regression, as techniques such as LASSO. However the idea is the same as that of factor C. Just to illustrate how they are equivalent below we have the likelihood with a cost factor C:

and the equivalent function with a regularization factor as is usually written in regression:

In practice the effect is the same, either with C or with lambda. So I believe that in your implementation or you specify one or the other, it makes no sense to specify the two since they have the same effect.