One-class classification with libsvm

A quick recap of what I want to do: I want to determine whether a text was written by the same author or not, so I use one-class classification. My training set (18 samples) looks like this (for simplicity, x stands for a data value):

1 1:x 2:x "until" 200:x
1 1:x 2:x "until" 200:x

My testing set (3 samples) looks like this (for simplicity, y stands for a data value):

1 1:y 2:y "until" 200:y

For data preparation (training and testing sets), I set the lower and upper scaling limits to -1 and +1:

-l -1 -u 1

For training, I set svm_type to one-class SVM and the kernel type to sigmoid. Yet the accuracy is 0%:

optimization finished, #iter = 13
obj = 22.901769047004553, rho = 5.476401914859387
nSV = 11, nBSV = 6
Accuracy = 0.0% (0/21) (classification)

Can someone show me what I did wrong here?

Answers


You need to tune the parameters.

nu is an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. In a one-class SVM this means that up to a fraction nu of the training data can be rejected and flagged as outliers (nu = 0.01 means 1%, for example).
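To make the two bounds concrete, here is a small sketch of the arithmetic for a training set of 18 samples, as in the question. The helper name nu_bounds is made up for illustration and is not part of libsvm; it only turns nu into sample counts.

```python
import math

def nu_bounds(nu, n_samples):
    """Return (max_outliers, min_support_vectors) implied by nu.

    Hypothetical helper: it only converts the fraction nu into counts.
    """
    if not 0.0 < nu <= 1.0:
        raise ValueError("nu must be in (0, 1]")
    # Round away floating-point noise before taking floor/ceil.
    target = round(nu * n_samples, 9)
    max_outliers = math.floor(target)        # at most this many training points rejected
    min_support_vectors = math.ceil(target)  # at least this many support vectors
    return max_outliers, min_support_vectors

# 18 training samples, as in the question; 0.5 is libsvm's default nu.
for nu in (0.01, 0.1, 0.5):
    print(nu, nu_bounds(nu, 18))
```

With so few samples, a small nu allows almost no training point to be rejected, while a large nu allows many, so it is worth trying several values.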

Also try tuning the gamma and coef0 values of the sigmoid kernel.
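For reference, libsvm's sigmoid kernel is tanh(gamma*u'*v + coef0). The sketch below evaluates it in plain Python over a small grid so you can see how the two parameters change the kernel value. The candidate values and the vectors u and v are arbitrary placeholders, not recommendations.

```python
import math
from itertools import product

def sigmoid_kernel(u, v, gamma, coef0):
    """Sigmoid kernel as defined by libsvm: tanh(gamma * <u, v> + coef0)."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.tanh(gamma * dot + coef0)

# Hypothetical candidate values to search over; adjust to your data.
gammas = [2.0 ** e for e in (-9, -7, -5, -3)]
coef0s = [-1.0, 0.0, 1.0]
grid = list(product(gammas, coef0s))

# Two made-up feature vectors already scaled to [-1, 1].
u = [0.5, -0.25, 1.0]
v = [1.0, 0.5, -0.5]
for gamma, coef0 in grid:
    print(gamma, coef0, sigmoid_kernel(u, v, gamma, coef0))
```

In practice you would train one model per (gamma, coef0) pair and keep the pair that does best on held-out texts by the same author.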

Although it may not be the direct cause of your zero training accuracy, I suggest scaling the data yourself instead of using libsvm's min-max scaling. Try standard scaling (zero mean, unit variance per feature):

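 % Per-feature mean and standard deviation, computed on the training data only
 x_mean = mean(x);
 x_std = std(x);
 % Standardize each feature to zero mean and unit variance
 x = (x - x_mean)./x_std;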

Then use the same x_mean and x_std value to scale your test data.
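If you are not working in MATLAB/Octave, the same idea can be sketched in plain Python. The function names fit_standardizer and standardize and the tiny data set are made up for illustration. The one addition to the snippet above is that a constant feature (standard deviation 0) is divided by 1 instead, which avoids a division by zero.

```python
import statistics

def fit_standardizer(rows):
    """Compute per-feature mean and sample standard deviation from training rows."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # A constant feature has stdev 0; fall back to 1.0 so it maps to 0 instead of failing.
    stds = [statistics.stdev(col) or 1.0 for col in columns]
    return means, stds

def standardize(rows, means, stds):
    """Apply (x - mean) / std per feature using previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Made-up data: 3 training rows and 1 test row with 2 features each.
train = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
test = [[4.0, 10.0]]

means, stds = fit_standardizer(train)           # statistics come from training data only
train_scaled = standardize(train, means, stds)
test_scaled = standardize(test, means, stds)    # reuse the training statistics
print(train_scaled)
print(test_scaled)
```

The important part is that the test rows are scaled with the training means and standard deviations, never with their own.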

