Simple example using BernoulliNB (naive bayes classifier) scikit-learn in python - cannot explain classification

Using scikit-learn 0.10

Why does the following trivial code snippet:

import numpy as np
import sklearn
from sklearn.naive_bayes import BernoulliNB

print sklearn.__version__

X = np.array([ [1, 1, 1, 1, 1], 
               [0, 0, 0, 0, 0] ])
print "X: ", X
Y = np.array([ 1, 2 ])
print "Y: ", Y

clf = BernoulliNB()
clf.fit(X, Y)
print "Prediction:", clf.predict( [0, 0, 0, 0, 0] )    

print "1" as the answer? Having trained the model on [0,0,0,0,0] => 2, I was expecting "2".

And why does replacing Y with

Y = np.array([ 3, 2 ])

give a different class, "2", as the answer (the correct one)? Isn't Y just a set of class labels?

Can someone shed some light on this?

Answers


By default, the smoothing parameter alpha is 1. As msw said, your training set is very small. Due to the smoothing, no information is left. If you set alpha to a very small value, you should see the result you expected.
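
To see how alpha enters the estimate, here is a minimal NumPy-only sketch of the textbook Bernoulli naive Bayes posterior with additive smoothing. The function name bernoulli_nb_posterior is made up for illustration, and the formula is the textbook one, not necessarily what scikit-learn 0.10 computed internally. Under this formula the all-zeros row still favors class 2 with alpha = 1 (about 0.97), so the behavior reported in the question may not be explained by smoothing alone.

```python
import numpy as np

def bernoulli_nb_posterior(X, y, x_new, alpha=1.0):
    """Textbook Bernoulli NB posterior for one sample (illustrative helper).

    Assumes alpha > 0; alpha = 0 would produce log(0) for this data.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    x_new = np.asarray(x_new, dtype=float)
    classes = np.unique(y)  # sorted unique labels
    log_joint = []
    for c in classes:
        Xc = X[y == c]
        n_c = Xc.shape[0]
        # Smoothed estimate of P(x_i = 1 | class c)
        p = (Xc.sum(axis=0) + alpha) / (n_c + 2.0 * alpha)
        log_prior = np.log(n_c / float(X.shape[0]))
        log_lik = np.sum(x_new * np.log(p) + (1.0 - x_new) * np.log(1.0 - p))
        log_joint.append(log_prior + log_lik)
    log_joint = np.array(log_joint)
    # Normalize in a numerically stable way
    post = np.exp(log_joint - log_joint.max())
    return classes, post / post.sum()

X = [[1, 1, 1, 1, 1],
     [0, 0, 0, 0, 0]]
Y = [1, 2]

for alpha in (1.0, 1e-10):
    classes, post = bernoulli_nb_posterior(X, Y, [0, 0, 0, 0, 0], alpha=alpha)
    print(alpha, classes, post)
```

With alpha = 1 the smoothed per-feature probabilities are 2/3 and 1/3, so the posterior for class 2 works out to 32/33. With a tiny alpha it approaches 1.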


Your training set is too small, as can be shown by

clf.predict_proba(X)

which yields

array([[ 0.5,  0.5],
       [ 0.5,  0.5]])

which shows that the classifier views both classes as equally probable. Compare with the sample shown in the documentation for BernoulliNB, for which predict_proba() yields:

array([[ 2.71828146,  1.00000008,  1.00000004,  1.00000002,  1.        ],
       [ 1.00000006,  2.7182802 ,  1.00000004,  1.00000042,  1.00000007],
       [ 1.00000003,  1.00000005,  2.71828149,  1.        ,  1.00000003],
       [ 1.00000371,  1.00000794,  1.00000008,  2.71824811,  1.00000068],
       [ 1.00000007,  1.0000028 ,  1.00000149,  2.71822455,  1.00001671],
       [ 1.        ,  1.00000007,  1.00000003,  1.00000027,  2.71828083]])

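where I applied numpy.exp() to the results to make them more readable: a value near 2.718 (e^1) corresponds to a probability near 1, and a value near 1.0 (e^0) to a probability near 0. These probabilities are clearly far from equal, and in fact they classify the training set well.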
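
The tied 0.5/0.5 output would also account for the second part of the question. If the predicted class is taken as the argmax over classes sorted in ascending order, a tie always resolves to the smallest label. The sketch below assumes that selection rule (sorted unique labels plus np.argmax). The source does not confirm that scikit-learn 0.10 works this way, and predict_from_proba is a made-up helper name.

```python
import numpy as np

def predict_from_proba(labels, proba):
    """Pick a class from per-class probabilities (illustrative helper)."""
    classes = np.unique(labels)        # unique labels, sorted ascending
    return classes[np.argmax(proba)]   # argmax returns the first index on ties

tied = np.array([0.5, 0.5])
print(predict_from_proba([1, 2], tied))  # 1
print(predict_from_proba([3, 2], tied))  # 2
```

With Y = [1, 2] the first sorted class is 1. With Y = [3, 2] the sorted classes are [2, 3], so the same tie yields 2, which only happens to match the expected answer.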

