module library.classifier.create_model
Create a MaxEnt model.
Usage:
$ python create_model.py OPTIONS <VECTORS_FILE>
Options:
--mallet-dir DIR
Directory where Mallet is installed; it should contain the bin subdirectory.
The default is set by the MALLET variable in the script.
--cross-validation N
Cross validation is off by default. Set this to a number to turn it on; for
example, 5 gives 5-fold cross validation.
--trainer 'MaxEnt'|'NaiveBayes'|'C45'|'DecisionTree'|...
One of the trainers supported by Mallet. The Mallet API documentation at
http://mallet.cs.umass.edu/api/ lists them in the cc.mallet.classify
package. The default is MaxEnt.
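The option handling described above could be sketched with argparse as follows. This is only an illustration: the page does not show how create_model.py parses its arguments, and the DEFAULT_MALLET_DIR value and the parse_options name are hypothetical.

```python
import argparse

# Hypothetical default; the real script takes it from its MALLET variable.
DEFAULT_MALLET_DIR = "/opt/mallet"


def build_parser():
    """Build a parser for the three documented options and the vectors file."""
    parser = argparse.ArgumentParser(
        prog="create_model.py", description="Create a Mallet model.")
    parser.add_argument("--mallet-dir", metavar="DIR",
                        default=DEFAULT_MALLET_DIR,
                        help="directory containing Mallet's bin subdirectory")
    parser.add_argument("--cross-validation", metavar="N", type=int,
                        default=None,
                        help="number of folds; omitted means no cross validation")
    parser.add_argument("--trainer", default="MaxEnt",
                        help="name of a Mallet trainer, e.g. MaxEnt or NaiveBayes")
    parser.add_argument("vectors_file", metavar="VECTORS_FILE",
                        help="text file created by create_vectors.py")
    return parser


def parse_options(argv):
    """Parse a list of command-line arguments (without the program name)."""
    return build_parser().parse_args(argv)
```

Calling parse_options(["--cross-validation", "5", "vectors.txt"]) yields a namespace whose cross_validation is 5 and whose trainer keeps the MaxEnt default.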
The vectors file is the text file created by create_vectors.py. By default, the
script uses the MaxEnt classifier and performs no cross validation. To change
this behaviour, pass the options above or edit the TRAINER and CROSS_VALIDATION
variables.
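As a rough illustration of how the trainer and cross-validation settings might end up in a Mallet call, the sketch below assembles a command line for Mallet's train-classifier command. Whether create_model.py invokes Mallet this way is an assumption; the build_train_command helper, its parameter names, and the MALLET path are hypothetical. The input is assumed to be an instance file already in Mallet's binary format.

```python
import os

# Module-level defaults in the spirit of the MALLET, TRAINER and
# CROSS_VALIDATION variables; the MALLET path is a hypothetical placeholder.
MALLET = "/opt/mallet"
TRAINER = "MaxEnt"
CROSS_VALIDATION = None


def build_train_command(instances_file, model_path, mallet_dir=MALLET,
                        trainer=TRAINER, cross_validation=CROSS_VALIDATION):
    """Return the argument list for a Mallet train-classifier call.

    Assumption: training goes through 'bin/mallet train-classifier'.
    """
    command = [
        os.path.join(mallet_dir, "bin", "mallet"), "train-classifier",
        "--input", instances_file,
        "--output-classifier", model_path,
        "--trainer", trainer,
    ]
    # Only request cross validation when a fold count was given.
    if cross_validation is not None:
        command += ["--cross-validation", str(cross_validation)]
    return command
```

The returned list can be handed to subprocess.run. Keeping the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.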
The script creates the following files:
<VECTORS_FILE>.model
<VECTORS_FILE>.err
<VECTORS_FILE>.out
<VECTORS_FILE>.vect
The first one is the one needed for running the classifier. Note that if cross
validation is on, there will be no .model file; instead there will be one model
for each fold.
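The naming scheme above amounts to appending an extension to the vectors file name. A minimal sketch, with a hypothetical derived_files helper that is not part of the module:

```python
# Extensions listed in the documentation above.
DERIVED_EXTENSIONS = (".model", ".err", ".out", ".vect")


def derived_files(vectors, cross_validation=None):
    """List the file names derived from a vectors file name.

    With cross validation on there is no single .model file. The names of
    the per-fold models are not documented here, so they are left out.
    """
    extensions = DERIVED_EXTENSIONS
    if cross_validation is not None:
        extensions = tuple(ext for ext in extensions if ext != ".model")
    return [vectors + ext for ext in extensions]
```

For a vectors file named data/vectors.txt this gives data/vectors.txt.model, data/vectors.txt.err, data/vectors.txt.out and data/vectors.txt.vect.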
module functions
create_model(vectors, trainer='MaxEnt', cross_validation=None, output=False)
Create a Mallet model given a vectors file.
remove_files(vectors)
Remove all files derived from the vectors file.
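The actual body of remove_files is not shown on this page. A cleanup along these lines could look like the sketch below, which assumes that the derived files are exactly the four extensions listed earlier and uses a hypothetical name, remove_derived_files, to avoid suggesting it is the module's own code. Per-fold models from cross validation are not handled because their names are not documented here.

```python
import os


def remove_derived_files(vectors, extensions=(".model", ".err", ".out", ".vect")):
    """Delete the files derived from a vectors file and return their paths.

    The vectors file itself is left in place; missing files are skipped.
    """
    removed = []
    for ext in extensions:
        path = vectors + ext
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Returning the list of removed paths makes it easy to log what was cleaned up before a fresh call to create_model.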