
mlpack_linear_svm - linear svm is an l2-regularized support vector machine.

Additional Information

       For further information, including relevant papers, citations,  and  theory,  consult  the  documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

mlpack-4.5.1                                     29 January 2025                            mlpack_linear_svm(1)

Description

       An implementation of linear SVMs that uses either L-BFGS or parallel SGD (stochastic gradient descent) to
       train the model.

       This  program allows loading a linear SVM model (via the '--input_model_file (-m)' parameter) or training
       a linear SVM model given training data (specified with the '--training_file  (-t)'  parameter),  or  both
       those  things  at once. In addition, this program allows classification on a test dataset (specified with
       the  '--test_file  (-T)'  parameter)  and  the   classification   results   may   be   saved   with   the
       '--predictions_file  (-P)'  output  parameter.  The  trained  linear  SVM  model  may  be saved using the
       '--output_model_file (-M)' output parameter.

       The training data, if specified,  may  have  class  labels  as  its  last  dimension.   Alternately,  the
       '--labels_file (-l)' parameter may be used to specify a separate vector of labels.
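
       The two label layouts described above can be converted into one another with standard text tools.
       The following is a minimal sketch, assuming a comma-separated file whose last column holds the class
       label; the file name ''combined.csv'' and its contents are made up for illustration.

```shell
# Create a tiny illustrative dataset: two features, label in the last column.
printf '1.0,2.0,0\n3.0,4.0,1\n5.0,6.0,0\n' > combined.csv

# Strip the last column to get the matrix of predictors.
sed 's/,[^,]*$//' combined.csv > data.csv

# Keep only the last column to get the separate label vector.
sed 's/.*,//' combined.csv > labels.csv

cat data.csv
cat labels.csv
```

       Here ''combined.csv'' could be passed on its own with '--training_file (-t)', while ''data.csv'' and
       ''labels.csv'' would be passed with '--training_file (-t)' and '--labels_file (-l)' respectively.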

       When a model is being trained, there are many options. L2 regularization (to prevent overfitting) can be
       specified with the '--lambda (-r)' option, and the number of classes can be manually specified with the
       '--num_classes (-c)' option. If an intercept term is not desired in the model, the '--no_intercept (-N)'
       parameter can be specified. The margin of difference between the correct class and the other classes can
       be specified with the '--delta (-d)' option. The optimizer used to train the model can be specified with
       the '--optimizer (-O)' parameter. Available options are 'psgd' (parallel stochastic gradient descent) and
       'lbfgs' (the L-BFGS optimizer). There are also various parameters for the optimizer: the
       '--max_iterations (-n)' parameter specifies the maximum number of allowed iterations, and the
       '--tolerance (-e)' parameter specifies the tolerance for convergence. For the parallel SGD optimizer, the
       '--step_size (-a)' parameter controls the step size taken at each iteration, and the '--epochs (-E)'
       parameter sets the maximum number of epochs. If the objective function for your data is oscillating
       between Inf and 0, the step size is probably too large. There are more parameters for the optimizers,
       but the C++ interface must be used to access these.
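
       The optimizer options above can be combined as in the following sketch of a parallel SGD training run.
       The numeric values are illustrative only, not recommendations: the step size of 0.001 is simply
       smaller than the default of 0.01, as one might try when the objective oscillates. The file names are
       assumptions, and the command is only executed if mlpack_linear_svm and the input files are present.

```shell
# Assemble a parallel SGD training command from the documented options.
# Values are illustrative; file names are assumed to exist in the working directory.
cmd="mlpack_linear_svm --training_file data.csv --labels_file labels.csv \
--optimizer psgd --step_size 0.001 --epochs 20 --lambda 0.0001 \
--output_model_file lsvm_model.bin"

# Show the command that would be run.
echo "$cmd"

# Run it only if the program is installed and the input files are available.
if command -v mlpack_linear_svm >/dev/null 2>&1 && [ -f data.csv ] && [ -f labels.csv ]; then
  $cmd
fi
```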

       Optionally, the model can be  used  to  predict  the  labels  for  another  matrix  of  data  points,  if
       '--test_file  (-T)'  is  specified.  The  '--test_file  (-T)'  parameter  can  be  specified  without the
       '--training_file  (-t)'  parameter,  so  long  as  an  existing  linear  SVM  model  is  given  with  the
       '--input_model_file  (-m)'  parameter. The output predictions from the linear SVM model may be saved with
       the '--predictions_file (-P)' parameter.

       As an example, to train a LinearSVM on the data ''data.csv'' with labels ''labels.csv'' with L2
       regularization of 0.1, saving the model to ''lsvm_model.bin'', the following command may be used:

       $ mlpack_linear_svm --training_file data.csv --labels_file labels.csv --lambda 0.1 --delta 1
       --num_classes 0 --output_model_file lsvm_model.bin

       Then, to use that model to predict classes for the dataset ''test.csv'', storing the  output  predictions
       in ''predictions.csv'', the following command may be used:

       $ mlpack_linear_svm --input_model_file lsvm_model.bin --test_file test.csv --predictions_file
       predictions.csv
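
       Once predictions have been saved, they can be compared against known labels with standard tools. The
       sketch below assumes that both files hold one label per line, which should be checked against the
       actual output; the stand-in file names and their contents are made up so that the example is
       self-contained.

```shell
# Stand-in files: one predicted label and one true label per line (assumed layout).
printf '0\n1\n1\n0\n' > example_predictions.csv
printf '0\n1\n0\n0\n' > example_test_labels.csv

# Pair the files line by line and count the lines where the labels agree.
result=$(paste -d, example_predictions.csv example_test_labels.csv |
  awk -F, '{ if ($1 + 0 == $2 + 0) correct++ } END { printf "accuracy: %d/%d", correct, NR }')

echo "$result"
```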

Name

mlpack_linear_svm - linear svm is an l2-regularized support vector machine.

Optional Input Options

       --delta (-d) [double]
              Margin of difference between correct class and other classes. Default value 1.

       --epochs (-E) [int]
              Maximum number of full epochs over dataset for psgd. Default value 50.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --input_model_file (-m) [unknown]
              Existing model (parameters).

       --labels_file (-l) [unknown]
              A matrix containing labels (0 or 1) for the points in the training set (y).

       --lambda (-r) [double]
              L2-regularization parameter for training. Default value 0.0001.

       --max_iterations (-n) [int]
              Maximum iterations for optimizer (0 indicates no limit). Default value 10000.

       --no_intercept (-N) [bool]
              Do not add the intercept term to the model.

       --num_classes (-c) [int]
              Number of classes for classification; if unspecified (or 0), the number of classes found in the
              labels will be used. Default value 0.

       --optimizer (-O) [string]
              Optimizer to use for training ('lbfgs' or 'psgd'). Default value 'lbfgs'.

       --seed (-s) [int]
              Random seed. If 0, 'std::time(NULL)' is used. Default value 0.

       --shuffle (-S) [bool]
              Don't shuffle the order in which data points are visited for parallel SGD.

       --step_size (-a) [double]
              Step size for parallel SGD optimizer. Default value 0.01.

       --test_file (-T) [unknown]
              Matrix containing test dataset.

       --test_labels_file (-L) [unknown]
              Matrix containing test labels.

       --tolerance (-e) [double]
              Convergence tolerance for optimizer. Default value 1e-10.

       --training_file (-t) [unknown]
              A matrix containing the training set (the matrix of predictors, X).

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

Optional Output Options

       --output_model_file (-M) [unknown]
              Output for trained linear svm model.

       --predictions_file (-P) [unknown]
              If test data is specified, this matrix is where the predictions for the test set will be saved.

       --probabilities_file (-p) [unknown]
              If test data is specified, this matrix is where the class probabilities for the test set will be
              saved.

Synopsis

mlpack_linear_svm [-d double] [-E int] [-m unknown] [-l unknown] [-r double] [-n int] [-N bool] [-c int] [-O string] [-s int] [-S bool] [-a double] [-T unknown] [-L unknown] [-e double] [-t unknown] [-V bool] [-M unknown] [-P unknown] [-p unknown] [-h -v]

See Also