-a <n> or -a <string>
determines the classification algorithm.
Possible values are:
0 or IB
the IB1 (k‐NN) algorithm (default)
1 or IGTREE
a decision‐tree‐based approximation of IB1
2 or TRIBL
a hybrid of IB1 and IGTREE
3 or IB2
an incremental editing version of IB1
4 or TRIBL2
a non‐parametric version of TRIBL
-b n
number of lines used for bootstrapping (IB2 only)
-B n
number of bins used for discretization of numeric feature values (Default B=20)
--Beam=<n>
limit +v db output to n highest‐vote classes
--clones=<n>
number of threads to use for parallel testing
-c n
clipping frequency for prestoring MVDM matrices
+D
store distributions on all nodes (necessary for using +v db with IGTree, but wastes memory
otherwise)
--Diversify
rescale weight (see docs)
-d val
weigh neighbors as a function of their distance:
Z : equal weights to all (default)
ID : Inverse Distance
IL : Inverse Linear
ED:a : Exponential Decay with factor a (no whitespace!)
ED:a:b : Exponential Decay with factor a and b (no whitespace!)
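The weighting schemes listed for -d can be sketched as below. This is only an illustration that assumes the textbook definitions (inverse distance as 1/(d + eps), inverse linear scaled between the nearest and the farthest neighbor, exponential decay as exp(-a * d^b)); the helper name neighbor_weight and the eps constant are hypothetical and are not taken from the program itself.

```python
import math

def neighbor_weight(scheme, d, d_nearest, d_farthest, a=1.0, b=1.0, eps=1e-9):
    """Vote weight of a neighbor at distance d (hypothetical helper).

    d_nearest and d_farthest are the distances of the closest and the
    most distant neighbor in the current neighbor set.
    """
    if scheme == "Z":    # equal weights to all
        return 1.0
    if scheme == "ID":   # inverse distance; eps avoids division by zero
        return 1.0 / (d + eps)
    if scheme == "IL":   # inverse linear: 1 at the nearest, 0 at the farthest
        if d_farthest == d_nearest:
            return 1.0
        return (d_farthest - d) / (d_farthest - d_nearest)
    if scheme == "ED":   # exponential decay with factors a and b
        return math.exp(-a * d ** b)
    raise ValueError(f"unknown weighting scheme: {scheme!r}")

distances = [0.0, 1.0, 2.0]
weights_il = [neighbor_weight("IL", d, 0.0, 2.0) for d in distances]
print(weights_il)  # [1.0, 0.5, 0.0]
```

With equal weights (Z) every neighbor casts one vote, while the other schemes let closer neighbors count more, which mainly matters when -k is larger than 1.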
-e n
estimate time until n patterns tested
-f file
read from data file 'file' OR use filenames from 'file' for cross validation test
-F format
assume the specified input format (Compact, C4.5, ARFF, Columns, Binary, Sparse )
-G normalization
normalize distributions (+v db option only)
Supported normalizations are:
Probability or 0
normalize between 0 and 1
addFactor:<f> or 1:<f>
add f to all possible targets, then normalize between 0 and 1 (default f=1.0).
logProbability or 2
add 1 to the target weight, take the base‐10 logarithm, and then normalize between 0 and 1
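The three normalizations can be illustrated with a small sketch. It assumes that "normalize between 0 and 1" means dividing each weight by the sum of all weights, so that the results add up to 1; the function name normalize and the all_targets parameter are hypothetical.

```python
import math

def normalize(dist, mode="Probability", f=1.0, all_targets=None):
    """Normalize a class distribution {target: weight} (illustrative only)."""
    if mode == "Probability":
        scores = dict(dist)
    elif mode == "addFactor":
        # add f to every possible target, including targets absent from dist
        targets = all_targets if all_targets is not None else list(dist)
        scores = {t: dist.get(t, 0.0) + f for t in targets}
    elif mode == "logProbability":
        # add 1 to each weight and take the base-10 logarithm
        scores = {t: math.log10(w + 1.0) for t, w in dist.items()}
    else:
        raise ValueError(f"unknown normalization: {mode!r}")
    total = sum(scores.values())
    if total == 0:
        return scores
    return {t: s / total for t, s in scores.items()}

print(normalize({"A": 3.0, "B": 1.0}))  # {'A': 0.75, 'B': 0.25}
smoothed = normalize({"A": 3.0, "B": 1.0}, "addFactor", f=1.0,
                     all_targets=["A", "B", "C"])
```

In the addFactor case the unseen target C receives a non-zero share (1/7 here), which is the usual reason for choosing this kind of smoothing.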
+H or -H
write hashed trees (default +H)
-i file
read the InstanceBase from 'file' (skips phase 1 & 2 )
-I file
dump the InstanceBase in 'file'
-k n
search 'n' nearest neighbors (default n = 1)
-L n
set value frequency threshold to back off from MVDM to Overlap at level n
-l n
fixed feature value length (Compact format only)
-m string
use feature metrics as specified in 'string':
The format is : GlobalMetric:MetricRange:MetricRange
e.g.: mO:N3:I2,5-7
C: cosine distance. (Global only. numeric features implied)
D: dot product. (Global only. numeric features implied)
DC: Dice coefficient
O: weighted overlap (default)
E: Euclidean distance
L: Levenshtein distance
M: modified value difference
J: Jeffrey divergence
S: Jensen‐Shannon divergence
N: numeric values
I: Ignore named values
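To make the GlobalMetric:MetricRange:MetricRange format concrete, here is a hedged parsing sketch. The grammar is inferred from the single example above and is an assumption: the leading "m" of the example is read as the option letter, each range is a metric code followed by comma-separated feature numbers or lo-hi spans, and parse_metric_spec is a hypothetical helper, not part of the program.

```python
import re

def parse_metric_spec(spec):
    """Split 'Global:Range:Range' into (global_metric, {feature: metric})."""
    parts = spec.split(":")
    global_metric = parts[0]
    per_feature = {}
    for part in parts[1:]:
        # metric code (letters) followed by feature numbers, e.g. "I2,5-7"
        m = re.fullmatch(r"([A-Za-z]+)([0-9,\-]+)", part)
        if m is None:
            raise ValueError(f"cannot parse metric range {part!r}")
        code, ranges = m.groups()
        for chunk in ranges.split(","):
            lo, _, hi = chunk.partition("-")
            for idx in range(int(lo), int(hi or lo) + 1):
                per_feature[idx] = code
    return global_metric, per_feature

global_metric, overrides = parse_metric_spec("O:N3:I2,5-7")
print(global_metric, overrides)
```

Under this reading, the example selects weighted overlap globally, treats feature 3 as numeric, and ignores features 2, 5, 6 and 7.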
--matrixin=file
read ValueDifference Matrices from file 'file'
--matrixout=file
store ValueDifference Matrices in 'file'
-n file
create a C4.5-style names file 'file'
-M n
size of MaxBests Array
-N n
number of features (default 2500)
-o s
use s as output filename
--occurrences=<value>
The input file contains occurrence counts (at the last position). value can be one of: train, test or both
-O path
save output using 'path'
-p n
show progress every n lines (default p = 100,000)
-P path
read data using 'path'
-q n
set TRIBL threshold at level n
-R n
solve ties at random with seed n
-s
use the exemplar weights from the input file
-s0
ignore the exemplar weights from the input file
-T n
use feature n as the class label. (default: the last feature)
-t file
test using 'file'
-t leave_one_out
test with the leave‐one‐out testing regimen (IB1 only). You may add --sloppy to speed up leave‐one‐out testing (but see docs)
-t cross_validate
perform cross‐validation test (IB1 only)
-t @file
test using files and options described in 'file'. Supported options: d e F k m o p q R t u v w x % -
--Treeorder=value
ordering of the Tree:
DO: none
GRO: using GainRatio
IGO: using InformationGain
1/V: using 1/# of Values
G/V: using GainRatio/# of Values
I/V: using InfoGain/# of Values
X2O: using X‐square
X/V: using X‐square/# of Values
SVO: using Shared Variance
S/V: using Shared Variance/# of Values
GxE: using GainRatio * SplitInfo
IxE: using InformationGain * SplitInfo
1/S: using 1/SplitInfo
-u file
read value‐class probabilities from 'file'
-U file
save value‐class probabilities in 'file'
-V
Show VERSION
+v level or -v level
set or unset verbosity level, where level is:
s: work silently
o: show all options set
b: show node/branch count and branching factor
f: show calculated feature weights (default)
p: show value difference matrices
e: show exact matches
as: show advanced statistics (memory consuming)
cm: show confusion matrix (implies +vas)
cs: show per‐class statistics (implies +vas)
cf: add confidence to output file (needs -G)
di: add distance to output file
db: add distribution of best matched to output file
md: add matching depth to output file.
k: add a summary for all k neighbors to output file (sets -x)
n: add nearest neighbors to output file (sets -x)
You may combine levels using '+' e.g. +v p+db or -v o+di
-w n
weighting
0 or nw: no weighting
1 or gr: weigh using gain ratio (default)
2 or ig: weigh using information gain
3 or x2: weigh using the chi‐square statistic
4 or sv: weigh using the shared variance statistic
5 or sd: weigh using standard deviation. (all features must be numeric)
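As an illustration of the first two weighting choices, the sketch below computes information gain and gain ratio for a single symbolic feature using the standard entropy-based definitions (gain ratio is information gain divided by split info, the entropy of the feature's own value distribution). The function names are hypothetical, and the program's own computation may differ in detail.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())

def gain_measures(values, labels):
    """Return (information gain, gain ratio) of one feature."""
    total = len(labels)
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, []).append(y)
    # class entropy that remains after splitting on the feature value
    remainder = sum(len(ys) / total * entropy(ys) for ys in by_value.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return info_gain, gain_ratio

labels = ["x", "x", "y", "y"]
ig_good, gr_good = gain_measures(["a", "a", "b", "b"], labels)  # predictive
ig_bad, gr_bad = gain_measures(["a", "b", "a", "b"], labels)    # uninformative
```

A feature that fully predicts the class gets the maximum weight, while a feature that is independent of the class gets a weight of zero. Dividing by split info penalizes features with many distinct values.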
-w file
read weights from 'file'
-w file:n
read weight n from 'file'
-W file
calculate and save all weights in 'file'
+% or -%
do or don't save test result (%) to file
+x or -x
do or don't use the exact match shortcut
(IB1 and IB2 only, default is -x)
-X file
dump the InstanceBase as XML in 'file'
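The options above can be combined on a single command line. The sketch below only assembles an argument list and does not run anything; the executable name "timbl" and the file names are assumptions made for illustration, and timbl_argv is a hypothetical helper.

```python
def timbl_argv(train, test, algorithm="IB", k=1,
               distance_weighting="Z", verbosity=("f",)):
    """Build an argument list from a few of the documented options."""
    argv = ["timbl",                   # assumed executable name
            "-f", train,               # training data file
            "-t", test,                # test file
            "-a", algorithm,           # classification algorithm
            "-k", str(k),              # number of nearest neighbors
            "-d", distance_weighting]  # neighbor weighting scheme
    if verbosity:
        # verbosity levels are combined with '+', e.g. "+v cm+db"
        argv += ["+v", "+".join(verbosity)]
    return argv

argv = timbl_argv("train.data", "test.data", k=3,
                  distance_weighting="ID", verbosity=("cm", "db"))
print(" ".join(argv))
```

Keeping the arguments in a list avoids shell-quoting problems if the list is later passed to something like subprocess.run.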