mlpack_dbscan - dbscan clustering

Additional Information

       For  further  information,  including  relevant  papers, citations, and theory, consult the documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

mlpack-4.5.1                                     29 January 2025                                mlpack_dbscan(1)

Description

       This  program  implements  the DBSCAN algorithm for clustering using accelerated tree-based range search.
       The type of tree that is used may be parameterized, or brute-force range search may also be used.

       The input dataset to be clustered may be specified with the '--input_file (-i)' parameter; the radius  of
       each range search may be specified with the '--epsilon (-e)' parameter, and the minimum number of  points
       in a cluster may be specified with the '--min_size (-m)' parameter.
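
       To make the roles of the radius and the minimum cluster size concrete, the following  is  a  minimal
       Python  sketch  of  DBSCAN  using  brute-force range search. It is not mlpack's implementation: the
       function names, the use of -1 as a noise label, and the choice to count a point as  its  own  neighbor
       are all assumptions made for illustration.

```python
import math

def range_search(points, i, epsilon):
    """Brute-force range search: indices of all points within epsilon of points[i]."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= epsilon]

def dbscan(points, epsilon=1.0, min_size=5):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise (assumed convention)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = range_search(points, i, epsilon)
        if len(neighbors) < min_size:
            labels[i] = NOISE  # may later be claimed by a cluster as a border point
            continue
        labels[i] = cluster
        frontier = list(neighbors)
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, but not dense itself
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = range_search(points, j, epsilon)
            if len(j_neighbors) >= min_size:
                frontier.extend(j_neighbors)  # j is a core point, so keep expanding
        cluster += 1
    return labels

# Two tight groups of five points each, plus one isolated point.
points = [
    (0.0, 0.0), (0.0, 0.3), (0.3, 0.0), (0.3, 0.3), (0.1, 0.1),
    (5.0, 5.0), (5.0, 5.3), (5.3, 5.0), (5.3, 5.3), (5.1, 5.1),
    (10.0, 0.0),
]
labels = dbscan(points, epsilon=0.5, min_size=5)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, -1]
```

       In this sketch, raising min_size to 6 would leave every point labeled as noise, because  no  point
       has six neighbors within the radius.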

       The  '--assignments_file  (-a)'  and  '--centroids_file  (-C)'  output parameters may be used to save the
       output of the clustering. '--assignments_file (-a)' contains the cluster assignments of each  point,  and
       '--centroids_file (-C)' contains the centroids of each cluster.
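
       The relationship between the two outputs can be sketched in Python as follows.  This  page  does  not
       state  how  centroids are computed or how noise points are marked, so the sketch assumes a centroid is
       the coordinate-wise mean of a cluster's points and that noise carries  a  hypothetical  label  of  -1;
       check the actual output files before relying on either assumption.

```python
def cluster_centroids(points, assignments, noise_label=-1):
    """Mean of the points in each cluster, skipping points marked as noise.

    noise_label is a hypothetical marker chosen for this sketch.
    """
    sums = {}
    counts = {}
    for point, label in zip(points, assignments):
        if label == noise_label:
            continue
        if label not in sums:
            sums[label] = [0.0] * len(point)
            counts[label] = 0
        for d, value in enumerate(point):
            sums[label][d] += value
        counts[label] += 1
    return {
        label: tuple(s / counts[label] for s in sums[label])
        for label in sorted(sums)
    }

points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (10.0, 12.0), (50.0, 50.0)]
assignments = [0, 0, 1, 1, -1]
centroids = cluster_centroids(points, assignments)
print(centroids)  # {0: (1.0, 0.0), 1: (10.0, 11.0)}
```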

       The  range search may be controlled with the '--tree_type (-t)', '--single_mode (-S)', and '--naive (-N)'
       parameters. '--tree_type (-t)' can control the type of tree used  for  range  search;  this  can  take  a
       variety  of  values: 'kd', 'r', 'r-star', 'x', 'hilbert-r', 'r-plus', 'r-plus-plus', 'cover', 'ball'. The
       '--single_mode (-S)' parameter will force  single-tree  search  (as  opposed  to  the  default  dual-tree
       search), and '--naive (-N)' will force brute-force range search.
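
       The difference between brute-force and tree-based range search can be illustrated with a small Python
       sketch.  It  builds  a simple kd-tree and answers one query at a time, which loosely corresponds to
       single-tree search; dual-tree search is not shown. The data structure and function  names  here  are
       illustrative assumptions and do not reflect mlpack's tree implementations.

```python
import math
import random

def naive_range_search(points, query, epsilon):
    """Brute force: test every point against the query."""
    return [i for i, p in enumerate(points) if math.dist(p, query) <= epsilon]

def build_kd(indices, points, depth=0):
    """Build a kd-tree over point indices, splitting on one axis per level."""
    if not indices:
        return None
    axis = depth % len(points[0])
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return {
        "index": indices[mid],
        "axis": axis,
        "left": build_kd(indices[:mid], points, depth + 1),
        "right": build_kd(indices[mid + 1:], points, depth + 1),
    }

def kd_range_search(node, points, query, epsilon, found=None):
    """Collect indices within epsilon of query, pruning subtrees that cannot match."""
    if found is None:
        found = []
    if node is None:
        return found
    p = points[node["index"]]
    if math.dist(p, query) <= epsilon:
        found.append(node["index"])
    diff = query[node["axis"]] - p[node["axis"]]
    if diff <= epsilon:   # the left side (smaller coordinates) may still hold matches
        kd_range_search(node["left"], points, query, epsilon, found)
    if diff >= -epsilon:  # the right side (larger coordinates) may still hold matches
        kd_range_search(node["right"], points, query, epsilon, found)
    return found

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
tree = build_kd(list(range(len(points))), points)
query, epsilon = (0.5, 0.5), 0.1

same = sorted(kd_range_search(tree, points, query, epsilon)) == naive_range_search(points, query, epsilon)
print(same)  # True
```

       Both searches return the same neighbors; the tree only reduces how many points must be examined.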

       An  example  usage to run DBSCAN on the dataset in 'input.csv' with a radius of 0.5 and a minimum cluster
       size of 5 is given below:

       $ mlpack_dbscan --input_file input.csv --epsilon 0.5 --min_size 5
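
       When the program is driven from a script, the command line can be assembled from the  long  options
       documented on this page. The following Python sketch is one way to do that; the helper names and the
       output file names 'assignments.csv' and 'centroids.csv' are hypothetical, and the sketch only builds
       the argument list unless run_dbscan() is called explicitly.

```python
import subprocess

def dbscan_command(input_file, epsilon=None, min_size=None,
                   assignments_file=None, centroids_file=None,
                   tree_type=None, single_mode=False, naive=False):
    """Build an argument list for mlpack_dbscan from the options on this page."""
    cmd = ["mlpack_dbscan", "--input_file", input_file]
    if epsilon is not None:
        cmd += ["--epsilon", str(epsilon)]
    if min_size is not None:
        cmd += ["--min_size", str(min_size)]
    if tree_type is not None:
        cmd += ["--tree_type", tree_type]
    if single_mode:
        cmd.append("--single_mode")
    if naive:
        cmd.append("--naive")
    if assignments_file is not None:
        cmd += ["--assignments_file", assignments_file]
    if centroids_file is not None:
        cmd += ["--centroids_file", centroids_file]
    return cmd

def run_dbscan(cmd):
    """Run the command; requires mlpack_dbscan on PATH and an existing input file."""
    return subprocess.run(cmd, check=True)

cmd = dbscan_command(
    "input.csv",
    epsilon=0.5,
    min_size=5,
    assignments_file="assignments.csv",  # hypothetical output name
    centroids_file="centroids.csv",      # hypothetical output name
)
print(" ".join(cmd))
```

       Passing the arguments as a list avoids shell quoting problems with unusual file names.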

Name

mlpack_dbscan - dbscan clustering

Optional Input Options

       --epsilon (-e) [double]
              Radius of each range search. Default value 1.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --min_size (-m) [int]
              Minimum number of points for a cluster. Default value 5.

       --naive (-N) [bool]
              If set, brute-force range search (not tree-based) will be used.

       --selection_type (-s) [string]
              If using point selection policy, the type of selection to use ('ordered', 'random'). Default value
              'ordered'.

       --single_mode (-S) [bool]
              If set, single-tree range search (not dual-tree) will be used.

       --tree_type (-t) [string]
              If using single-tree or dual-tree search, the type of tree to use ('kd', 'r', 'r-star', 'x',
              'hilbert-r', 'r-plus', 'r-plus-plus', 'cover', 'ball'). Default value 'kd'.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

Optional Output Options

       --assignments_file (-a) [unknown]
              Output matrix for assignments of each point.

       --centroids_file (-C) [unknown]
              Matrix to save output centroids to.

Required Input Options

       --input_file (-i) [unknown]
              Input dataset to cluster.

Synopsis

mlpack_dbscan -i unknown [-e double] [-m int] [-N bool] [-s string] [-S bool] [-t string] [-V bool] [-a unknown] [-C unknown] [-h -v]

See Also