FreeContact - fast protein contact predictor

Author

       Laszlo Kajan, <lkajan@rostlab.org>

Description

       FreeContact is a protein residue contact predictor optimized for speed.  Its input is a multiple sequence
       alignment. FreeContact can function as an accelerated drop-in replacement for the published contact
       predictors EVfold-mfDCA of D. S. Marks (2011) and PSICOV of D. Jones (2011).  FreeContact is accelerated
       by a combination of vector instructions, multiple threads, and a faster implementation of key parts.
       Depending on the alignment, 10-fold or higher speedups are possible.

       A sufficiently large alignment is required for meaningful results.  At minimum, the effective
       (after-weighting) sequence count of the alignment should exceed the length of the query sequence.
       Alignments with tens of thousands of (effective) sequences are considered good input.

       For example, jackhmmer(1) from the hmmer package or hhblits(1) from hhsuite can be used to generate the
       alignments.

FreeContact

EXPORT_OK
       get_ps_evfold()
           Get parameters for EVfold-mfDCA operating mode.

       get_ps_psicov()
           Get parameters for PSICOV 'improved results' operating mode.

       get_ps_psicov_sd()
           Get parameters for PSICOV 'sensible default' operating mode. This is much faster than 'improved
           results' for a slight loss of precision.

           The get_ps_*() functions return a hash of arguments (clustpc => num,...,rho => num) that can be used
           with get_seq_weights(), run() or run_with_seq_weights(). The arguments correspond to the published
           parametrization of the respective method.
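
           As a minimal sketch of how such an argument hash can be adjusted before a call: %parset below is
           a hypothetical stand-in with placeholder values, not the published parametrization; in real use
           it would come from one of the get_ps_*() functions. Keys listed later in a Perl list override
           earlier ones, so one argument can be changed while the others keep their values.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hash returned by FreeContact::get_ps_evfold().
# The values are placeholders for illustration only.
my %parset = (clustpc => 0.5, pseudocnt => 1, rho => 0.1);

# Keys listed later win, so only rho changes.
my %args = (%parset, rho => 0.2);

printf "%s => %s\n", $_, $args{$_} for sort keys %args;
```

           The merged hash can then be passed in the same way as in the synopsis, e.g.
           run(ali => \@aln, %args).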

FreeContact::Predictor

Constructor
       new( dbg => bool )
           Creates a "FreeContact::Predictor" object.

   Methods
       get_seq_weights()
           Defaults for the arguments are obtained with get_ps_evfold().

       run(ali => [], clustpc => dbl, density => dbl, gapth => dbl, mincontsep => uint, pseudocnt => dbl,
       pscnt_weight => dbl, estimate_ivcov => bool, shrink_lambda => dbl, cov20 => bool, apply_gapth => bool,
       rho => dbl, [veczw => bool], [num_threads => int], [icme_timeout => int], [timing => {}])
           Defaults for the arguments are obtained with get_ps_evfold().

           ali Reference to an array holding the alignment rows as strings. The first row must hold the query,
               with no gaps.

           clustpc
               BLOSUM-style clustering similarity threshold [0-1].

           icme_timeout
               Inverse covariance matrix estimation timeout in seconds. Default: 1800.

               The estimation sometimes gets stuck. If the timeout is reached, the run() method dies with
               "Caught FreeContact timeout exception: ...". You can catch this exception and handle it as
               needed, e.g. by setting a higher rho value.

           num_threads
               Number of OpenMP threads to use. If unset, all CPUs are used.

           timing
               If given, this hash reference is filled with wall-clock timing results in seconds:

                 {
                   num_threads =>  NUM,
                   seqw =>         NUM,
                   pairfreq =>     NUM,
                   shrink =>       NUM,
                   inv =>          NUM,
                   all =>          NUM
                 }
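
           The timeout handling described under icme_timeout could be sketched as follows. This is only an
           illustration: $run is a hypothetical stub standing in for $predictor->run(...), so that the retry
           logic can be exercised without FreeContact installed; run_with_retry() is a made-up helper, and the
           rho and icme_timeout values are arbitrary. The only detail taken from the text above is the
           "Caught FreeContact timeout exception" message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $predictor->run(%args): it dies with a message of the
# documented form unless rho is at least 0.3.
my $run = sub {
    my %args = @_;
    die "Caught FreeContact timeout exception: stub\n" if $args{rho} < 0.3;
    return { fro => [] };
};

# Try each rho in turn; move on after a timeout, rethrow any other error.
sub run_with_retry {
    my ($runner, $args, @rhos) = @_;
    for my $rho (@rhos) {
        my $res = eval { $runner->(%$args, rho => $rho) };
        return ($res, $rho) if defined $res;
        die $@ unless $@ =~ /Caught FreeContact timeout exception/;
    }
    die "all rho values timed out\n";
}

my ($contacts, $used_rho) = run_with_retry($run, { icme_timeout => 600 }, 0.1, 0.2, 0.3);
print "succeeded with rho = $used_rho\n";
```

           With a real predictor, the stub would be replaced by a closure such as
           sub { $predictor->run(ali => \@aln, @_) }.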

           run() returns a hash reference of contact prediction results:

             {
               fro => [  # identifier of scoring scheme
                 [
                   I,    # 0-based index of amino acid i
                   J,    # 0-based index of amino acid j
                   SCORE # contact score
                 ], ...
               ],
               MI => ...,
               l1norm => ...
             }

           Use 'fro' scores with EVfold.
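
           A minimal sketch of post-processing such a result, assuming the structure shown above. The scores
           in $contacts are made up, top_contacts() is a hypothetical helper, and the minimum sequence
           separation of 6 is an arbitrary choice for illustration, not a recommendation of the package.

```perl
use strict;
use warnings;

# Made-up result in the documented shape: each entry is [I, J, SCORE].
my $contacts = {
    fro => [ [0, 5, 0.12], [2, 30, 0.87], [10, 11, 0.95], [4, 20, 0.40] ],
};

# Keep pairs at least $min_sep residues apart and return the $n best by score.
sub top_contacts {
    my ($pairs, $min_sep, $n) = @_;
    my @kept = sort { $b->[2] <=> $a->[2] }
               grep { abs($_->[1] - $_->[0]) >= $min_sep } @$pairs;
    $n = @kept if $n > @kept;
    return @kept[0 .. $n - 1];
}

my @best = top_contacts($contacts->{fro}, 6, 2);

# The indices in the result are 0-based; print them as 1-based residue numbers.
printf "%d %d %.2f\n", $_->[0] + 1, $_->[1] + 1, $_->[2] for @best;
```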

Name

       FreeContact - fast protein contact predictor

Synopsis

         use Carp qw(confess);
         use FreeContact;

         # Read the alignment, one row per line; the first row is the query.
         open(my $fh, '<', '/usr/share/doc/libfreecontact-perl/examples/demo_1000.aln') || confess($!);
         my @aln = <$fh>; chomp(@aln); close($fh);

         # Simplest call, using the default arguments.
         my $contacts = FreeContact::Predictor->new()->run(ali => \@aln);

         # Explicit parameter set, single thread.
         my $predictor = FreeContact::Predictor->new();
         my %parset = FreeContact::get_ps_evfold();
         $contacts = $predictor->run(ali => \@aln, %parset, num_threads => 1);

         # Compute the sequence weights first, then reuse them for the prediction.
         my($aliw, $wtot) = $predictor->get_seq_weights(ali => \@aln, num_threads => 1);
         $contacts = $predictor->run_with_seq_weights(ali => \@aln, aliw => $aliw, wtot => $wtot, num_threads => 1);

See Also