
norsnet - identifies unstructured loops from sequence

Author

       A. Schlessinger <avnersch@gmail.com>

Description

       NORSnet is a neural-network-based method for identifying unstructured loops.

       NORSnet was trained to distinguish between very long contiguous segments with non-regular secondary
       structure (NORS regions) and well-folded proteins. NORSnet was trained on predicted information rather
       than on experimental data. It was therefore optimized on a large data set that is not biased by today's
       experimental means of capturing disorder. As a result, NORSnet reaches into regions of sequence space that
       are not covered by the specialized disorder predictors. One disadvantage of this approach is that it is not
       optimal for identifying the "average" disordered region.

   Conversion of PSI-BLAST alignment to HSSP format
       The most up-to-date procedure can be found at
       <https://www.rostlab.org/owiki/index.php/How_to_generate_an_HSSP_file_from_alignment#Generating_an_HSSP_profile>.

       1. Convert BLAST output to a Single Alignment Format (SAF):
            /usr/share/librg-utils-perl/blast2saf.pl fasta=<query_fasta_file> maxAli=3000 eSaf=1 \
             saf=<saf_formatted_file> <blast_output>

       2. Convert SAF format to HSSP:
            /usr/share/librg-utils-perl/copf.pl <saf_formatted_file> formatIn=saf formatOut=hssp \
             fileOut=<hssp_formatted_file> exeConvertSeq=convert_seq

       3. Filter results to 80% redundancy:
            /usr/share/librg-utils-perl/hssp_filter.pl red=80 <hssp_formatted_file> fileOut=<filtered_hssp_formatted_file>
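
       The three steps above can be chained from a script. The following is a minimal sketch, not part of
       the norsnet package: the function names and the example file names are hypothetical, and only the
       script paths and arguments shown in the steps above are used.

```python
import subprocess
from pathlib import PurePosixPath

# Directory holding the helper scripts, as given in the steps above.
SCRIPT_DIR = PurePosixPath("/usr/share/librg-utils-perl")


def hssp_conversion_commands(query_fasta, blast_output, saf_file, hssp_file, filtered_hssp_file):
    """Return the three conversion commands as argument lists, in execution order."""
    blast2saf = [
        str(SCRIPT_DIR / "blast2saf.pl"),
        f"fasta={query_fasta}", "maxAli=3000", "eSaf=1",
        f"saf={saf_file}", str(blast_output),
    ]
    copf = [
        str(SCRIPT_DIR / "copf.pl"), str(saf_file),
        "formatIn=saf", "formatOut=hssp",
        f"fileOut={hssp_file}", "exeConvertSeq=convert_seq",
    ]
    hssp_filter = [
        str(SCRIPT_DIR / "hssp_filter.pl"), "red=80",
        str(hssp_file), f"fileOut={filtered_hssp_file}",
    ]
    return [blast2saf, copf, hssp_filter]


def run_conversion(commands):
    """Run each command in turn and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

       Building the argument lists separately from running them makes it possible to inspect or log the
       commands before anything is executed.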

   Output format
   Output mode 1

       Tabular output, columns:

        pos            amino acid number (1..)
        res            residue 1-letter code
        node1          output of neural network node 1
        node2          output of neural network node 2
        pred           node1 / ( node1 + node2 )
        n40            pred < 0.40 ? '-' : 'N'
        n40fil         at least 31 AA long stretches of 'N' in n40
        n59            pred < 0.59 ? '-' : 'N'
        n59fil         at least 31 AA long stretches of 'N' in n59

       'N' is for non-regular secondary structure.
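
       The derived columns can be reproduced from the two node outputs. The sketch below is an illustration
       only and is not taken from the NORSnet source: the function names are hypothetical, node1 + node2 is
       assumed to be non-zero, and the "fil" columns are assumed to replace runs of 'N' shorter than 31
       residues with '-'.

```python
from itertools import groupby


def filter_short_runs(states, min_len=31):
    """Keep only runs of 'N' that are at least min_len long; everything else becomes '-'."""
    parts = []
    for symbol, group in groupby(states):
        length = len(list(group))
        keep = symbol == "N" and length >= min_len
        parts.append(("N" if keep else "-") * length)
    return "".join(parts)


def norsnet_columns(node1, node2, threshold=0.40, min_len=31):
    """Compute pred, the thresholded string (n40 or n59) and its filtered variant."""
    # pred = node1 / (node1 + node2), one value per residue
    preds = [a / (a + b) for a, b in zip(node1, node2)]
    # pred < threshold ? '-' : 'N'
    raw = "".join("-" if p < threshold else "N" for p in preds)
    return preds, raw, filter_short_runs(raw, min_len)
```

       Calling norsnet_columns with threshold=0.59 gives the n59 and n59fil columns under the same
       assumptions.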

Environment

       NORSNET_ROOT
           Overrides /usr/share/norsnet, the path to helper scripts and data files.
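
       A wrapper script can resolve the same directory before calling norsnet. This is a minimal sketch
       under the assumption that an unset variable falls back to the default path; the function name is
       hypothetical and how norsnet itself treats an empty value is not documented here.

```python
import os

# Default location of helper scripts and data files, as documented above.
DEFAULT_ROOT = "/usr/share/norsnet"


def norsnet_root(environ=None):
    """Return NORSNET_ROOT if it is set, otherwise the documented default."""
    env = os.environ if environ is None else environ
    return env.get("NORSNET_ROOT", DEFAULT_ROOT)
```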

Examples

        norsnet /usr/share/doc/norsnet/examples/cad23.f /usr/share/doc/norsnet/examples/cad23-fil.rdbProf /usr/share/doc/norsnet/examples/cad23-fil.hssp cad23.norsnet cad23 /usr/share/doc/norsnet/examples/cad23.profbval
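
       When norsnet is called from a script, the positional arguments can be assembled in the order given
       in the Synopsis. The sketch below is illustrative: the function name is hypothetical, and it relies
       on the documented '-' placeholder for OUTPUT_MODE when only DEBUG is to be set.

```python
def norsnet_argv(fasta, rdbprof, hssp, output, protein_name, profbval,
                 output_mode=None, debug=None):
    """Build the norsnet argument list in Synopsis order."""
    argv = ["norsnet", fasta, rdbprof, hssp, output, protein_name, profbval]
    # OUTPUT_MODE is positional, so it must be present whenever DEBUG is given;
    # '-' selects the default mode in that case.
    if output_mode is not None or debug is not None:
        argv.append("-" if output_mode is None else str(output_mode))
    if debug is not None:
        argv.append(str(debug))
    return argv


examples = "/usr/share/doc/norsnet/examples"
argv = norsnet_argv(
    f"{examples}/cad23.f",
    f"{examples}/cad23-fil.rdbProf",
    f"{examples}/cad23-fil.hssp",
    "cad23.norsnet",
    "cad23",
    f"{examples}/cad23.profbval",
)
```

       The resulting list can be passed to subprocess.run(argv, check=True) on a system where norsnet is
       installed.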

Files

       *.norsnet
           default output file extension

       /usr/share/doc/norsnet/examples
           default precomputed input files directory

Name

       norsnet - identifies unstructured loops from sequence

Notes

       1. It is recommended to create the profiles using 3 iterations of PSI-BLAST against a large database.
       2. It is also recommended to filter the HSSP files using hssp_filter.pl from the Prof package with the
       following command: perl hssp_filter.pl hssp_file red=80

Options

       FASTA_FILE
           File containing protein amino-acid sequence in fasta format.

       RDBPROF_FILE
           Secondary structure and solvent accessibility prediction by PROF in rdb format.

       HSSP_FILE
           PSI-BLAST alignment profile file converted to HSSP format.

       OUTPUT_FILE
           The name of the final NORSnet output file.

       PROFBVAL_FILE
           Flexible/rigid residues prediction by profbval(1) in rdb format (mode 5).

       OUTPUT_MODE
           NORSnet can create output files in different formats for different purposes. Valid modes are `1', `2'
           or `3'. Default mode: 1.

           -   Default mode. Use this when you do not want to give a value here but you want to specify debug.

           1   for metadisorder(1)

       DEBUG
           Set to 1 to print debugging messages.

Output

       number -
           residue number

       residue -
           residue type

       raw -
           raw value of the difference between the two output nodes

References

       Schlessinger, A., Liu, J., and Rost, B. (2007). Natively unstructured loops differ from other loops. PLoS
       Comput Biol, 3(7), e140.

See Also

       profbval(1), prof(1).

       Main website
           <http://www.predictprotein.org/>

1.0.17                                             2022-01-18                                         NORSNET(1)

Synopsis

       norsnet <FASTA_FILE> <RDBPROF_FILE> <HSSP_FILE> <OUTPUT_FILE> <PROTEIN_NAME> <PROFBVAL_FILE>
       <OUTPUT_MODE> <DEBUG>
