SECRETOOL FAQS

 

Q: Why is SECRETOOL specific for fungi?

A: To some extent, other eukaryotic proteomes (except plants) may be screened with this tool with acceptable performance. However, we calibrated it on 70 different fungal proteomes, so we cannot offer any insight into its performance on organisms other than fungi.

 

Q: How long does a full Classical Secretion Analysis take?

A: The table below gives approximate analysis times for different numbers of processed sequences. Times may vary depending on how many sequences reach the HMMer and BLAST analyses, which are the most time-consuming steps.

Elapsed time (hh:mm:ss) by number of sequences processed:

Optional analysis        100 sequences    1000 sequences    10000 sequences
With BLASTp & HMMer      00:03:38         00:16:18          02:49:22
With BLASTp              00:00:57         00:11:43          02:14:00
With HMMer               00:02:56         00:06:10          00:46:13
No BLASTp & no HMMer     00:00:23         00:03:21          00:41:05


 

Q: When I submit a FASTA file to SECRETOOL, it does not work. Why?

A: SECRETOOL recognizes the following extensions: .fasta, .fa, .faa, .fas, .fna, .sequences and .txt. If your FASTA file has any other extension, rename it to one of the aforementioned extensions and it should work (unless it does not comply with the requirements described in the next point).
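As a quick local check before uploading, the extension list above can be tested with a few lines of Python. This is only a sketch: the function name is hypothetical, and the comparison is case-sensitive because this FAQ does not say whether upper-case extensions are accepted.

```python
from pathlib import Path

# Extensions listed in the answer above.
ACCEPTED_EXTENSIONS = {".fasta", ".fa", ".faa", ".fas", ".fna", ".sequences", ".txt"}


def has_accepted_extension(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions.

    Assumption: the match is exact and case-sensitive.
    """
    return Path(filename).suffix in ACCEPTED_EXTENSIONS
```

For instance, a file called proteome.pep would fail this check and would need to be renamed to something like proteome.fasta.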

 

Q: Why must FASTA headers be at most 20 characters long?

A: Some tools in our pipeline require headers of at most 20 characters. If a header is longer, those tools rename the sequence, and if the sequence is not uniquely identified by its first 20 characters, the resulting misidentification may crash the pipeline in later steps.

 

Q: Why does your re-header tool only rename FASTA files from the Joint Genome Institute, BROAD Institute and NCBI repositories?

A: We accept not only headers from those repositories, but any header that follows the guidelines of those institutes, e.g.:

> jgi|Sphst1|43501|fgenesh1_pg.10_#_1

> gi|383872198|tpg|DAA35002.1| TPA_exp: alpha-xylosidase [Aspergillus niger ATCC 1015]

>VDAG_00001T0 | VDAG_00001 | Verticillium dahliae VdLs.17 predicted protein (262 aa)

>AB00001.1

On the other hand, a protein multi-FASTA file with headers in the following format is not suitable for SECRETOOL, since the proteins cannot be told apart by their first 20 characters:

>12939.Fgenesh_sirp_germ_KB872397.4_predicted_protein_6

>12939.Fgenesh_sirp_germ_KB872397.4_predicted_protein_5
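A minimal Python sketch for spotting this problem before submitting a file is shown here. It assumes that the 20 characters are counted after the ">" sign and after any leading spaces, which this FAQ does not state. The function name and the short placeholder sequences are hypothetical.

```python
from collections import defaultdict

MAX_HEADER_LENGTH = 20  # limit stated in this FAQ


def find_ambiguous_headers(fasta_text: str, limit: int = MAX_HEADER_LENGTH) -> dict:
    """Map each truncated header shared by several sequences to the full headers."""
    by_prefix = defaultdict(list)
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            # Assumption: only the first `limit` characters identify the sequence.
            by_prefix[header[:limit]].append(header)
    return {prefix: headers for prefix, headers in by_prefix.items() if len(headers) > 1}


# The two headers from the example above, with placeholder (invented) sequences.
example = """>12939.Fgenesh_sirp_germ_KB872397.4_predicted_protein_6
MKT
>12939.Fgenesh_sirp_germ_KB872397.4_predicted_protein_5
MKA
"""
ambiguous = find_ambiguous_headers(example)
```

An empty result means every header is unique within its first 20 characters. For the example above, both headers fall under the same truncated name, so the result is not empty.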

 

Q: What does the FASTA file obtained as a result contain?

A: It contains the sequences of the predicted secretory proteins.

 

Q: How do I interpret the BLAST output from the results?

A: For each query sequence the results are displayed in a custom output with the following fields:

query accession: Name of the sequence in our query FASTA

subject accession: Name of the targeted sequence in the BLAST database

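e-value: Expected number of hits with an equal or better score that would occur by chance (the lower the value, the more significant the hit)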

score: BLAST score of the hit

subject title: Putative function of the targeted sequence
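If the output needs to be processed further, the five fields can be read into a simple structure. The sketch below rests on two assumptions that this FAQ does not confirm: that each hit is one tab-separated line, and that the fields come in the order listed above. The function name and the sample line are invented for illustration.

```python
FIELDS = ["query_accession", "subject_accession", "evalue", "score", "subject_title"]


def parse_blast_table(text: str) -> list:
    """Parse tab-separated hit lines into dictionaries keyed by field name."""
    rows = []
    for line in text.splitlines():
        parts = line.rstrip("\n").split("\t")
        if len(parts) != len(FIELDS):
            continue  # skip blank or unexpected lines
        row = dict(zip(FIELDS, parts))
        row["evalue"] = float(row["evalue"])
        row["score"] = float(row["score"])
        rows.append(row)
    return rows


# Invented sample line, not real SECRETOOL output.
sample = "query_1\tsubject_1\t1e-50\t180\thypothetical protein\n"
hits = parse_blast_table(sample)
significant = [hit for hit in hits if hit["evalue"] < 1e-5]
```

The 1e-5 cutoff in the last line is only an example threshold, not a recommendation from SECRETOOL.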

 

Q: What is the meaning of the fields in the HMMer 3 output?

A: For a complete and accurate description of the results you can go here.

 

Q: What is the aim of the FASTA Splitter tool?

A: It allows users to split their FASTA files into smaller files, each containing a fixed number of sequences. This is helpful when a user plans to run an analysis with a tool that accepts only a limited number of input sequences.
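The idea can be illustrated with a short Python sketch. This is not the code behind FASTA Splitter, only an assumed equivalent: the function name is hypothetical, and the chunks are returned as strings rather than written to files.

```python
def split_fasta(fasta_text: str, sequences_per_file: int) -> list:
    """Split FASTA text into chunks of at most `sequences_per_file` records."""
    if sequences_per_file < 1:
        raise ValueError("sequences_per_file must be at least 1")

    # Group lines into records: one header line plus its sequence lines.
    records = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            records.append([line])
        elif records and line.strip():
            records[-1].append(line)

    # Join consecutive records into fixed-size chunks.
    chunks = []
    for start in range(0, len(records), sequences_per_file):
        group = records[start:start + sequences_per_file]
        chunks.append("\n".join("\n".join(record) for record in group) + "\n")
    return chunks
```

Splitting a file of five sequences with sequences_per_file=2 gives three chunks holding two, two and one sequences, so only the last chunk can be smaller than the requested size.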

 

Q: What is the correct format of the input list for Sequence retriever?

A: The input list must contain the sequence IDs exactly as they appear in the headers of the large FASTA file.

For example, if the headers have this format:

>VDAG_00001T0 | VDAG_00001 | Verticillium dahliae VdLs.17 predicted protein (262 aa)

The IDs in the list should have the following format and appear one per line:

VDAG_00001T0

VDAG_00002T0

VDAG_00003T0
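The matching between the list and the headers can be sketched in Python as follows. This is an illustration rather than the actual Sequence retriever code: it assumes that the sequence ID is the first whitespace-delimited token of the header, as in the example above, and the function name is hypothetical.

```python
def retrieve_sequences(fasta_text: str, wanted_ids: list) -> str:
    """Return the FASTA records whose ID appears in `wanted_ids`."""
    wanted = {item.strip() for item in wanted_ids if item.strip()}
    kept_lines = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            # Assumption: the ID is the first token of the header.
            seq_id = header.split()[0] if header else ""
            keep = seq_id in wanted
        if keep:
            kept_lines.append(line)
    return "\n".join(kept_lines) + ("\n" if kept_lines else "")
```

With this rule, the ID VDAG_00001T0 in the list matches the header ">VDAG_00001T0 | VDAG_00001 | ..." but a list entry such as VDAG_00001 would not, which is why the IDs must be copied exactly from the headers.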

 

Q: Why use your independent tools instead of using them on their native websites?

A: We do not intend to replace the original tools. The only reasons we include them on our website are to keep all the resources together and to bypass the restrictions on the maximum number of input sequences, thus allowing users to analyze a full proteome in a single step. Apart from this, the tools presented here are limited to secreted protein predictions (other subcellular locations are not eligible).

 

Q: What PredGPI website output does the PredGPI parser use for filtering the results?

A: The PredGPI parser reads files in the format obtained by clicking "Download" when submitting data to PredGPI. Results copied and pasted from the screen (obtained by selecting the "Display" option in PredGPI) will not be processed by our tool.

 

If you have any questions that are not covered in this FAQ, please contact us at gap@cicbiogune.es