Cut-offs

» SignalP

SignalP 4.1 predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.

For each protein, this tool returns the probability of being secreted, expressed as a percentage. Proteins whose probability is below the cut-off selected on the web page are removed from the pipeline. The default cut-off is 80% (= 0.8).
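
As a rough illustration of how such a probability cut-off behaves, the sketch below filters a small set of made-up proteins. The function name, the protein IDs and the values are hypothetical and do not come from SignalP or from the pipeline itself.

```python
# Minimal sketch of a probability cut-off (hypothetical names and data,
# not actual SignalP output).
def filter_by_probability(probabilities, cutoff=0.8):
    """Keep proteins whose secretion probability is at least `cutoff`."""
    return {pid: p for pid, p in probabilities.items() if p >= cutoff}


# Illustrative input: protein ID -> probability of being secreted (0 to 1).
example = {"prot_A": 0.95, "prot_B": 0.80, "prot_C": 0.42}
kept = filter_by_probability(example)
print(sorted(kept))
```

In this sketch a protein exactly at the cut-off is kept, because only proteins with a smaller value are removed.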

» TargetP

TargetP 1.1 predicts the subcellular location of eukaryotic proteins. The location assignment is based on the predicted presence of any of the N-terminal presequences: chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP).

As with SignalP, this tool returns, for each protein, the probability of being secreted, expressed as a percentage. Proteins whose probability is below the cut-off selected on the web page are removed from the pipeline. The default cut-off is 80% (= 0.8).

» TMHMM

TMHMM is a membrane protein topology prediction method based on a hidden Markov model. It predicts transmembrane helices and discriminates between soluble and membrane proteins with a high degree of accuracy.

This tool returns the number of transmembrane domains in a protein. The cut-off selected on the web page removes from the pipeline all proteins with more domains than the defined value. By default, the maximum number of domains allowed is 1.
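
The effect of this maximum can be sketched as follows. The function name and the domain counts are hypothetical, chosen only to show which proteins would pass the default limit of 1.

```python
# Minimal sketch of a maximum-domain cut-off (hypothetical names and data,
# not actual TMHMM output).
def filter_by_tm_domains(tm_counts, max_domains=1):
    """Keep proteins with at most `max_domains` transmembrane domains."""
    return [pid for pid, n in tm_counts.items() if n <= max_domains]


# Illustrative input: protein ID -> number of predicted transmembrane domains.
tm_example = {"prot_A": 0, "prot_B": 1, "prot_C": 3}
tm_kept = filter_by_tm_domains(tm_example)
print(tm_kept)
```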

» PSORTb

PSORTb examines a given protein sequence for amino acid composition, similarity to proteins of known localization, presence of a signal peptide, transmembrane alpha-helices and motifs corresponding to specific localizations in order to predict bacterial protein subcellular localizations.

This tool returns the predicted subcellular location of a protein, together with a confidence score for each localization site. Proteins whose score for the Extracellular location is smaller than the cut-off selected on the web page (7.5) are removed from the pipeline.
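
A score cut-off on a single localization site could look like the sketch below. The data layout (one dictionary of site scores per protein), the function name and the values are assumptions made for illustration, not the PSORTb output format.

```python
# Minimal sketch of a localization-score cut-off (hypothetical names and
# data layout, not the actual PSORTb output format).
def filter_by_site_score(scores, site="Extracellular", cutoff=7.5):
    """Keep proteins whose score for `site` is at least `cutoff`."""
    return [
        pid
        for pid, by_site in scores.items()
        # A protein with no score for the site is treated as scoring 0.
        if by_site.get(site, 0.0) >= cutoff
    ]


# Illustrative input: protein ID -> {localization site: confidence score}.
site_example = {
    "prot_A": {"Extracellular": 9.7, "Cytoplasmic": 0.1},
    "prot_B": {"Extracellular": 7.5, "CellWall": 2.0},
    "prot_C": {"Cytoplasmic": 8.9, "Extracellular": 0.3},
}
site_kept = filter_by_site_score(site_example)
print(site_kept)
```

The same function applies to any tool that reports one score per site, by passing a different `cutoff`.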

» WoLF PSORT

WoLF PSORT is an extension of the PSORT II program for protein subcellular location prediction, designed to carry out this analysis in eukaryotic cells. WoLF PSORT converts protein amino acid sequences into numerical localization features based on sorting signals, amino acid composition and functional motifs such as DNA-binding motifs.

This tool returns the predicted subcellular location of a protein, together with a confidence score for each localization site. Proteins whose score for the Extracellular location is smaller than the cut-off selected on the web page (14) are removed from the pipeline.

» BLASTp

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

This tool returns the regions of local similarity between the proteins predicted as secreted and the available databases. The cut-off selected on the web page removes from the results all hits with an expectation value (e-value) greater than the indicated threshold (1e-05), because a lower e-value indicates a more significant match.
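
The sketch below shows one way an e-value cut-off can be applied to a list of hits. It assumes the usual BLAST convention, in which a lower e-value means a more significant match, so hits at or below the cut-off are the ones retained. The tuple layout, the function name and the values are hypothetical.

```python
# Minimal sketch of an e-value cut-off (hypothetical names and data).
# Assumption: lower e-values are more significant, so hits with an
# e-value at or below the cut-off are kept.
def filter_hits_by_evalue(hits, max_evalue=1e-05):
    """Keep (query, subject, evalue) hits with evalue <= max_evalue."""
    return [hit for hit in hits if hit[2] <= max_evalue]


# Illustrative input: (query ID, subject ID, e-value).
hits_example = [
    ("prot_A", "db_1", 3e-40),
    ("prot_A", "db_2", 0.002),
    ("prot_B", "db_3", 1e-05),
]
hits_kept = filter_hits_by_evalue(hits_example)
print(hits_kept)
```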

» HMMER

HMMER is a commonly used software package for sequence analysis. Its general usage is to identify homologous protein or nucleotide sequences. It does this by comparing a profile-HMM to either a single sequence or a database of sequences. Sequences that score significantly better to the profile-HMM compared to a null model are considered to be homologous to the sequences that were used to construct the profile-HMM.

For this tool, all the results obtained from the analysis are returned to the user, so it is not necessary to establish a cut-off value.

» topGO

topGO (topology-based Gene Ontology scoring) is a software package for calculating the significance of biological terms from gene expression data. It implements various standard and advanced new algorithms for determining the relevance of Gene Ontology groups from microarrays.

To perform this analysis, it is necessary to determine which of the proteins predicted as secreted are enriched. This requires a cut-off value: a score for Table of Counts input, or a p-value for Differential Expression tables. By default, proteins with 2 or more reads, or with a p-value smaller than 0.01, are considered enriched.
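
The two default criteria can be sketched as two small filters, one per input type. The function names, the input layout and the values are hypothetical and are not part of topGO or of the pipeline.

```python
# Minimal sketch of the two enrichment criteria (hypothetical names and
# data, not topGO code).
def enriched_by_counts(counts, min_reads=2):
    """Table of Counts: proteins with at least `min_reads` reads."""
    return sorted(pid for pid, reads in counts.items() if reads >= min_reads)


def enriched_by_pvalue(pvalues, max_pvalue=0.01):
    """Differential Expression: proteins with a p-value below `max_pvalue`."""
    return sorted(pid for pid, p in pvalues.items() if p < max_pvalue)


# Illustrative inputs: protein ID -> read count, protein ID -> p-value.
counts_example = {"prot_A": 5, "prot_B": 2, "prot_C": 1}
pvalues_example = {"prot_A": 0.0004, "prot_B": 0.01, "prot_C": 0.2}

print(enriched_by_counts(counts_example))
print(enriched_by_pvalue(pvalues_example))
```

Note the different boundary handling: a protein with exactly 2 reads counts as enriched, while a p-value of exactly 0.01 does not, since the p-value must be smaller than the cut-off.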