PSI-Blast on the web

Help on our implementation of PSI-Blast

Installed PSI-Blast

The currently installed version of PSI-Blast is BLASTP 2.2.15 [Oct-15-2006].
The reference for this tool is:

Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

The official Psi-Blast man pages are available on this link.


Help for the Bioinformatica Blast

Explanations for the form options

  1. Search Title
    • Here you can select the kind of databases you want to be listed: all of them or a specific subset
  2. DB name
    • Select the database you want to run the search against. Only one choice is possible.
  3. Matrix
      • Use this option to set which comparison matrix should be used when searching the database. The default matrix for a protein blast is blosum62. You may choose from a complete list of matrices which should cover various evolutionary constraints.
        Default Matrices:
      1. blastp: blosum62
      2. blastx: blosum62
      3. blastn: DNA Identity Matrix
  4. Expected Threshold
      • The expected threshold sets the statistical significance threshold for reporting database sequence matches. The default value is 10, meaning that 10 matches are expected to be found merely by chance. Lower thresholds are more stringent, so fewer chance matches are reported. Raising the threshold admits less stringent matches and is recommended for short query sequences: a short query is more likely to occur by chance in the database than a longer one, so even a perfect match (no gaps) can have low statistical significance and may not be reported. Raising the threshold therefore lets you look farther down the hit list and see matches that would normally be discarded because of low statistical significance. A value of up to 1000 is generally enough to see results.
  5. Drop off
      • This is the amount a score must drop before extension of word hits is halted.
  6. Final Dropoff
      • This parameter controls the dropoff for the final reported alignment.
  7. Expected Threshold Multi
      • e-value threshold for inclusion in multipass model.
  8. Iterations
      • Maximum number of passes to use.
  9. Opengap & Extendedgap
      • The gap open penalty is the score deducted when a gap is opened in the sequence or in the structure. To make a match more significant, you can try increasing the gap open penalty: this reduces the number of gaps, and a good alignment without many gaps will get a higher Z-score. The gap extension penalty is added to the gap open penalty for each base or residue in the gap; this is how long gaps are penalised. If you want to discourage long gaps, increase the gap extension penalty. Usually you expect a few long gaps rather than many short ones, so the gap extension penalty should be lower than the gap open penalty. An exception is when one or both sequences are single reads with possible sequencing errors, in which case you would expect many single-base gaps. You can get this result by setting the gap open penalty to zero (or very low) and using the gap extension penalty to control gap scoring.
      • Note: If the open gap value you want is not shown in the menu, change the extend gap value, as a different range of open gap values will then be displayed. Be aware that changing the extend gap value may discard your selected open gap value, since a new range of open gap values is likely to be displayed.
  10. Filter
      • The filter option, if set to true, masks segments of the query sequence that are non-specific for sequence similarity searches. Filtering can eliminate statistically significant but biologically uninteresting reports from the output, for example hits against common acidic-, basic- or proline-rich regions, leaving the more biologically interesting regions of the query sequence available for specific matching against database sequences. Filtering is only applied to the query sequence, not to database sequences. For nucleotide query sequences, the program used is DUST, written by Tatusov, R. L., and Lipman, D.J. For protein query sequences, the SEG program, written by Wootton, J.C., and Federhen, S., is used to filter low complexity regions.
  11. Scores
      • Select any number available in the menu to set the maximum number of scores reported in the output file. This is the -v option of the Blast command line.
  12. Alignments
      • Select the number of alignments you want to see displayed in the output file. This is the -b option of the Blast command line.
  13. Align Views
      • pairwise
        Aligns your query sequence and database matches in pairs. Matches are connected with a "|" symbol. Mismatches are shown with a space. Gaps are introduced with a "-" symbol.
      • M/S with identities
        The database alignments are anchored to (shown in relation to) your query sequence.
        Identities are displayed as dots (.).
        Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
        Gaps are introduced with a "-" symbol.
      • M/S without identities
        The database alignments are anchored to (shown in relation to) your query sequence.
        Identities are shown as single letter nucleotide abbreviations.
        Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
        Gaps are introduced with a "-" symbol.
      • Flat Query-anchored with identities
        The 'flat' display shows inserts as deletions on the query.
        Identities are displayed as dots.
        Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
        Gaps are introduced with a "-" symbol.
      • Flat Query-anchored without identities
        The 'flat' display shows inserts as deletions on the query.
        Identities are displayed as single letter nucleotide abbreviations (c,t,a or g).
        Mismatches are displayed as single letter nucleotide abbreviations (c,t,a or g).
        Gaps are introduced with a "-" symbol.
  14. Sequence
      • You can cut and paste or type a nucleotide or protein sequence into the text window. The only accepted format is FASTA. This format consists of a one-line header followed by lines of sequence data. The header line starts with a ">" symbol followed by the name of the sequence; the rest of the line is an optional description of the sequence. The sequence itself is written on the remaining lines. Blank lines and spaces are ignored. You can input more than one sequence.
      • All sequences must follow the IUB/IUPAC standard codes.
      • All sequences will be checked before running Blast, and any errors will be pointed out. Please correct them before resubmitting the sequence.
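
The FASTA rules described for the Sequence field can be sketched in a few lines of Python. This is only an illustration of the format, not the checker this site actually runs: the function names parse_fasta and find_invalid are hypothetical, and the set of accepted amino acid letters is an assumption based on the IUB/IUPAC one-letter codes.

```python
# Assumed alphabet: IUB/IUPAC one-letter amino acid codes plus '*' and '-'.
# The exact set accepted by the site's own checker is not documented here.
PROTEIN_CODES = set("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def parse_fasta(text):
    """Split pasted text into (name, description, sequence) tuples."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith(">"):
            # Header: first word is the name, the rest is an optional description.
            parts = line[1:].strip().split(None, 1)
            name = parts[0] if parts else ""
            desc = parts[1] if len(parts) > 1 else ""
            records.append([name, desc, []])
        else:
            if not records:
                raise ValueError("sequence data found before any '>' header line")
            # Spaces inside sequence lines are ignored.
            records[-1][2].append("".join(line.split()))
    return [(n, d, "".join(chunks).upper()) for n, d, chunks in records]


def find_invalid(sequence, alphabet=PROTEIN_CODES):
    """Return (1-based position, character) for every letter outside the alphabet."""
    return [(i + 1, ch) for i, ch in enumerate(sequence) if ch not in alphabet]


if __name__ == "__main__":
    pasted = ">seq1 test protein\nMKT AYI\n\nAKQR\n>seq2\nMKVB1\n"
    for name, desc, seq in parse_fasta(pasted):
        print(name, repr(desc), seq, find_invalid(seq))
```

Running a check like this locally before pasting a sequence can catch stray digits or punctuation that would otherwise be reported as errors after submission.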

    Then run PSI-Blast. The next page shows a small alert window that provides you with a JobId. If you know the job will take a long time to execute, you can save this JobId (copy it), click the exit button, and come back later to view the PSI-Blast Results on the Web.
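
For readers who also use the command line, the sketch below shows how the form options might translate into a call to the legacy blastpgp program. Only the -v and -b flags are confirmed by this page; the other flags are assumptions taken from the legacy NCBI blastpgp option list, and the field names, file name and database name are hypothetical. How this site actually launches jobs is not documented here.

```python
# Assumed mapping from form fields (hypothetical keys) to legacy blastpgp flags.
# Only -v and -b are stated on this page; the rest are assumptions.
FLAG_FOR_OPTION = {
    "db_name": "-d",                    # database to search
    "matrix": "-M",                     # comparison matrix
    "expected_threshold": "-e",         # expectation value
    "expected_threshold_multi": "-h",   # e-value for inclusion in multipass model
    "iterations": "-j",                 # maximum number of passes
    "opengap": "-G",                    # cost to open a gap
    "extendedgap": "-E",                # cost to extend a gap
    "scores": "-v",                     # number of reported scores
    "alignments": "-b",                 # number of displayed alignments
    "align_view": "-m",                 # alignment view number
}


def build_command(query_file, options):
    """Return the argument list for a blastpgp run with the chosen options."""
    cmd = ["blastpgp", "-i", query_file]
    for key, flag in FLAG_FOR_OPTION.items():
        if key in options:
            cmd += [flag, str(options[key])]
    if "filter" in options:
        # The filter is passed as T (true) or F (false).
        cmd += ["-F", "T" if options["filter"] else "F"]
    return cmd


if __name__ == "__main__":
    example = build_command(
        "query.fasta",
        {"db_name": "swissprot", "matrix": "BLOSUM62", "expected_threshold": 10,
         "iterations": 3, "scores": 500, "alignments": 250, "filter": True},
    )
    print(" ".join(example))
```

Options left out of the dictionary are simply not passed, so the program's own defaults apply.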

