BLAST Server

This page provides BLAST search access to all the Dictyostelium discoideum sequence information (shotgun reads, contigs, ESTs, GenBank submissions, gene model translations) gained so far by the scientific community. Users should bear in mind that shotgun reads and unfinished contigs are preliminary data and may contain errors and contaminating sequences from other species, e.g. E. coli.


Since assembly and annotation of contigs is a difficult task, users are advised to follow the restrictions for the use of contig sequence data.






Descriptions

View original BLAST documentation at NCBI.


Use of Contig Sequence Data

The results of the contig annotations are confidential. If you wish to use any of the contig annotation results in a publication, please contact us beforehand. Please bear in mind that the current contig annotations are still preliminary. In addition, contig sequences represent unfinished sequence data that might contain errors (misinterpreted bases or indel events), though most of the ambiguities have been identified and resolved in the assembly process.


Query Sequence

Enter your sequence as plain text or in FASTA format. For the BLAST methods BLASTN, BLASTX, and TBLASTX, the query sequence should be nucleotide; for the BLAST methods BLASTP and TBLASTN, you should enter a protein sequence.
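
As a rough illustration of the two accepted input forms, the following Python sketch reads a query given as plain text or as a single FASTA record and guesses whether it is nucleotide or protein. The function names and the 0.9 cutoff are hypothetical choices made for this example; the server's actual detection rule is not documented here.

```python
def parse_query(text):
    """Return (header, sequence) from plain text or a single FASTA record.

    header is None when the input has no '>' line.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = None
    if lines and lines[0].startswith(">"):
        header = lines[0][1:].strip()
        lines = lines[1:]
    # Join the sequence lines, drop embedded blanks, normalise to upper case.
    sequence = "".join(lines).replace(" ", "").upper()
    return header, sequence


def guess_sequence_type(sequence, threshold=0.9):
    """Guess 'nucleotide' or 'protein' from the fraction of A/C/G/T/U/N.

    The 0.9 cutoff is an illustrative assumption, not the server's rule.
    """
    if not sequence:
        raise ValueError("empty query sequence")
    nt_chars = sum(sequence.count(c) for c in "ACGTUN")
    return "nucleotide" if nt_chars / len(sequence) >= threshold else "protein"
```

A short protein-like string such as "MKVLAAGIVG" falls well below the cutoff and is classified as protein, while a string made only of A, C, G, and T is classified as nucleotide.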


Mask nt Query Sequence

You may choose to mask the query sequence at repetitive and low-complexity regions. Currently, this is done with RepeatMasker (Smit, A.F.A. & Green, P., http://www.repeatmasker.org).


BLAST method

This interface uses the BLAST program palette developed at WashU.

- auto -     automatically chooses the right program according to the detected query sequence type and sequence type of the selected BLAST database. Only TBLASTX has to be set manually.
BLASTN     compares a nucleotide query sequence against the entries in a nucleotide sequence database
BLASTX     compares the all-frame translations of a nucleotide query sequence against the entries in a protein sequence database
TBLASTX     compares the all-frame translations of a nucleotide query sequence against the all-frame translations of the entries in a nucleotide sequence database
TBLASTN     compares a protein query sequence against the all-frame translations of the entries in a nucleotide sequence database
BLASTP     compares an amino acid query sequence against the entries in a protein sequence database
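
The program list above amounts to a lookup on the query type and the database type. The Python sketch below shows one way the '- auto -' choice could be expressed; the function name and the translate_both flag are hypothetical, and the server's real selection code may differ. In line with the description above, TBLASTX is never picked automatically and must be requested explicitly.

```python
# (query type, database type) -> program, following the descriptions above.
AUTO_PROGRAM = {
    ("nucleotide", "nucleotide"): "BLASTN",
    ("nucleotide", "protein"): "BLASTX",
    ("protein", "nucleotide"): "TBLASTN",
    ("protein", "protein"): "BLASTP",
}


def choose_program(query_type, db_type, translate_both=False):
    """Pick a BLAST program for the given query and database types.

    translate_both=True stands for a manual TBLASTX request, which is only
    valid for a nucleotide query against a nucleotide database.
    """
    if translate_both:
        if (query_type, db_type) != ("nucleotide", "nucleotide"):
            raise ValueError("TBLASTX needs a nucleotide query and database")
        return "TBLASTX"
    try:
        return AUTO_PROGRAM[(query_type, db_type)]
    except KeyError:
        raise ValueError(f"unknown type combination: {query_type}, {db_type}")
```

For example, choose_program("protein", "nucleotide") selects TBLASTN, and choose_program("nucleotide", "nucleotide", translate_both=True) selects TBLASTX.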


Databases

The total Dictyostelium discoideum nucleotide BLAST database has been built up from several sequence sources which are:

Baylor     The Dictyostelium discoideum project at Baylor College of Medicine (Houston, Texas) produced shotgun reads for chromosome 6 (sequence identifier scheme 'IIA...') and chromosome 4/5 (sequence identifier scheme 'IIC...'). Additionally, there are shotgun reads derived from 15 YACs mapped to chromosome 6 with a reliability of ~40 % (sequence identifier scheme 'IIB...').
cDNA Project Japan     In the course of the Japanese cDNA project, more than 50,000 cDNA clones have been sequenced since 1996.
GenBank     Dictyostelium discoideum nucleotide sequences deposited in the GenBank / EMBL / DDBJ consortium databases. Sequence identifiers have the form 'gi|...'. They total ~1200 sequence entries, not counting submissions from the cDNA project in Japan (see above).
GSCJ     Sequence reads that originate from the German part of the Dictyostelium discoideum Genome Project, here at the Dept. Genome Analysis, IMB Jena.
Moreover, we provide access to preliminary contig consensus sequences (cf. disclaimer) and derived gene models via this gateway. Look elsewhere for more detailed project information, a description of the sequence libraries, and sequence download offers (including sequence statistics).
Sanger Centre     Sequences from the Dictyostelium discoideum genome project at the Sanger Centre (Hinxton Hall, UK). Some test reads from a chromosome 2 library and a chromosome 6 library are available. Sequence identifiers have the form 'sdic...'. The Sanger Centre has also produced a whole genome assembly (WGA) from all the available trace data generated by the international consortium. Sequence identifiers here have the form 'Contig_xxxx'.
mtDNA     Complete sequence provided by A. Kuspa (Baylor College of Medicine, Houston, Texas).
rDNA     Preliminary sequence fragments provided by A. Kuspa (Baylor College of Medicine, Houston, Texas).
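
The identifier schemes mentioned above can help to tell where a hit in a BLAST report comes from. The following Python sketch maps the listed prefixes to their sources. It assumes that prefixes are matched case-sensitively at the start of the identifier, and it returns None for sources whose identifier scheme is not given above (cDNA Project Japan, GSCJ, mtDNA, rDNA). The function name and the example identifiers are hypothetical.

```python
# Prefixes as listed in the database descriptions above. Matching them
# case-sensitively at the start of the identifier is an assumption.
PREFIX_TO_SOURCE = [
    ("IIA", "Baylor shotgun read, chromosome 6"),
    ("IIB", "Baylor shotgun read, YAC-derived (chromosome 6)"),
    ("IIC", "Baylor shotgun read, chromosome 4/5"),
    ("gi|", "GenBank / EMBL / DDBJ entry"),
    ("sdic", "Sanger Centre read"),
    ("Contig_", "Sanger Centre whole genome assembly contig"),
]


def classify_identifier(seq_id):
    """Return the source for a sequence identifier, or None if unrecognised."""
    for prefix, source in PREFIX_TO_SOURCE:
        if seq_id.startswith(prefix):
            return source
    return None


# Hypothetical identifiers, for illustration only.
for example in ("IIA0001", "gi|12345", "Contig_0042", "xyz"):
    print(example, "->", classify_identifier(example))
```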


Mask Low-Complexity Regions

When this switch is set, the program masks the query sequence for low-complexity regions. The masking is done with the program 'RepeatMasker'.


Expect Threshold

The statistical significance threshold for reporting matches against database sequences. The expect value given for each HSP in the BLAST report is the number of matches with an equal or better score that are expected to be found merely by chance, according to the stochastic model of Karlin and Altschul (1990). If the expect value ascribed to a match is higher than the expect threshold, the match will not be reported. Lower expect thresholds are more stringent, leading to fewer chance matches being reported. Fractional values are acceptable.
Though you may increase the number of matching database entries by raising the expect threshold, the total number of entries reported is restricted to 120 (the best matching entries are displayed). Besides that, the threshold value is restricted to a maximum of 5.0e-4. A higher value won't add any meaningful hit to your search result (believe it).
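
The interplay of the expect threshold, the 5.0e-4 ceiling, and the 120-entry limit can be sketched in Python as follows. This is a simplified illustration, not the server's code: it assumes one expect value per database entry (the real report gives one per HSP), and it assumes that a requested threshold above the ceiling is silently lowered to 5.0e-4 rather than rejected. The names report_hits, MAX_EXPECT, and MAX_REPORTED are hypothetical.

```python
MAX_EXPECT = 5.0e-4   # upper bound for the threshold, as stated above
MAX_REPORTED = 120    # maximum number of entries reported, as stated above


def report_hits(hits, expect_threshold):
    """Return the hits that would be reported, best (lowest expect) first.

    hits is an iterable of (entry_id, expect_value) pairs. Clamping the
    threshold to MAX_EXPECT is an assumption about how the limit is applied.
    """
    threshold = min(expect_threshold, MAX_EXPECT)
    # A match whose expect value exceeds the threshold is not reported.
    kept = [hit for hit in hits if hit[1] <= threshold]
    kept.sort(key=lambda hit: hit[1])
    return kept[:MAX_REPORTED]
```

With hits [("a", 1e-10), ("b", 1e-3), ("c", 1e-5)] and a requested threshold of 1.0, only "a" and "c" are returned, because the effective threshold is 5.0e-4.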


Literature

Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers, and David J. Lipman (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-410.

Karlin, Samuel and Stephen F. Altschul (1990). Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. USA 87:2264-68.