High-Performance Computing at the NIH

Scientific Databases
Sorted by database name

Columns: Database, Type, Available formats, Location on the Helix Systems, Last updated

1000 Genomes
Release 20100804, containing analysis result sets (VCF files) and README files.
Nuc vcf files /fdb/1000genomes/

01 Apr 2013

(Updated occasionally
Source:NCBI.)

Alignment BAM /fdb/1000genomes/ftp/data/
20 May 2013

(Updated occasionally
Source:NCBI)
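
The VCF files listed above are tab-delimited text, so their records can be read without special tooling. The following is a minimal Python sketch, not a description of any Helix-provided tool; the function names `open_vcf` and `read_variants` are hypothetical, and the file names under /fdb/1000genomes/ are not given in this listing, so any path passed in is an assumption.

```python
import gzip
from typing import Iterator, List, NamedTuple, TextIO


class Variant(NamedTuple):
    chrom: str
    pos: int
    vid: str
    ref: str
    alts: List[str]


def open_vcf(path: str) -> TextIO:
    """Open a plain or gzip-compressed VCF file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_variants(handle: TextIO) -> Iterator[Variant]:
    """Yield one Variant per data line, skipping '#' header lines."""
    for line in handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete data line
        chrom, pos, vid, ref, alt = fields[:5]
        # The ALT column may hold several comma-separated alleles.
        yield Variant(chrom, int(pos), vid, ref, alt.split(","))
```

Reading line by line keeps memory use flat, which matters for files of this size. A caller would wrap `open_vcf(path)` in a `with` statement and iterate over `read_variants(handle)`.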


Cambridge Structural Database
Crystal structure information for over 165,000 organic and organometallic compounds. More info at CCDC.
3-D CSD /local/csd
Quest
16 Jan 2013

(Updated every 3 months
Source:CCDC)


Chicken Genome
May 2006 assembly from WUSTL.
Nuc MySQL NIH mirror of UCSC Genome Browser

Also available for direct MySQL queries from the Biowulf cluster nodes.
05 Apr 2013

(Updated weekly
Source:UCSC .)

Prot MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
05 Apr 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
05 Apr 2013

(Updated weekly
Source:UCSC )
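
For the genome entries marked "available for direct MySQL queries", a query from a cluster node could be scripted roughly as below. This is a sketch under stated assumptions: it shells out to the standard `mysql` command-line client, and the host name, user name, database name, and table name shown in the comments are placeholders, since this listing does not give them.

```python
import subprocess
from typing import List


def build_mysql_command(host: str, user: str, database: str, query: str) -> List[str]:
    """Build an argument list for the mysql command-line client."""
    return [
        "mysql",
        f"--host={host}",
        f"--user={user}",
        "--batch",              # tab-separated output
        "--skip-column-names",  # data rows only
        f"--execute={query}",
        database,
    ]


def run_query(host: str, user: str, database: str, query: str) -> List[List[str]]:
    """Run the query and return rows as lists of strings."""
    cmd = build_mysql_command(host, user, database, query)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]


# Hypothetical usage; every value here is an assumption, not a documented setting:
# rows = run_query("HYPOTHETICAL-MIRROR-HOST", "HYPOTHETICAL-USER",
#                  "HYPOTHETICAL_DB", "SELECT * FROM some_table LIMIT 5")
```

Passing the arguments as a list avoids shell quoting problems when the query contains spaces or quotes.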


Cow Genome
Aug 2006 assembly from the Baylor Sequencing Center
Nuc MySQL NIH mirror of UCSC genome browser

Also available for direct MySQL queries from the Biowulf cluster nodes.
31 Jan 2012

(Updated weekly
Source:UCSC .)

Prot MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 Jan 2012

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 Jan 2012

(Updated weekly
Source:UCSC )


Dog Genome
May 2005 assembly from the Broad Institute
Nuc MySQL NIH mirror of UCSC genome browser

Also available for direct MySQL queries from the Biowulf cluster nodes.
25 Jan 2013

(Updated weekly
Source:UCSC .)

Prot MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
25 Jan 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
25 Jan 2013

(Updated weekly
Source:UCSC )


Drosophila
Drosophila sequences
Nuc Blast /fdb/blastdb/drosoph.nt

See: Blast (Helix)
Blast (Biowulf)
26 Sep 2011

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/drosoph.nt.fas
See: Fasta, BLAT.
04 Sep 2012

(Updated weekly
Source:NCBI )

Prot Blast /fdb/blastdb/drosoph.aa
See: Blast (Helix)
Blast (Biowulf)
26 Sep 2011

(Updated weekly
Source:NCBI )

Fasta /fdb/fastadb/drosoph.aa.fas
See: Fasta, BLAT.
04 Sep 2012

(Updated weekly
Source:NCBI )
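
The Blast paths in this listing can be passed to a BLAST search as the database argument. The sketch below assumes the NCBI BLAST+ command-line programs (`blastn`, `blastp`) are on the PATH; the Blast (Helix) and Blast (Biowulf) pages may describe a different invocation. The helper names and the query file name are hypothetical.

```python
import subprocess
from typing import List, Optional


def build_blast_command(program: str, db: str, query: str,
                        evalue: float = 1e-5,
                        out: Optional[str] = None) -> List[str]:
    """Build a BLAST+ argument list that requests tabular output."""
    cmd = [
        program,
        "-db", db,
        "-query", query,
        "-evalue", str(evalue),
        "-outfmt", "6",  # tabular, one hit per line
    ]
    if out is not None:
        cmd += ["-out", out]
    return cmd


def run_blast(cmd: List[str]) -> str:
    """Run the search and return whatever BLAST wrote to standard output."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical usage; "query.fa" is a placeholder file name:
# hits = run_blast(build_blast_command("blastn", "/fdb/blastdb/drosoph.nt", "query.fa"))
```

The same builder works for the protein database by switching to `blastp` and `/fdb/blastdb/drosoph.aa`.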


Drosophila genome
April 2006 assembly
Annotations MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
10 May 2013

(Updated weekly
Source:UCSC)


EST
EST division of Genbank
Nuc EMBOSS /fdb/embossdb/est.new

See: EMBOSS web interface
EMBOSS command-line
18 Apr 2013

(Updated bimonthly after Genbank release
Source:NCBI .)


EST - human
Human sequences from the EST division of Genbank
Nuc Blast /fdb/blastdb/est_human

See: Blast (Helix)
Blast (Biowulf)
23 May 2012

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/est_human.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )
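
The .fas files in this listing are assumed here to be ordinary FASTA text: a ">" header line followed by one or more sequence lines. A streaming reader such as the sketch below avoids loading a large file like est_human.fas into memory; `read_fasta` is a hypothetical helper, not part of any tool named on this page.

```python
from typing import Iterator, List, Optional, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in handle:
        line = line.rstrip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record, if any.
    if header is not None:
        yield header, "".join(chunks)
```

A caller would open the file with `open(path)` and loop over `read_fasta(handle)`, for example to count records or filter by header text.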


EST - mouse
Mouse sequences from the EST division of Genbank.
Nuc Blast /fdb/blastdb/est_mouse

See: Blast (Helix)
Blast (Biowulf)
23 May 2012

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/est_mouse.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )


EST - others
Non-human, non-mouse sequences from the EST division of Genbank
Nuc Blast /fdb/blastdb/est_others

See: Blast (Helix)
Blast (Biowulf)
23 May 2012

(Updated weekly
Source:NCBI .)


Gb_New
All sequences added to Genbank since last major release
Nuc EMBOSS /fdb/embossdb/gbnew.new

See: EMBOSS web interface
EMBOSS command-line
16 Jun 2013

(Updated daily
Source:NCBI .)


Genbank
The NIH Genetic Sequence Database, an annotated collection of all publicly available DNA sequences. More information at NCBI.
Nuc EMBOSS /fdb/embossdb/genbank.new

See: EMBOSS web interface
EMBOSS command-line
17 Apr 2013

(Updated bimonthly after Genbank release
Source:NCBI .)


GenPept
GenPept is produced by parsing the corresponding GenBank release for translated coding regions of GenBank sequences. More information at NCI, Frederick.
Prot EMBOSS /fdb/embossdb/genpept.new
See: EMBOSS web interface
EMBOSS command-line
12 Dec 2012

(Updated bimonthly after Genbank release
Source:NCIFCRF)


GP_New
All sequences added to GenPept since last major release
Prot EMBOSS /fdb/embossdb/gpnew.new
See: EMBOSS web interface
EMBOSS command-line
02 Oct 2012

(Updated daily
Source:NCIFCRF )


HTGs
High throughput genome sequences
Nuc Blast /fdb/blastdb/htgs

See: Blast (Helix)
Blast (Biowulf)
21 Apr 2013

(Updated weekly
Source:NCBI .)


Human Genome hg18
Build 36, hg18 (Apr 2006) from the International Human Genome Consortium
Nuc Blast /fdb/genome/human-apr2006/hs_genome

See: Blast (Helix)
Blast (Biowulf)
20 May 2011

(Updated after new build release
Source:UCSC .)

Fasta /fdb/genome/human-apr2006/
See: Fasta, BLAT.
20 May 2011

(Updated after new build release
Source:UCSC)

MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 May 2013

(Updated weekly
Source:UCSC )

Prot MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 May 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 May 2013

(Updated weekly
Source:UCSC)


Human Genome hg19
Build 37, hg19 (Feb 2009) from the International Human Genome Consortium
Nuc Blast /fdb/blastdb/hs_genome

See: Blast (Helix)
Blast (Biowulf)
02 May 2013

(Updated after new build release
Source:UCSC .)

Fasta /fdb/genome/human-feb2009/
See: Fasta, BLAT.
02 May 2013

(Updated after new build release
Source:UCSC)

MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 May 2013

(Updated weekly
Source:UCSC )

Prot MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 May 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
31 May 2013

(Updated weekly
Source:UCSC)


Human Genome Proteins hg18
Build 36, hg18 (Apr 2006) from the International Human Genome Consortium
Prot Blast /fdb/genome/human-apr2006/hs_genome.protein
See: Blast (Helix)
Blast (Biowulf)
28 Apr 2006

(Updated after build release
Source:NCBI )


Human Genome Proteins hg19
Build 37, hg19 (Feb 2009) from the International Human Genome Consortium
Prot Fasta /fdb/fastadb/hs_genome.protein.fas
See: Fasta, BLAT.
12 Apr 2010

(Updated after build release
Source:NCBI )

Blast /fdb/blastdb/hs_genome.protein
See: Blast (Helix)
Blast (Biowulf)
05 Nov 2012

(Updated after build release
Source:NCBI )


Human Genome RNA hg18
Build 36, hg18 (Apr 2006) from the International Human Genome Consortium
Nuc Blast /fdb/genome/human-apr2006/hs_genome.rna

See: Blast (Helix)
Blast (Biowulf)
28 Apr 2006

(Updated after build release
Source:NCBI .)

Fasta /fdb/genome/human-apr2006/hs_genome.rna.fas
See: Fasta, BLAT.
28 Apr 2006

(Updated after build release
Source:NCBI )


Human Genome RNA hg19
Build 37, hg19 (Feb 2009) from the International Human Genome Consortium
Nuc Fasta /fdb/fastadb/hs_genome.rna.fas

See: Fasta, BLAT.
12 Apr 2010

(Updated after build release
Source:NCBI .)

Blast /fdb/blastdb/hs_genome.rna
See: Blast (Helix)
Blast (Biowulf)
05 Nov 2012

(Updated after build release
Source:NCBI )


Mito
Mitochondrial sequences
Nuc Blast /fdb/blastdb/mito.nt

See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/mito.nt.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )

Prot Blast /fdb/blastdb/mito.aa
See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI )

Fasta /fdb/fastadb/mito.aa.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )


Mouse Genome mm8
Build 36, mm8, Mar 2006 from the Mouse Genome Consortium
Nuc Blast /fdb/genome/mouse-mar2006/mouse_genome

See: Blast (Helix)
Blast (Biowulf)
09 Nov 2006

(Updated after new build release
Source:UCSC .)

Fasta /fdb/genome/mouse-mar2006/
See: Fasta, BLAT.
08 Jul 2010

(Updated after new build release
Source:UCSC )

MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
24 May 2013

(Updated weekly
Source:UCSC )

Prot MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
24 May 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
24 May 2013

(Updated weekly
Source:UCSC )


Mouse Genome mm9
Build 37, mm9, Jul 2007 from the Mouse Genome Consortium
Nuc Blast /fdb/blastdb/mouse_genome

See: Blast (Helix)
Blast (Biowulf)
25 Mar 2008

(Updated after new build release
Source:UCSC .)

Fasta /fdb/genome/mouse-jul2007/
See: Fasta, BLAT.
06 Apr 2011

(Updated after new build release
Source:UCSC )

MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
24 May 2013

(Updated weekly
Source:UCSC )

Prot MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
24 May 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC Genome Browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
24 May 2013

(Updated weekly
Source:UCSC )


Mouse Genome Proteins mm8
Build 36, mm8, Mar 2006 from the Mouse Genome Consortium
Prot Blast /fdb/genome/mouse-mar2006/mouse_genome.protein
See: Blast (Helix)
Blast (Biowulf)
09 Nov 2006

(Updated weekly
Source:NCBI )

Fasta /fdb/genome/mouse-mar2006/mouse_genome.protein.fas
See: Fasta, BLAT.
09 Nov 2006

(Updated weekly
Source:NCBI )


Mouse Genome Proteins mm9
Build 37, mm9, Jul 2007 from the Mouse Genome Consortium
Prot Blast /fdb/genome/mouse-mar2006/mouse_genome.protein
See: Blast (Helix)
Blast (Biowulf)
09 Nov 2006

(Updated weekly
Source:NCBI )

Fasta /fdb/fastadb/mouse_genome.protein.fas
See: Fasta, BLAT.
25 Mar 2008

(Updated weekly
Source:NCBI )


Mouse Genome RNA mm8
Build 36, mm8, Mar 2006 from the Mouse Genome Consortium
Nuc Blast /fdb/genome/mouse-mar2006/mouse_genome.rna

See: Blast (Helix)
Blast (Biowulf)
09 Nov 2006

(Updated after release
Source:NCBI .)


Mouse Genome RNA mm9
Build 37, mm9, Jul 2007 from the Mouse Genome Consortium
Nuc Blast /fdb/blastdb/mouse_genome.rna

See: Blast (Helix)
Blast (Biowulf)
22 Oct 2012

(Updated after release
Source:NCBI .)

Fasta /fdb/fastadb/mouse_genome.rna.fas
See: Fasta, BLAT.
25 Mar 2008

(Updated after release
Source:NCBI )


MSDB
A nonredundant protein sequence database designed specifically for mass-spec applications.
Prot Mascot biospec.nih.gov
Mascot search engine
01 Jun 2010

(Updated weekly
Source:NCBI )


NCBI nr
NCBI's nonredundant Genbank CDS translations + PDB + SwissProt
Prot Blast /fdb/blastdb/nr
See: Blast (Helix)
Blast (Biowulf)
07 Jun 2013

(Updated weekly
Source:NCBI )

Fasta /fdb/fastadb/nr.aa.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )

Mascot biospec.nih.gov
Mascot search engine
16 Jun 2013

(Updated weekly
Source:NCBI )


NCBI nt
All GenBank+EMBL+DDBJ (but no EST, STS, GSS, HTG). No longer nonredundant.
Nuc Blast /fdb/blastdb/nt

See: Blast (Helix)
Blast (Biowulf)
08 Jun 2013

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/nt.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )


NIH-Specific
A collection of NIH-specific databases requested by NIH Mascot users.
Prot Mascot biospec.nih.gov
Mascot search engine
15 May 2013

(Updated as requested
Source:NIH)


PFAM
A collection of multiple sequence alignments and hidden Markov models. More information at the PFAM home page.
Families PFAM /fdb/fastadb/pfam
HMMER (Biowulf, Helix)
23 Mar 2009

(Updated every 3 months
Source:PFAM )


Prints
Protein fingerprints, groups of conserved motifs used to characterize a protein family.
Patterns EMBOSS used internally by Emboss
See: EMBOSS web interface
EMBOSS command-line
22 Apr 2013

(Updated after new Prints release
Source:EBI)


Prosite
A database/dictionary of protein sites and patterns. More information at Expasy.
Patterns EMBOSS used internally by Emboss
See: EMBOSS web interface
EMBOSS command-line
15 May 2013

(Updated every 2 months
Source:Expasy )


Protein Data Bank
An archive of experimentally determined three-dimensional structures of biological macromolecules. More information at the PDB.
Nuc Blast /fdb/blastdb/pdbnt

See: Blast (Helix)
Blast (Biowulf)
10 Jun 2013

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/pdb.nt.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )

Prot Blast /fdb/blastdb/pdbaa
See: Blast (Helix)
Blast (Biowulf)
10 Jun 2013

(Updated weekly
Source:NCBI )

Fasta /fdb/fastadb/pdb.aa.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )

3-D PDB /pdb/pdb
Molecules R Us or direct access to coordinate files.
NIH users can NFS-mount the PDB databases on their own machines -- contact staff@helix.nih.gov for more info.
17 Jun 2013

(Updated daily
Source:PDB )
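
For direct access to coordinate files, the atom records can be read with fixed-column slicing, since the legacy PDB text format places each field at set column positions. The sketch below assumes files in that legacy format; the layout of files under /pdb/pdb is not described in this listing, and `parse_atoms` is a hypothetical helper.

```python
from typing import Iterable, Iterator, NamedTuple


class Atom(NamedTuple):
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atoms(lines: Iterable[str]) -> Iterator[Atom]:
    """Yield an Atom for every ATOM or HETATM record."""
    for line in lines:
        if line[0:6] not in ("ATOM  ", "HETATM"):
            continue
        yield Atom(
            name=line[12:16].strip(),      # columns 13-16
            res_name=line[17:20].strip(),  # columns 18-20
            chain=line[21],                # column 22
            res_seq=int(line[22:26]),      # columns 23-26
            x=float(line[30:38]),          # columns 31-38
            y=float(line[38:46]),          # columns 39-46
            z=float(line[46:54]),          # columns 47-54
        )
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in this format can run together without a separating space.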


Rat Genome
May 2006 build, rn4, from the Rat Genome Sequencing Consortium
Nuc MySQL NIH mirror of UCSC genome browser

Also available for direct MySQL queries from the Biowulf cluster nodes.
10 May 2013

(Updated weekly
Source:UCSC .)

Prot MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
10 May 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
10 May 2013

(Updated weekly
Source:UCSC )


REBASE
Information on restriction enzymes, recognition sequences, cleavage sites, and more. More information at REBASE.
Enzymes EMBOSS used internally by Emboss
See: EMBOSS web interface
EMBOSS command-line
12 Jun 2013

(Updated every month
Source:REBASE )


Refseq Human Genomic
RefSeq human chromosome records (NC_######) with gap-adjusted, concatenated NT_ contigs
Nuc Blast /fdb/blastdb/human_genomic

See: Blast (Helix)
Blast (Biowulf)
12 Jun 2013

(Updated weekly
Source:NCBI.)

Fasta /fdb/fastadb/ref.human.genomic.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI)


Refseq Human Proteins
A comprehensive, integrated, non-redundant set of sequences. More info at NCBI
Prot Blast /fdb/blastdb/human.protein
See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI)

Fasta /fdb/fastadb/ref.human.protein.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI)


Refseq Human RNA
A comprehensive, integrated, non-redundant set of sequences. More info at NCBI
Nuc Blast /fdb/blastdb/human.rna

See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI.)

Fasta /fdb/fastadb/ref.human.rna.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI)


Refseq Mouse Proteins
A comprehensive, integrated, non-redundant set of sequences. More info at NCBI
Prot Blast /fdb/blastdb/mouse.protein
See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI)

Fasta /fdb/fastadb/ref.mouse.protein.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI)


Refseq Mouse RNA
A comprehensive, integrated, non-redundant set of sequences. More info at NCBI
Nuc Blast /fdb/blastdb/mouse.rna

See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI.)

Fasta /fdb/fastadb/ref.mouse.rna.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI)


Refseq Other Genomic
RefSeq chromosome records (NC_######) for organisms other than human
Nuc Blast /fdb/blastdb/other_genomic

See: Blast (Helix)
Blast (Biowulf)
28 Feb 2013

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/ref.other.genomic.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI)


Refseqaa
NCBI's comprehensive, integrated, non-redundant set of protein sequences for major research organisms.
Prot EMBOSS /fdb/embossdb/refseqaa.new
See: EMBOSS web interface
EMBOSS command-line
08 May 2013

(Updated weekly
Source:NCBI)


Refseqnt
NCBI's comprehensive, integrated, non-redundant set of sequences, including genomic DNA and transcripts (RNA), for major research organisms.
Nuc EMBOSS /fdb/embossdb/refseqnt.new

See: EMBOSS web interface
EMBOSS command-line
08 May 2013

(Updated weekly
Source:NCBI.)


Rhesus genome
Jan 2006 assembly from the Baylor Sequencing Center.
Nuc MySQL NIH mirror of UCSC genome browser

Also available for direct MySQL queries from the Biowulf cluster nodes.
25 Jan 2013

(Updated weekly
Source:UCSC .)

Prot MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
25 Jan 2013

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
25 Jan 2013

(Updated weekly
Source:UCSC )


Sp_Trembl
SwissProt + Trembl (a computer-annotated supplement of SwissProt)
Prot Mascot biospec.nih.gov
Mascot search engine
02 Jun 2013

(Updated weekly
Source:Expasy )


SwissProt
A highly annotated, curated protein sequence database with minimal redundancy and a high level of integration with other databases. More information at Expasy.
Prot Blast /fdb/blastdb/swissprot
See: Blast (Helix)
Blast (Biowulf)
17 Jun 2013

(Updated weekly
Source:NCBI)

Fasta /fdb/fastadb/swissprot.aa.fas
See: Fasta, BLAT.
11 Jun 2013

(Updated weekly
Source:NCBI )

Mascot biospec.nih.gov
Mascot search engine
02 Jun 2013

(Updated weekly
Source:Expasy )


UniProt
(SwissProt + Trembl) A highly annotated, curated protein sequence database with minimal redundancy and a high level of integration with other databases. More information at Expasy.
Prot EMBOSS /fdb/embossdb/uniprot
See: EMBOSS web interface
EMBOSS command-line
29 May 2013

(Updated weekly
Source:Uniprot)


Yeast
Yeast sequences
Nuc Blast /fdb/blastdb/yeast.nt

See: Blast (Helix)
Blast (Biowulf)
26 Sep 2011

(Updated weekly
Source:NCBI .)

Fasta /fdb/fastadb/yeast.nt.fas
See: Fasta, BLAT.
04 Sep 2012

(Updated weekly
Source:NCBI )

Prot Blast /fdb/blastdb/yeast.aa
See: Blast (Helix)
Blast (Biowulf)
26 Sep 2011

(Updated weekly
Source:NCBI )

Fasta /fdb/fastadb/yeast.aa.fas
See: Fasta, BLAT.
30 Jun 2011

(Updated weekly
Source:NCBI )


Zebrafish genome
Mar 2006 assembly from the Sanger Center.
Nuc MySQL NIH mirror of UCSC genome browser

Also available for direct MySQL queries from the Biowulf cluster nodes.
29 Nov 2011

(Updated weekly
Source:UCSC .)

Prot MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
29 Nov 2011

(Updated weekly
Source:UCSC )

Annotations MySQL NIH mirror of UCSC genome browser
Also available for direct MySQL queries from the Biowulf cluster nodes.
29 Nov 2011

(Updated weekly
Source:UCSC )