| GCG program |
EMBOSS program |
Description/Comments |
|
Assemble
|
merger
union |
Construct
new sequences from pieces of existing sequences; merger only accepts
2 sequences while assemble and union accept several. |
|
BackTranslate
|
backtranseq
backtranambig |
Backtranslate
protein -> nucleotide sequence. backtranambig
backtranslates to ambiguous codons. |
|
BestFit
|
water
matcher |
Bestfit
uses the Smith-Waterman algorithm to find the best local alignment
between 2 sequences. water also uses Smith-Waterman; matcher is based on Pearson's
lalign program. |
BLAST
Psiblast
|
dbiBlast
|
NCBI
homology search between query and database |
|
Breakup
|
splitter |
Splits
a sequence into (overlapping) smaller sequences |
|
Chopup |
- |
Helps
to convert a non-GCG sequence format.
Not needed in EMBOSS because it reads most sequence formats without
conversion. |
|
CodonFrequency
|
chips
compseq
cusp |
CodonFrequency
-- tabulates codon usage.
compseq -- counts the dimer/trimer composition of a sequence.
chips -- calculates codon usage statistics.
cusp -- creates a codon usage table. |
|
CodonPreference
|
syco
wobble |
Recognize
protein coding sequences |
| CoilScan |
pepcoil |
Predicts
coiled-coil regions |
| Compare
+ Dotplot
|
dottup
+ dotmatcher
dotpath |
2-sequence
comparison. dotpath does a non-overlapping wordmatch dotplot. |
|
Composition
|
compseq
pepstats |
Sequence
composition |
|
CompressText
|
-
|
Removes
extra whitespace in text files. Can be done via Unix shell script.
|
|
CompTable
|
-
|
Creates
a scoring matrix |
|
Consensus
|
prophecy
|
Creates
a consensus sequence
or matrices/profiles from multiple alignments |
|
Correspond
|
codcmp
|
Codon
usage table comparison |
|
Corrupt
|
msbar
|
Randomly
mutate sequence |
|
DataSet
|
dbiflat
dbiblast
dbigcg |
Creates
searchable sequence database. GCG's Dataset requires sequences in
GCG format, whereas dbiflat, dbiblast, dbigcg will take most formats
between them. |
|
Detab |
- |
Replaces
tabs with spaces in sequence files. Can be performed by Unix shell
command. |
|
Distances
|
-
|
Calculates
pairwise evolutionary distances between aligned sequences. The Phylip package
can do this. |
|
Diverge
|
-
|
Estimates
pairwise substitutions per site between 2 or more coding sequences.
The Phylip package can do this. |
|
DotPlot
|
dottup
dotmatcher
|
2-sequence comparison |
|
ExtractPeptide |
transeq
|
ExtractPeptide
takes the output of Map and can write one or more of the reading-frame
translations. transeq translates one or more of the frames or specific
regions directly from an input nucleotide sequence. |
FastA
FastX
Tfasta
TfastX |
- |
Pearson's
homology-search program, available as a standalone program.
|
| Fetch |
seqret
seqretsplit |
Pull
one or more sequences out of the databases. seqret/seqretsplit can
save output in various sequence formats. |
| Figure |
- |
Generates
plots from other GCG programs. The equivalent EMBOSS
programs usually generate plots (e.g. plotorf). |
| FindPatterns |
fuzznuc
fuzzpro |
Searches
for patterns in a sequence or database |
| FingerPrint |
- |
Finds
the products of T1 ribonuclease digestion. |
| FitConsensus |
- |
Use
after Consensus to find the best fits. |
| FrameAlign |
- |
Finds
best local alignment including frame shifts between a protein and
nucleotide sequence. |
| Frames |
plotorf
showorf |
Show
open reading frames. plotorf does this graphically |
| Framesearch |
- |
Homology
searches including frameshifts between protein and nucleotide sequences
|
FromEMBL
FromFasta
FromGenbank
FromIG
FromPIR
FromStaden
Fromtrace |
-
|
Converts
from various formats to GCG sequence format. Unnecessary in EMBOSS
because it can accept most sequence formats, but seqret can convert
between formats if desired. |
| Gap |
needle
stretcher |
Needleman-Wunsch
algorithm to compare 2 sequences. stretcher uses the Myers-Miller
algorithm, which is more memory-efficient. For sequences
larger than 10 kb, use stretcher, which is also a global
alignment program. If one of your sequences is genomic
and you are trying to align an EST sequence to it,
consider est2genome. In contrast,
water, matcher, and supermatcher are local
alignment programs for small, medium, and large sequences, respectively. |
| Gapshow |
plotcon |
Graphical
representation of similarity of 2 sequences. |
| GCGtoBlast |
- |
Makes
a Blast database. Use NCBI's 'formatdb' instead. |
GelAssemble
GelDisassemble
GelEnter
GelMerge
GelStart
GelView
|
megamerger
merger
union |
Parts
of GCG's gel assembly suite. |
| Getseq |
seqret |
Type
in a new sequence |
| Growtree |
- |
Creates
phylogenetic tree. Can use Phylip or Clustal instead. |
| HelicalWheel |
pepwheel |
Plots
peptide sequence as helical wheel to help recognize amphiphilic
regions. |
HmmerAlign
HmmerBuild
HmmerCalibrate
HmmerEmit
HmmerFetch
HmmerIndex
HmmerPfam
HmmerSearch |
- |
Sean
Eddy's HMMER package, available on Doublehelix and Biowulf.
|
| HTHScan |
helixturnhelix |
Finds
HTH motifs in protein sequences. |
| Isoelectric |
iep |
Calculates
the isoelectric point of a protein. |
| Lineup |
- |
Edits
multiple sequence alignments |
| Listfile |
- |
for
printing. Can use Unix pcprint command instead. |
| Lookup |
- |
Versatile
program for finding sequences in a database. whichdb in EMBOSS can
search for accession numbers, but GCG's Lookup is much more sophisticated.
Use NCBI
Entrez instead. |
Map
Mapplot
Mapsort |
restrict
remap
restover |
Finds
restriction enzyme cleavage sites.
GCG and EMBOSS may display different isoschizomers of the same enzyme,
but the results are equivalent. The EMBOSS remap program may not
display a few of the available isoschizomers.
|
| MeltTemp |
dan
|
Computes
melting temperature of oligos |
| MEME |
- |
Finds
conserved motifs in a group of unaligned sequences. Use
the standalone Meme/Mast on Doublehelix (short
interactive jobs)
or Biowulf
(long batch jobs) |
| MFold |
- |
Predicts
nucleotide secondary structure. GCG's version is an old version
of Zuker's MFOLD, use the standalone MFOLD instead. |
| Moment |
pepnet
octanol
hmoment |
Makes
a contour plot of the helical hydrophobic moment of a
peptide sequence.
hmoment prints the text output of the calculation. |
| Motifs |
patmatmotifs |
Finds
common Prosite motifs in a sequence. Use the '-full' option to
display abstract information when using EMBOSS
patmatmotifs. Note that both of these programs will only
find Prosite 'Patterns' (e.g. cAMP phosphorylation
site),
and not Prosite 'Matrices' (e.g. Helix-turn-Helix).
Use InterProScan
to find all known domains and functional
sites (http://www.ebi.ac.uk/InterProScan/). patmatmotifs can accept a file containing multiple sequences or patterns. |
| Meme
+ Motifsearch |
prophecy
+ profit |
Search
a sequence or database with a matrix or profile. |
| Names |
infoseq |
Provides
some information about sequence specifications. |
NetBlast
Netfetch |
- |
Remote
access to NCBI's Blast. Use standalone Blast
on Helix instead. |
| NoOverlap |
diffseq |
Finds
differences between 2 sequences. NoOverlap can work with a group
of sequences. |
| OldDistances |
- |
Makes
a table of the pairwise similarities within a group of sequences.
|
| onecase |
- |
Converts
a sequence to lower or upper case. Can be performed by a Unix shell
command. |
| Overlap |
- |
Compares
2 sets of sequences using Wilbur-Lipman algorithm. |
Paupdisplay
Paupsearch |
- |
PAUP
Phylogenetic Analysis. Use the standalone PAUP on
Doublehelix instead. |
| Pepdata |
getorf
sixpack |
Translates
in all 6 reading frames. sixpack displays the DNA sequence with
6-frame translations and orfs. |
| Pepplot |
pepinfo |
Pepplot
plots protein secondary structure and hydrophobicity. pepinfo plots
hydrophobicity, and garnier does protein secondary structure prediction.
|
| Peptidemap |
digest |
Enzyme/reagent
cleavage map of a protein. |
| Peptidesort |
digest pepstats |
GCG peptidesort sorts fragments from an
enzyme/reagent cleavage of one or more proteins according
to position, mol. wt., and HPLC retention. EMBOSS digest
only processes one reagent cleavage at a time. EMBOSS pepstats can be used to
determine the composition of the fragments afterwards. |
Peptidestructure
Plotstructure |
garnier
antigenic
pepwindow
pepwindowall |
Secondary
structure prediction. garnier does not include
Jameson-Wolf antigenic indexing. antigenic predicts
potentially antigenic regions of a protein sequence,
using the method of Kolaskar and Tongaonkar. pepwindow
displays Kyte-Doolittle protein hydropathy.
pepwindowall produces a set of superimposed
Kyte-Doolittle hydropathy plots from an aligned set of
protein sequences. |
| Pileup |
emma |
Multiple
sequence alignment. emma is an interface to ClustalW.
Can also use the standalone Clustal, or web ClustalW.
|
| PlasmidMap |
cirdna
lindna |
Plot
DNA constructs. |
| PlotFold |
- |
Plots
MFold output. Use the standalone MFOLD instead, which is more up-to-date and
makes output plots in postscript. |
| PlotSimilarity |
plotcon |
Graphical
representation of the similarity along a set of aligned sequences.
|
Pretty
prettybox |
cons
prettyplot
showalign |
Calculates
a consensus sequence from a multiple sequence alignment and displays
the alignment and consensus in a formatted layout. |
| Prime |
eprimer3 |
Selects
oligonucleotide primers. |
Profilegap
Profilemake |
prophecy
prophet
distmat |
Creates
matrices/profiles from multiple alignments. Gapped alignment for
profiles and sequences. |
| PrimePair |
primersearch? |
Evaluates
individual primers to determine their compatibility for use as PCR
primer pairs. |
| Profilescan |
patmatdb |
Searches
sequences or a database for protein motifs. Profilescan uses the Gribskov method.
|
| Profilesearch |
profit |
Scans
a sequence or database with a matrix or profile. |
| Profilesegments |
- |
Alignments
for results of Profilesearch |
| Publish |
seqret
showseq |
Makes
publication-quality displays of sequences. |
| Reformat |
seqret |
GCG
requires input sequences to be in GCG format, hence other formats
need to be converted with 'reformat'. EMBOSS programs accept most
sequence formats, so conversion is rarely required, but 'seqret' can be used to convert between formats
if desired. |
| Repeat |
equicktandem
etandem
einverted
palindrome |
Finds
tandem repeats in sequences. The equivalent group of EMBOSS programs
will also look for inverted or palindromic repeats. |
| Replace |
biosed
degapseq |
Replaces
characters in a text file. degapseq is specific to removing gap
characters. Can be performed with Unix shell utilities like sed,
awk or tr. |
| Reverse |
revseq |
Reverse/complement
a sequence. |
| Sample |
extractseq |
Extract
regions from a sequence. |
| Seg |
maskseq |
Masks
off low-complexity regions from a sequence. |
| Seqed |
biosed,
cutseq,
degapseq,
descseq,
entret,
extractfeat,
extractseq,
listor,
maskfeat,
maskseq,
newseq,
noreturn,
notseq,
nthseq,
pasteseq,
revseq,
seqret,
seqretsplit,
skipseq,
splitter,
trimest,
trimseq,
union,
vectorstrip,
yank
|
Sequence
editor. EMBOSS has several tools for specific editing tasks. Or use
a text editor (not a word processor!).
Try the Jemboss
alignment editor for editing multiple sequence
alignments.
Other alternatives are BioEdit (Windows only)
and Seaview (Mac, Windows, Unix).
Seaview is
available on the Helix Systems.
|
| SeqLab |
- |
X-windows
interface to GCG. |
| Setkeys |
- |
Redefines
keyboard keys, mainly used for GCG's gel assembly programs. |
| Shiftover |
- |
Moves
text by column. Use the nedit editor instead. |
| Shuffle |
shuffleseq |
Shuffles
a sequence. |
| Simplify |
- |
Reduce
the number of symbols in a sequence. |
| Spew |
- |
Sends
a sequence from a remote computer (e.g. Helix) to your desktop.
Use one of the file
transfer mechanisms instead. |
| SPScan |
sigcleave |
Predicts
signal peptides in protein sequences. |
| Ssearch |
- |
Part
of Pearson's Fasta package, available as a standalone program on
Helix. |
| StatPlot |
- |
Plotting
program. Rarely used. |
| StemLoop |
palindrome
etandem |
Finds
inverted repeats. |
| Stringsearch |
textsearch |
Finds
text phrases in sequence or database. Use NCBI's Entrez instead. |
| Terminator |
- |
Searches
for prokaryotic factor-independent RNA polymerase terminators according
to the method of Brendel and Trifonov. |
| Testcode |
wobble |
Plots
3rd-position variability as an indicator of potential coding regions.
|
ToFastA
ToIG
ToPIR
ToStaden
|
seqret |
EMBOSS
accepts most sequence formats, therefore format conversion is rarely
required. seqret can be used to convert between formats
if desired. |
| Translate |
transeq |
Translates
nucleotide -> protein sequences |
| Transmem |
- |
Predicts
transmembrane helices. |
| Window
+ Statplot |
freak |
Residue/base
frequency table or plot. |
| Wordsearch/Segments |
- |
Homology
search using Wilbur/Lipman algorithm. Segments displays the result.
|
| Xnu |
- |
Masks
tandem repeats for future Blast search. |
| - |
abiview |
Reads
ABI file and displays trace |
| - |
antigenic |
Finds
antigenic sites in proteins |
| - |
banana |
Bending
and curvature plot in B-DNA |
| - |
btwisted |
Calculates
the twisting in a B-DNA sequence |
| - |
cai |
CAI
codon adaptation index, to measure synonymous codon usage bias.
|
| - |
chaos |
Create
a chaos game representation plot for a sequence |
| - |
charge |
Protein
charge plot. |
| - |
checktrans |
Reports
STOP codons and ORF statistics of a protein |
| - |
coderet |
Extract
CDS, mRNA and translations from feature tables |
| - |
cpgplot
cpgreport
newcpgreport
newcpgseek |
Plots
and reports CpG-rich regions. |
| seqed |
cutseq |
Removes
a specified section from a sequence. seqed is interactive, cutseq
is command-line. |
| seqed |
descseq |
Alter
name/description of sequence. |
| Findpatterns |
dreg |
Regular
expression search of a sequence. Findpatterns is an approximate
equivalent. |
| - |
emma |
Interface
to the ClustalW program. |
| - |
emowse |
Protein
identification by Mass spectrometry. |
| - |
epestfind |
Finds PEST motifs as potential proteolytic cleavage
sites |
| - |
est2genome |
Align
EST and genomic DNA sequences. |
| - |
extractfeat |
Extract
features from a sequence. |
| - |
findkm |
Find
Km and Vmax for an enzyme reaction by a Hanes/Woolf plot |
| - |
fuzztran |
Protein
pattern search after translation |
| - |
geecee |
Calculates
the fractional GC content of nucleic acid sequences |
| - |
isochore |
Plots
isochores in large DNA sequences |
| - |
listor |
Writes
a list file of the logical OR of two sets of sequences |
| - |
makenucseq
makeprotseq |
Create random nucleotide and protein sequences |
| - |
marscan |
Finds
MAR/SAR sites in nucleic sequences |
| - |
maskfeat |
Mask
off features of a sequence. |
| - |
mwcontam |
Shows
molwts that match across a set of files |
| - |
mwfilter |
Filter
noisy molwts from mass spec output |
| - |
noreturn |
Removes
carriage returns from ASCII files. Can be performed by Unix utilities
like 'tr'. |
| Reformat |
nthseq |
Pulls
one sequence out of a multiple set. Reformat will pull a sequence
out of an MSF or RSF file. |
| - |
oddcomp |
Finds
protein sequence regions with a biased composition |
| - |
polydot |
Displays
all-against-all dotplots of a set of sequences |
| - |
printsextract |
Extract
data from PRINTS |
| - |
pscan |
Scans
proteins using PRINTS |
| - |
rebaseextract
redata |
Search
and extract from REBASE. |
| - |
recoder |
Remove
restriction sites but maintain the same translation |
| - |
seqmatchall |
all-against-all
comparison of a set of sequences. |
| - |
showdb |
Shows
info about currently available databases. |
| - |
showfeat |
Shows
features of a sequence |
| - |
silent |
Silent
mutation restriction enzyme scan |
| - |
sirna |
Finds
siRNA duplexes in mRNA |
| - |
stssearch |
Searches
a DNA database for matches with a set of STS primers |
| - |
supermatcher |
Finds
a match of a large sequence against one or more sequences |
| - |
tfextract |
Extract
data from TRANSFAC database. |
| gcghelp |
tfm |
Shows
documentation for a program. |
| - |
tfscan |
Scans
DNA sequences for transcription factors |
| - |
tmap |
Displays
membrane spanning regions |
| - |
tranalign |
Align
nucleic coding regions given the aligned proteins |
| - |
trimest
trimseq |
Trim
bits off ends of sequences. Can be done interactively with GCG's
seqed. |
| - |
twofeat |
Finds
neighbouring pairs of features in sequences |
| - |
vectorstrip |
Strips
out DNA between a pair of vector sequences |
| - |
wordcount |
Counts
words of a specified size in a DNA sequence |
| -
|
wordmatch
|
Finds
all exact matches of a given size between 2 sequences |
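
The correspondences in the table can also be looked up from a script. The following is a minimal sketch in Python, assuming a small hand-copied subset of the rows above. The names `GCG_TO_EMBOSS` and `emboss_equivalents` are hypothetical helpers introduced only for illustration and are not part of GCG or EMBOSS.

```python
from typing import Dict, List

# Hypothetical, hand-copied subset of the table above.
# Keys are lower-cased GCG program names; values are the EMBOSS programs
# listed for them. An empty list mirrors a "-" entry in the EMBOSS column.
GCG_TO_EMBOSS: Dict[str, List[str]] = {
    "bestfit": ["water", "matcher"],
    "gap": ["needle", "stretcher"],
    "translate": ["transeq"],
    "reverse": ["revseq"],
    "pileup": ["emma"],
    "reformat": ["seqret"],
    "chopup": [],
}


def emboss_equivalents(gcg_program: str) -> List[str]:
    """Return the EMBOSS programs listed for a GCG program.

    The lookup ignores case and surrounding whitespace. A KeyError is
    raised for programs that are not in this illustrative subset.
    """
    return list(GCG_TO_EMBOSS[gcg_program.strip().lower()])


if __name__ == "__main__":
    for name in ("BestFit", "Gap", "Chopup"):
        programs = emboss_equivalents(name)
        listed = ", ".join(programs) if programs else "no EMBOSS equivalent listed"
        print(name, "->", listed)
```

Extending the dictionary with further rows from the table is a matter of copying the first two columns; the Description/Comments column is deliberately left out of this sketch.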