Staden, developed by James Bonfield and colleagues at the MRC, UK, is a fully developed set of tools for DNA sequence assembly (Gap4), editing, and analysis (Spin).
A list of Staden programs and their descriptions is in the Staden package program summary.
To set up your environment for the Staden Package, run the command that matches your shell:
Bash users: . /usr/local/staden/share/staden/staden.profile
Csh/tcsh users: source /usr/local/staden/share/staden/staden.csh
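The choice between the two setup scripts can also be made by a small helper. The following is a minimal sketch for Bourne-style shells; the function name staden_setup_script is hypothetical and not part of the Staden Package, and the paths are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Staden Package): print the setup
# script that matches a given shell name, using the paths listed above.
staden_setup_script() {
    case "$1" in
        csh|tcsh) echo "/usr/local/staden/share/staden/staden.csh" ;;
        *)        echo "/usr/local/staden/share/staden/staden.profile" ;;
    esac
}

# Bash/sh usage: source the profile only if it is actually installed.
profile=$(staden_setup_script bash)
if [ -r "$profile" ]; then
    . "$profile"
else
    echo "Staden profile not found at $profile" >&2
fi
```

Adding the matching line to your shell startup file would avoid retyping it in every session, assuming the installation path does not change.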
First log in to helix. Replace 'user' in the following examples with your helix ID. Create a directory called, for example, /home/user/pregap_intro:
% mkdir /home/user/pregap_intro
% cd /home/user/pregap_intro
Obtaining an initial set of ABI sequencer data:
Obtaining copies of the vector sequences for screening the readings:
For this exercise, the required cloning vector (lorist2) sequence file lorist2.vector is in the same directory as your ABI data. The sequencing vector is m13mp18 and is already held in the Staden Package installation directory. The cloning site used is SmaI.
Using pregap4 to prepare a set of ABI sequencer files for entry into a sequencing project database:
Make sure an X Window server is running and that your current directory is /home/user/pregap_intro.
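Both preconditions can be checked from the shell before starting pregap4. This is only a sketch: the function name ready_for_pregap4 is hypothetical, and it assumes that a non-empty DISPLAY variable is a good enough sign that an X display is available.

```shell
# Hypothetical pre-flight check: succeed only if an X display is configured
# and the current directory is the expected working directory.
ready_for_pregap4() {
    expected_dir=$1
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; start an X server or enable X forwarding" >&2
        return 1
    fi
    if [ "$(pwd)" != "$expected_dir" ]; then
        echo "Current directory is $(pwd), expected $expected_dir" >&2
        return 1
    fi
    return 0
}
```

For example, typing ready_for_pregap4 /home/user/pregap_intro && pregap4 would start pregap4 only when both checks pass.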
A Pregap4 window pops up:
Click the Files to Process tab, click Add files, change to the pregap_intro directory, and select Any as the file type. The browser should show all the files in this directory. We will process all of the binary sample files, which contain the raw trace data along with sequence and other information.
Select all five Sample XXX files (where XXX is a three-digit number) by holding down the Control key and clicking each one, then press OK.
- Click Configure Modules tab
- Select General Configuration on the left; on the right, set Get entry names from trace files to Yes
- Estimate Base Accuracies [x]
- Trace Format Conversion [x]
- Initialise Experiment Files [x]
- Quality Clip [x]
- Sequencing Vector Clip [x]; Use vector-primer file Yes; Click on the Select vector-primer file subset; select m13mp18/SmaI; click OK
- Screen For Unclipped Vector [x]
- Cloning Vector Clip [x]; enter lorist2.vector in the Vector file name box
- Interactive Clipping [x]
Trev is a graphical tool that allows you to:
- Edit the sequence of your reads
- Adjust the left and right quality clip points, determined by qclip
- Adjust the left and right vector clip points, determined by vector_clip
When trev is run as a pregap4 module, only the last two functions are relevant. By default, sequence editing is disabled in this context. For more information, see section 1.7 of the documentation.
Click File, Save. Pregap4 then creates a number of new files.
Customizing the modules of Pregap4
Select Modules, Add/Remove Modules to change the set of modules freely. When you have finished, click File, Save module list. A new file called pregap4.config will be created.
Select File, Exit to quit pregap4.
Copy sample files from /usr/local/staden/course/data/phred_data/ into a new directory. In the following example, a couple of hundred ZTR files and a lorist6.vector file were copied into /home/user/exercise.
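The copy step could be scripted along the following lines. This is a sketch under assumptions: the function name copy_sample_data is hypothetical, and it assumes the source directory contains only plain files (cp without -r does not copy subdirectories).

```shell
# Hypothetical helper: copy every file from a source directory into a
# destination directory, creating the destination if needed.
copy_sample_data() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" && cp "$src"/* "$dest"/
}
```

With the paths from this example, the call would be copy_sample_data /usr/local/staden/course/data/phred_data /home/user/exercise.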
In the pregap4 window, click Add files, choose ZTR as the file type, select all files by clicking any file and then pressing Ctrl-A, then press OK
Click Configure Modules tab
Disable Estimate Base Accuracies
Disable Trace Format Conversion
Enable Initialise Experiment Files
Enable Augment Experiment Files, click Experiment File Line Types, do this:
Click OK and save
Enable Quality Clip
Enable Sequencing Vector Clip
Enable Screen for Unclipped Vector
Enable Cloning Vector Clip, specify lorist6.vector as the vector file name
Enable Gap4 shotgun assembly, type testdb as the Gap4 database name, and tick the Create new database option. Then click on any other module name; this changes the status shown next to the Gap4 shotgun assembly module from edit to ok.
Disable everything else.
Under File, Load Naming Scheme, click Browse, select sanger_names_old.p4t, then click OK and OK again.
Click File, Save All Parameters (in all modules)
In a Unix terminal window, type pregap4 -nowin *.ztr
When it finishes, a gap4 database has been created. It consists of files named testdb.0*
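The batch step and the check for the resulting database files can be wrapped in two small functions. This is a sketch only: the function names run_pregap4_batch and count_db_files are hypothetical, while the pregap4 -nowin command line and the testdb.0* file pattern are the ones given above.

```shell
# Hypothetical wrapper around the batch step described above.
run_pregap4_batch() {
    # If the glob matches nothing it stays literal (or expands to nothing),
    # so check that the first match really exists before running pregap4.
    set -- *.ztr
    if [ ! -e "$1" ]; then
        echo "No .ztr files in $(pwd)" >&2
        return 1
    fi
    pregap4 -nowin "$@"
}

# Count the files belonging to version 0 of a gap4 database name.
count_db_files() {
    n=0
    for f in "$1".0*; do
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

After a successful run in /home/user/exercise, count_db_files testdb should print a number greater than zero.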
Continuing the example above, in /home/user/exercise, type /usr/local/staden/bin/gap4
Open the database with File, Open, select testdb.0.aux, then click OK.
The Contig Selector window will open.
Select Edit, Contig Editor, OK. The following window is shown:
Editing the consensus sequence
There are two types of editing action available, replace and insert. When you start the contig editor it will be in replace mode. You can toggle between modes by clicking (left mouse button) on the box that is labelled Insert. The contig editor will allow you to edit anything in any way. See section 2.6.4 in the documentation.
Finding problems and editing them
The places in your contig that will most probably require editing are where the consensus sequence is undetermined. Click Next Search, select problem and forward, then click the Search button. You will mostly find * characters, with a very occasional -.
By altering the value in the box labelled Qual in the contig editor, you control a simple display showing the quality of bases, although there are better ways of showing this, as you will see later. Increasing the Qual value gradually turns bases red (those with a confidence value lower than the Qual value).
Checking the trace data
You can get gap4 to automatically display the traces that are best suited to verifying and solving problems. Click Settings, Trace Display, Auto-display Traces. From now on the Search button will display up to three traces when searching for problems:
For more info, see documentation
Editing with confidence
Set up gap4 for use with confidence values. In the gap4 main window, select Options, Consensus algorithm, then OK:
Now, to see what the phred base-calls look like, select Edit, Edit Contig, OK, then under Settings enable Show reading quality, Show consensus quality, Highlight disagreements, and By background color:
To list the error rates:
In the contig editor, select Commands, List Confidence, accept the defaults, and click Apply. The editor information line (at the bottom of the window) now contains something like the following:
Expected no. of errors between 1 and 9569 is 3.87. Error rate = 1/2471
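The error rate on that line is the span length divided by the expected number of errors. The arithmetic can be repeated as in the sketch below; the function name bases_per_error is hypothetical, and awk is used only as a calculator.

```shell
# Recompute "1 error per N bases" from a span length and an expected
# number of errors; prints N rounded to the nearest integer.
bases_per_error() {
    awk -v len="$1" -v errs="$2" 'BEGIN { printf "%.0f\n", len / errs }'
}

bases_per_error 9569 3.87
```

With the rounded figure 3.87 this gives about 2473, close to the 1/2471 reported above. The small difference is consistent with gap4 computing the rate from the unrounded error count.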
The main gap4 output window should show: