Changes with 2.0u4 (February, 1996)

Added the '-L' option, which provides a longer description of the library
sequence.

Fixed a bug in the -m 10 parseable output.

Support is now provided for version 8.0 GCG libraries, both protein
and DNA. Use library type 6.

Changes with 2.0x4 (January, 1996)

The major change in 2.0x4 is the ability to get parseable
output from FASTA/TFASTA/SSEARCH.  This can be done using output
option -m 10.  With -m 10, the initial histogram and list of best
scores are unchanged, but the alignments are now in a parseable form:

>>>mgstm1.aa, 217 aa vs s library
; pg_name: FASTA
; pg_ver: version 2.0x4 Jan., 1996
; pg_matrix: BLOSUM50
; pg_gap-pen: -12 -2
; pg_ktup: 1
; pg_optcut: 30
; pg_cgap: 42
>>GTB1_MOUSE GLUTATHIONE S-TRANSFERASE GT8.7 (EC 2.5.1.18
; fa_initn: 1490
; fa_init1: 1490
; fa_opt: 1490
; fa_z-score: 1916.0
; fa_expect:      0
; sw_score: 1490
; sw_ident: 1.000
; sw_overlap: 217
>GT8.7  ..
; sq_len: 217
; sq_type: p
; al_start: 1
; al_stop: 217
; al_display_start: 1
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
KLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVE
NQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAGD
KVTYVDFLAYDILDQYRMFEPKCLDAFPNLRDFLARFEGLKKISAYMKSS
RYIATPIFSKMAHWSNK
>GTB1_MOUSE ..
; sq_len: 217
; sq_type: p
; al_start: 1
; al_stop: 217
; al_display_start: 1
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
KLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIVE
NQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAGD
KVTYVDFLAYDILDQYRMFEPKCLDAFPNLRDFLARFEGLKKISAYMKSS
RYIATPIFSKMAHWSNK
>>GT28_SCHJA GLUTATHIONE S-TRANSFERASE 28 KD (EC 2.5.1.18
; fa_initn: 190
; fa_init1: 97
; fa_opt: 169
; fa_z-score: 217.9
; fa_expect: 1.1e-05
; sw_score: 169
; sw_ident: 0.277
; sw_overlap: 228
>GT8.7  ..
; sq_len: 217
; sq_type: p
; al_start: 4
; al_stop: 180
; al_display_start: 1
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
KLGLDFPNLPY--LID--GSHK-ITQSNAILRYLARKHHLDGETEEERIR
ADIVENQVMDTRMQLIMLCYNPDFEKQK--PEFLK-TIPEKMKLYSEFLG
KRP--WFAGDKVTYVDFLAYDILDQYRMFEPKCLDA-FPNLRDFLARFEG
LKKISAYMKSSRYIATPIFSKMAHWSNK
>GT28_SCHJA ..
; sq_len: 206
; sq_type: p
; al_start: 3
; al_stop: 180
; al_display_start: 1
-VKLIYFNGRGRAEPIRMILVAAGVEFEDERIEFQDWP----------KI
KPTIPGGRLPIVKITDKRGDVKTMSESLAIARFIARKHNMMGDTDDEYYI
IEKMIGQVEDVESEYHKTLIKPPEEKEKISKEILNGKVPILLQAICETLK
ESTGNLTVGDKVTLADVVLIASIDHITDLDKEFLTGKYPEIHKHRKHLLA
TSPKLAKYLSERHATAF
>>GT2_DROME GLUTATHIONE S-TRANSFERASE 2 (EC 2.5.1.18).
; fa_initn: 124
; fa_init1: 124
; fa_opt: 164
; fa_z-score: 210.1
; fa_expect: 2.9e-05
; sw_score: 164
; sw_ident: 0.248
; sw_overlap: 251
>GT8.7  ..
; sq_len: 217
; sq_type: p
; al_start: 4
; al_stop: 198
; al_display_start: 1
---------------------------PMILGYWNVRGLTHPIRMLLEYT
DSSYDEKRYTMGDAPDFDRSQWLNEKFKLGLDFPNLPYL-IDGSHKITQS
NAILRYLARKHHLDGETEEERIRADIVENQVMDTRMQLIMLCYNPDFEKQ
KPEFLKTIPEKMKLYSEFLGKR-----PWFAGDKVTYVDFLAYDILDQYR
-MFEPKCLDAFPNLRDFLARFEGLKKISAYMKSSRYIATPIFSKMAHWSN
K
>GT2_DROME ..
; sq_len: 247
; sq_type: p
; al_start: 52
; al_stop: 240
; al_display_start: 22
PPAEGAEGAVEGGEAAPPAEPAEPIKHSYTLFYFNVKALPSPC------A
TCSDGNQEYE--DVAHPRRVPALKPTMPMG----QMPVLEVDGK-RVHQS
ISMARFLAKTVGLCGATPWEDLQIDIVVDTINDFRLKIAVVSYEPEDEIK
EKKLVTLNAEVIPFYLEKLEQTVKDNDGHLALGKLTWADVYFAGITDYMN
YMVKRDLLEPYPAVRGVVDAVNALEPIKAWIEKRPVTEV


Note that the parseable output starts with ">>>" and that each
alignment record starts with ">>" while each aligned sequence record
starts with ">".
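
As an illustration of the record markers just described, the following
Python sketch groups -m 10 lines into one query record, its alignment
records, and their aligned sequence records.  It is not part of the
FASTA package: the function name split_records and the dictionary
layout are assumptions made for this example, and the sample input is
abbreviated from the output shown above.

```python
def split_records(lines):
    """Group -m 10 lines by their '>>>', '>>' and '>' markers.

    Returns (query, alignments).  Each record is a dict holding the text
    after the marker ("title") and the lines that follow it ("lines").
    This layout is an assumption of the sketch, not a FASTA convention.
    """
    query = None
    alignments = []
    current = None  # the record that receives the lines that follow
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">>>"):          # start of the parseable output
            query = current = {"title": line[3:], "lines": []}
        elif line.startswith(">>"):         # start of an alignment record
            current = {"title": line[2:], "lines": [], "sequences": []}
            alignments.append(current)
        elif line.startswith(">") and alignments:   # aligned sequence record
            current = {"title": line[1:], "lines": []}
            alignments[-1]["sequences"].append(current)
        elif current is not None and line:  # tag line or sequence text
            current["lines"].append(line)
    return query, alignments


# Abbreviated sample taken from the output above.
SAMPLE = """\
>>>mgstm1.aa, 217 aa vs s library
; pg_name: FASTA
>>GTB1_MOUSE GLUTATHIONE S-TRANSFERASE GT8.7 (EC 2.5.1.18
; fa_opt: 1490
>GT8.7  ..
; sq_len: 217
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
>GTB1_MOUSE ..
; sq_len: 217
PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEKF
""".splitlines()

query, alignments = split_records(SAMPLE)
for alignment in alignments:
    names = [seq["title"] for seq in alignment["sequences"]]
    print(alignment["title"], names)
```

The three-character marker is tested first because a ">>>" line also
begins with ">>" and ">".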

All parameters produced by the fasta package will be of the form:

	; xx_yyyyy

In this version, xx is one of:

	pg - program parameters (name, version, matrix)
	fa - fasta scores, expect values, etc.
	sw - Smith-Waterman scores, expect values.
	sq - sequence length, type
	al - alignment start, stop, display_start

Other FASTA distributors may choose to add additional fields.  If they
do, they should use a tag prefix with more than two characters, e.g.:

	ebi_access:
or
	gcg_?????

The FASTA tags will be limited to two characters followed by a "_".
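
A reader of this format might split each parameter line on the first
"_" to recover the prefix, as in the Python sketch below.  The helper
names parse_tag and is_core_fasta_tag are hypothetical, and the value
"X12345" in the usage lines is made up for illustration; only the
"; xx_yyyyy: value" layout and the two-character rule come from the
description above.

```python
def parse_tag(line):
    """Split a '; xx_yyyyy: value' line into (prefix, name, value).

    Returns None for lines that are not parameter lines, such as
    record headers or sequence text.
    """
    if not line.startswith(";"):
        return None
    key, colon, value = line[1:].partition(":")
    if not colon:
        return None
    # Split on the first "_" only, so "al_display_start" keeps its name whole.
    prefix, underscore, name = key.strip().partition("_")
    if not underscore:
        return None
    return prefix, name, value.strip()


def is_core_fasta_tag(prefix):
    """FASTA's own tags use a two-character prefix; longer ones are third-party."""
    return len(prefix) == 2


print(parse_tag("; fa_expect: 1.1e-05"))
print(parse_tag("; al_display_start: 1"))
# Hypothetical distributor tag; the value is invented for this example.
prefix, name, value = parse_tag("; ebi_access: X12345")
print(prefix, is_core_fasta_tag(prefix))
```

Values are returned as strings because some fields, such as
pg_gap-pen, hold more than one number.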

All of the output parameters correspond to values that are presented
in other FASTA output formats, with the exception of the "al_"
parameters.

al_start gives the location of the alignment start in the
	original sequence

al_stop gives the location of the end of the alignment in the
	original sequence

al_display_start
	gives the location of the first displayed amino acid residue
	in the original sequence.  The -m 10 alignments are the same
	as those produced in the other modes. In particular,
	FASTA/SSEARCH provide some context for the alignment; if the
	"-a" option is not used, FASTA/SSEARCH will try to provide
	about 30 residues on either side of the actual local
	alignment, if the alignment is in the middle of one or the
	other sequence.  If the beginning of the query sequence aligns
	with the 10th residue of the library sequence, then the query
	sequence will be padded with nine leading "-" to produce the
	alignment.  The leading "-" are a formatting convenience only;
	they are not considered in the numbering system for
	al_display_start, al_start, or al_stop.

	Thus:

	>GT8.7  ..
	; sq_len: 217
	; sq_type: p
	; al_start: 3
	; al_stop: 180
	; al_display_start: 1
	---PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLN
	EKFKLGLDFPNLPYLIDGSHKITQSNAILRYLARKHH---LDGETEEERI
	RADIVENQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKR
	PWFAGDKVTYVDFLAYDILDQYRMFEPKCLDA------FPNLRDFLARFE
	GLKKISAYMKSSRYIATPIFSKMAHWSNK
	>ARP2_TOBAC ..
	; sq_len: 223
	; sq_type: p
	; al_start: 6
	; al_stop: 181
	; al_display_start: 1
	MAEVKLLGFW-YSPFSHRVEWALKIKGVKYE---YIEEDRD--NKSSLLL
	QSNPV---YKKVPVLIHNGKPIVESMIILEYIDETFEGPSILPKDPYDRA
	LARFWAKFLDDKVAAVVNTFFRKGEEQEKGK--EEVYEMLKVLDNELKDK
	KFFAGDKFGFADIAANLVGFWLGVFEEGYGDVLVKSEKFPNFSKWRDEYI
	NCSQVNESLPPRDELLAFFRARFQAVVASRSAPK

	This says that, to align the two sequences, the first 'P' of
	GT8.7 must line up with the first 'V' (residue 4) in ARP2_TOBAC,
	but that the actual best local alignment starts with the first
	'I' in GT8.7 and the first 'L' in ARP2_TOBAC.
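
The numbering rule can be checked with a short Python sketch.  It
assumes that every "-" in a displayed sequence, leading or internal, is
skipped when counting residues, and that counting begins at
al_display_start; the function name column_of_residue is hypothetical.
Only the first displayed line of each sequence from the example above
is used; a full record would first join its sequence lines into one
string.

```python
def column_of_residue(display, display_start, residue):
    """Return the 0-based display column holding `residue`, or None.

    `display` is the displayed sequence text, `display_start` is the
    al_display_start value, and `residue` is a position in the original
    sequence (for example al_start).  "-" characters are never counted.
    """
    number = display_start - 1
    for column, symbol in enumerate(display):
        if symbol == "-":
            continue  # padding or gap: not a residue
        number += 1
        if number == residue:
            return column
    return None


# First displayed line of each sequence in the example above.
QUERY = "---PMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLN"
LIBRARY = "MAEVKLLGFW-YSPFSHRVEWALKIKGVKYE---YIEEDRD--NKSSLLL"

# al_start is 3 for GT8.7 and 6 for ARP2_TOBAC; al_display_start is 1 for both.
query_col = column_of_residue(QUERY, 1, 3)
library_col = column_of_residue(LIBRARY, 1, 6)
print(query_col, QUERY[query_col])        # 5 I
print(library_col, LIBRARY[library_col])  # 5 L
```

Both al_start values land in the same display column, which is what
the alignment requires: the first 'I' of GT8.7 sits directly above the
first 'L' of ARP2_TOBAC.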