Matras1.1 : Help Page for Multiple 3D Alignment

Help Page
"Multiple 3D Alignment"
A multiple 3D alignment compares several structures belonging to the same
superfamily and can provide important biological insight, such as conserved sites or
conserved structural features. However, multiple sequence alignment is well known to be
difficult to solve exactly, and the corresponding problem for 3D structures
is expected to be even more difficult because of the multi-body nature of 3D structures.
To solve the problem within a reasonable computational time, we use the
progressive alignment algorithm, which is the most popular heuristic for multiple
sequence alignment. The progressive alignment consists of the following three
steps. (i) Calculate pairwise 3D alignments and similarities for all of the protein
pairs. (ii) Construct a guide tree from the R-score using the UPGMA
method. (iii) Starting from the leaf nodes of the guide tree, progressively align all of
the nodes, in order of decreasing similarity. To align one group to another,
all of the protein pairs between the two groups are tried, and the best pairwise
alignment determines the alignment of the two groups. In other words, our
multiple 3D alignment is performed by assembling the results of the pairwise 3D
alignments in the proper order. Using our web server, a user can compare up to
five structures. The server also shows the superimposed structures and a dendrogram of
structural similarities.
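Step (ii) can be illustrated with a minimal Python sketch of UPGMA clustering on a similarity matrix. This is not the Matras implementation: the function name upgma is hypothetical, and the [RDIS(%)] values from the example output on this page are used as a stand-in for the R-score, which is an assumption.

```python
from itertools import combinations

def upgma(names, sim):
    """Build a guide tree (nested tuples) by UPGMA from a symmetric
    similarity matrix. Only off-diagonal values are used."""
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    s = {frozenset((i, j)): sim[i][j]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the pair of clusters with the highest average similarity.
        a, b = max(combinations(sorted(clusters), 2),
                   key=lambda pair: s[frozenset(pair)])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        for c in clusters:
            # Size-weighted mean keeps the value equal to the average
            # over all member pairs (the UPGMA update rule).
            s[frozenset((next_id, c))] = (
                n_a * s[frozenset((a, c))] + n_b * s[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree

# Assumption: [RDIS(%)] values copied from the example on this page.
names = ["1ecd-", "1mbd-", "4hhbA", "4hhbB"]
rdis = [
    [0.0, 65.1, 57.0, 59.1],
    [65.1, 0.0, 70.7, 72.0],
    [57.0, 70.7, 0.0, 76.9],
    [59.1, 72.0, 76.9, 0.0],
]
guide_tree = upgma(names, rdis)
print(guide_tree)
```

For these values the most similar pair (4hhbA, 4hhbB) is merged first, then 1mbd- joins that group, and 1ecd- joins last. In step (iii) the groups would be aligned in this same order.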
The following is an example of Multiple 3D Alignment.
Multiple 3D Alignment using Matras 1.1: 1mbd- 1ecd- 4hhbA 4hhbB
[pid] 11213
[Npro] 4
[pro0] 1mbd- ["" - ""]
[pro1] 1ecd- ["" - ""]
[pro2] 4hhbA ["" - ""]
[pro3] 4hhbB ["" - ""]
Now calculating. Wait a while.
[MULTIPLE ALIGNMENT]
[AVE_SECSTR] ccCCHHHHHHHHHHHhhhggttghhhhHHHHHHHHHHHcggggggCtttctcsshhhhhh
1ecd- --LSADQISTVQASFDKVKG------DPVGILYAVFKADPSIMAKFTQFAGK-DLESIKG:51
1mbd- -VLSEGEWQLVLHVWAKVEA--DVAGHGQDILIRLFKSHPETLEKFDRFKHLKTEAEMKA:57
4hhbA -VLSPADKTNVKAAWGKV--GAHAGEYGAEALERMFLSFPTTKTYFPHFD-L----S-HG:51
4hhbB VHLTPEEKSAVTALWGKV----NVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMG:56
.*:. ... * : :.** . . : ..* :.:.. * : .* * : . . .:
[AVE_SECSTR] cHHHHHHHHHHHHHHHHHHhTggghHHHhHHHHHHHhhttcCChHHHHHHHHHHHHHHHH
1ecd- TAPFETHANRIVGFFSKIIGELPNIEADVNTFVASHKP-RGVTHDQLNNFRAGFVSYMKA:110
1mbd- SEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHS:117
4hhbA SAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAA:111
4hhbB NPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAH:116
.. .: *:..:..:.. ....... .. . .:. .*. :. :.. .. ... .. .:..
[AVE_SECSTR] HcgggcchhhHHHHHHHHHHHHHHHhhhcchhtccc
1ecd- HTD--FA-GAEAAWGATLDTFFGMIFSKM-------:136
1mbd- RHPGDFGADAQGAMNKALELFRKDIAAKYKELGYQG:153
4hhbA HLPAEFTPAVHASLDKFLASVSTVLTSKYR------:141
4hhbB HFGKEFTPPVQAAYQKVVAGVANALAHKYH------:146
: . .*.. ..:: : :. . ...*:
3D VIEW:
[3D(image)] [3D(Chime-plugin)] [3D(rasmol)] [superimposed_pdb_file]
ALIGNMENT:
[ClustalW] [Sequence] [SecStr]
SIMILARITY:
[SimMatrix(pairwise)] [DRMS(all-align)] [SeqID(all-align)]
TREE:
[DRMS(all-aligned)] [SEQ(all-aligned)]
The row [AVE_SECSTR] is the average secondary structure of each site.
Secondary structures were assigned by the DSSP program (Kabsch & Sander),
and completely conserved sites are shown in uppercase letters.
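For further processing, the alignment blocks in the example can be joined into one string per protein. The following is a minimal sketch that assumes each sequence row has the form "name residues:last_residue_number", as in the example output; the function name parse_alignment_blocks is hypothetical, and the real output may differ in details.

```python
import re

# Assumed row layout, taken from the example output: "1ecd- --LSADQ...:51"
SEQ_ROW = re.compile(r"^(\S+)\s+([A-Za-z-]+):(\d+)$")

def parse_alignment_blocks(text):
    """Join the per-block rows of each protein into one aligned string."""
    aligned = {}
    for line in text.splitlines():
        match = SEQ_ROW.match(line.strip())
        if match:
            # [AVE_SECSTR] rows and conservation rows do not match the
            # pattern, so they are skipped.
            name, chunk, _last_residue = match.groups()
            aligned[name] = aligned.get(name, "") + chunk
    return aligned

example = """\
[AVE_SECSTR] ccCC
1ecd- --LS:2
1mbd- -VLS:3
.*:.
[AVE_SECSTR] cHHH
1ecd- TAPF:6
1mbd- SEDL:7
"""
print(parse_alignment_blocks(example))
```

The shortened input above is invented for illustration; with the real output, each protein's string would span all blocks of the alignment.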
- 3D VIEW
There are three ways to view the superimposed structures:
image, Chime plug-in, and the RasMol external application. Details
are described in "How to see 3D structure?".
The raw PDB file of the superimposed structures can be downloaded from
[superimposed_pdb_file].
The following picture is an example of [3D(image)],
which does not require any plug-ins.
View of Multiple 3D Alignment using Matras 1.1: 1mbd- 1ecd- 4hhbA 4hhbB
[IMAGE OF SUPERIMPOSED STRUCTURES]
1mbd- : "blue"
1ecd- : "red"
4hhbA : "green"
4hhbB : "cyan"
- ALIGNMENT
Users can download the multiple alignment in three different formats.
The first format, [ClustalW], is the format of Clustal W,
the most popular program for multiple sequence alignment.
The second format, [Sequence], is
the multiple alignment of amino acids with residue numbers and the average secondary structure.
The third format, [SecStr], is
the multiple alignment of secondary structures with residue numbers and the average amino acids.
The following alignment is an example of the [SecStr] format.
In the [AVE_SEQUENCE] row, completely conserved and half-conserved sites are shown in uppercase and lowercase letters,
respectively.
[AVE_SEQUENCE] .vLsp.eks.V.a.wgKV.....v.e.g.eaL.rlfks.P.t..kF..F..l.t..s.kg
1ecd- 1:--ccHHHHHHHHHHHHTTTT------cHHHHHHHHHHHcHHHHTTcTTTTTS-cHHHHTT:51
1mbd- 1:-cccHHHHHHHHHHHHHHGG--GHHHHHHHHHHHHHHHcHHHHTTcTTTTTccSHHHHHH:57
4hhbA 1:-cccHHHHHHHHHHHHHH--TTTHHHHHHHHHHHHHHHcGGGGGGcTTSc-c----S-TT:51
4hhbB 1:ccccHHHHHHHHHHHTTc----cHHHHHHHHHHHHHHHSGGGGGGcGGGcccSSHHHHHH:56
[AVE_SEQUENCE] sa.vk.Hgkkvlgals.ilahldn.ea.l.tls.lHa.kl.vdp.nl.lls..lv.vlaa
1ecd- 52:SHHHHHHHHHHHHHHHHHHHTTTccHHHHHHHHHHHGG-GTccHHHHHHHHHHHHHHHHH:110
1mbd- 58:cHHHHHHHHHHHHHHHHHHTTTTccHHHHHHHHHHHHHTScccHHHHHHHHHHHHHHHHH:117
4hhbA 52:cHHHHHHHHHHHHHHHHHHHTGGGHHHHTHHHHHHHHHTTcccTHHHHHHHHHHHHHHHH:111
4hhbB 57:cHHHHHHHHHHHHHHHHHHTTGGGHHHHHHHHHHHHHHTTcccTHHHHHHHHHHHHHHHH:116
[AVE_SEQUENCE] h.p.eFtp.aqaa..k.la.f...iasKy.......
1ecd- 111:HSc--GG-GGHHHHHHHHHHHHHHHHHHc-------:136
1mbd- 118:HcGGGccHHHHHHHHHHHHHHHHHHHHHHHHHTccc:153
4hhbA 112:HcTTTccHHHHHHHHHHHHHHHHHHTTTcc------:141
4hhbB 117:HHGGGScHHHHHHHHHHHHHHHHHHHTTcc------:146
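The [AVE_SEQUENCE] row can be approximated with a short sketch. The rule below is inferred from the example and is not taken from the Matras source: a column gets an uppercase letter when all proteins share the same residue, a lowercase letter when the most common residue occurs in at least half of the proteins, and "." otherwise. The function name average_sequence and the tie-breaking between equally common residues are assumptions.

```python
from collections import Counter

def average_sequence(seqs):
    """Consensus row for equal-length aligned sequences ('-' marks a gap)."""
    row = []
    for column in zip(*seqs):
        counts = Counter(ch for ch in column if ch != "-")
        if not counts:
            row.append(".")
            continue
        residue, n = counts.most_common(1)[0]
        if n == len(column):
            row.append(residue.upper())      # completely conserved
        elif 2 * n >= len(column):
            row.append(residue.lower())      # present in at least half
        else:
            row.append(".")
    return "".join(row)

# First 18 columns of the example alignment on this page.
seqs = [
    "--LSADQISTVQASFDKV",  # 1ecd-
    "-VLSEGEWQLVLHVWAKV",  # 1mbd-
    "-VLSPADKTNVKAAWGKV",  # 4hhbA
    "VHLTPEEKSAVTALWGKV",  # 4hhbB
]
print(average_sequence(seqs))  # .vLsp.eks.V.a.wgKV
```

For these columns the sketch reproduces the start of the [AVE_SEQUENCE] row shown in the example, which supports the inferred rule but does not prove that the server uses it.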
- SIMILARITY
All the similarities between proteins are shown in matrix form.
[SimMatrix(pairwise)] shows the various similarity measures
obtained from the pairwise 3D alignments, as follows:
#These similarities are obtained by the pairwise 3D alignments.
[RDIS(%)]
1ecd- 0.0 65.1 57.0 59.1
1mbd- 65.1 0.0 70.7 72.0
4hhbA 57.0 70.7 0.0 76.9
4hhbB 59.1 72.0 76.9 0.0
[RMS(A)] #for aligned Calpha atoms
1ecd- 0.000 1.652 2.397 2.252
1mbd- 1.652 0.000 1.560 1.617
4hhbA 2.397 1.560 0.000 1.451
4hhbB 2.252 1.617 1.451 0.000
[DRMS(A)] #for aligned Cbeta atoms
1ecd- 0.000 1.462 1.916 1.875
1mbd- 1.462 0.000 1.398 1.417
4hhbA 1.916 1.398 0.000 1.125
4hhbB 1.875 1.417 1.125 0.000
[SqID(%)]
1ecd- 100.0 20.6 18.3 19.1
1mbd- 20.6 100.0 27.0 24.8
4hhbA 18.3 27.0 100.0 43.9
4hhbB 19.1 24.8 43.9 100.0
Note that the pairwise alignments are not guaranteed to cover
the same set of sites. Therefore, these values may not satisfy
"metric" properties, such as the triangle inequality. To draw a dendrogram,
we recommend using [DRMS(all-align)] or
[SeqID(all-align)], which are calculated
from the sites at which all the proteins are aligned.
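The idea of restricting the calculation to sites at which all proteins are aligned can be sketched as follows. This is an illustration, not the Matras code: the function names are hypothetical, sequence identity is assumed to be the percentage of identical residues over the gap-free columns, and the small alignment is invented. The last function checks a distance matrix for violations of the triangle inequality.

```python
def all_aligned_columns(seqs):
    """Indices of alignment columns in which no sequence has a gap."""
    return [c for c in range(len(seqs[0])) if all(s[c] != "-" for s in seqs)]

def seqid_all_aligned(seqs):
    """Pairwise sequence identity (%) over the all-aligned columns only."""
    cols = all_aligned_columns(seqs)
    if not cols:
        raise ValueError("no column is aligned in all sequences")
    n = len(seqs)
    matrix = [[100.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same = sum(seqs[i][c] == seqs[j][c] for c in cols)
            matrix[i][j] = matrix[j][i] = 100.0 * same / len(cols)
    return matrix

def triangle_violations(dist, tol=1e-9):
    """Triples (i, j, k) for which dist[i][k] > dist[i][j] + dist[j][k]."""
    n = len(dist)
    return [(i, j, k)
            for i in range(n) for j in range(n) for k in range(n)
            if len({i, j, k}) == 3
            and dist[i][k] > dist[i][j] + dist[j][k] + tol]

toy = ["AC-DE", "ACGDF", "TCGDE"]  # invented alignment; column 2 has a gap
print(all_aligned_columns(toy))
print(seqid_all_aligned(toy))
```

Because every pair is scored on the same columns, the resulting values are directly comparable across pairs, which is the reason given above for preferring the all-aligned matrices when drawing a dendrogram.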
- TREE
A dendrogram is an effective way to show similarities graphically.
We provide two kinds of dendrograms, based on [DRMS(all-aligned)]
and [SEQ(all-aligned)]. Both are calculated from the
all-aligned sites using the neighbor-joining method. The following picture is an example of
[SEQ(all-aligned)]. A raw tree file (in PHYLIP format) can be downloaded.
PROCESS_ID 11213
TREE_ID D
ConservedSites T
type DRMS
Algorithm NJ

[DistanceMatrix] [TreeFile(phylip format)]
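As an illustration of how a tree file could be produced from a distance matrix, here is a minimal sketch of the neighbor-joining method that writes an unrooted tree in Newick notation, the tree notation used by PHYLIP. It is not the Matras code: the function name neighbor_joining, the three-decimal branch lengths, and the small test matrix are assumptions for illustration.

```python
def neighbor_joining(names, dist):
    """Return an unrooted Newick tree built by neighbor joining.
    Branch lengths can come out negative for non-additive input."""
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    nodes = list(names)                  # Newick text of each active node
    d = [list(row) for row in dist]      # working copy of the matrix
    while len(nodes) > 3:
        n = len(nodes)
        r = [sum(row) for row in d]
        # Pick the pair (i, j) that minimises the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                q = (n - 2) * d[i][j] - r[i] - r[j]
                if best is None or q < best[0]:
                    best = (q, i, j)
        _, i, j = best
        # Branch lengths from i and j to the new internal node.
        len_i = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        len_j = d[i][j] - len_i
        keep = [k for k in range(n) if k != i and k != j]
        to_new = [0.5 * (d[i][k] + d[j][k] - d[i][j]) for k in keep]
        d = [[d[a][b] for b in keep] + [to_new[x]] for x, a in enumerate(keep)]
        d.append(to_new + [0.0])
        joined = f"({nodes[i]}:{len_i:.3f},{nodes[j]}:{len_j:.3f})"
        nodes = [nodes[k] for k in keep] + [joined]
    # Three nodes remain: connect them at a single central node.
    a, b, c = nodes
    len_a = 0.5 * (d[0][1] + d[0][2] - d[1][2])
    len_b = 0.5 * (d[0][1] + d[1][2] - d[0][2])
    len_c = 0.5 * (d[0][2] + d[1][2] - d[0][1])
    return f"({a}:{len_a:.3f},{b}:{len_b:.3f},{c}:{len_c:.3f});"

# Invented additive test matrix (not Matras output).
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(neighbor_joining(taxa, matrix))
```

With exactly four proteins, complementary pairs always have equal Q values, so the pair joined first (and therefore the textual form of the Newick string) depends on tie-breaking, while the unrooted topology is the same.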