A Fine-Grained Structure Clustering Method


What is T-RMSD?

T-RMSD (Tree based on Root Mean Square Deviation) is a distance RMSD (dRMSD) based method which generate Structural Trees (analogue to phylogenetic tree) to determine fine-grained structural variations associated with a given Multiple Sequence Alignment (generated by the user using the method of is choice). It is part of the T-Coffee distribution. The specificity of T-RMSD compared to other structural comparison methods stems from its capacity to generate a structural tree with values equivalent bootstrap values supporting the structural clustering. Such clustering is achieved by the construction of matrixes of distances, calculated between equivalent residues as defined by the ungapped column of the given MSA. The resulting matrixes are then combined using the CONSENSE program from the PHYLIP package to generate a consensus structural tree with equivalent bootstrap values supporting each node(from 0 to 100; 100 indicating that all positions support the clustering).

A list of related publications is available here and there. The following presentation provides an overview of the algorithm and its validation.



  • T-RMSD is a special option allowing a fine-grained structural clustering of a given group of related proteins (i.e. protein domains). Download the latest T-Coffee version here.


1-Follow the standard T-Coffee installation procedure.

2-Install all the multiple sequence alignment packages you want to construct your Multiple Sequence Alignment. You should also take a look at the T-Coffee Technical Documentation. The latest list of packages interfaced with T-Coffee can also be obtained with the simple intruction t_coffee > package_list


Given a Multiple Sequence Alignment foo.aln and a template file foo.template_list (similar to Expresso procedure; it corresponds to the association between sequences contained in the MSA and the PDB files used for the distances calculation) T-RMSD outputs a file named foo.html that contains a color coded indication of the ungapped column supporting the structural clustering. Gapped columns are not colored as they are not used to establish the clustering. Red brick columns indicate positions with a strong support to the clustering, while blue regions exhibit a low support. Red brick columns represent the most structurally discriminating positions. Unlike alginments generated using T-Coffee, there is no consistency score (0-100) for each sequence of the MSA. T-RMSD also outputs several files named foo.struc_tree10, foo.struc_tree50, foo.struc_tree100 which corresponding to Neighbour Joining files, that can be used to build the structural tree using the software of your choice. The indication 10, 50 or 100 represent the pourcentage of ungapped columns used to generate the clustering.


The full documentation is on the T-Coffee Homepage. But the following shortcuts may be useful.

To run a T-RMSD from a given MSA, use the following
t_coffee -other_pg trmsd foo.aln -template_file foo.template_list -mode msa2tree
Sample Datasets are available from: here


  • Dali server Dali server v3.0
  • CATH database CATH homepage v3.3
  • SCOP server SCOP homepage 1.75 release


    Our projects relie on your feeback. Please send me an E-mail if you wish to make a request, a comment, or report a bug!

    Dr. Cedric Notredame, PhD.
    Group Leader
    Comparative Bioinformatics Group
    Bioinformatics and Genomics Programme
    Center for Genomic Regulation (CRG)
    Dr Aiguader, 88
    08003 Barcelona
    Email: cedric.notredame@europe.com
    HOME : http://www.tcoffee.org/homepage.html
    Phone: +34 933 160 271