Good profiles and what they are good for.

Philipp Bucher

Sequence-family-based similarity search methods such as profiles are claimed to be more effective at detecting distant homologies between proteins than single-query methods such as the Smith-Waterman algorithm. This claim is supported by benchmarks. I hypothesize that profiles may also outperform pairwise methods with regard to other criteria, such as (i) reliable subfamily classification, (ii) accurate domain delineation, and (iii) generation of better alignments for homology-based 3D structure modeling. This has not yet been demonstrated; the corresponding benchmarking protocols first need to be developed. I further argue that sensitivity and the other performance criteria mentioned above are conflicting objectives, and that different profiles for the same protein family should be used for gene discovery, automatic sequence annotation, and homology-based 3D structure modeling, respectively.