Abstract
MOTIVATION: In molecular biology, sequence alignment is a crucial tool in studying the structure and function of molecules, as well as the evolution of species. In the segment-to-segment variation of the multiple alignment problem, the input can be seen as a set of non-gapped segment pairs (diagonals). Given a weight function that assigns a weight score to every possible diagonal, the goal is to choose a consistent set of diagonals of maximum weight. We show that the segment-to-segment multiple alignment problem is equivalent to a novel formulation of the Maximum Trace problem: the Generalized Maximum Trace (GMT) problem. Solving this problem to optimality, therefore, may improve upon the previous greedy strategies that are used for solving the segment-to-segment multiple sequence alignment problem. We show that the GMT can be stated in terms of an integer linear program and then solve the integer linear program using methods from polyhedral combinatorics. This leads to a branch-and-cut algorithm for segment-to-segment multiple sequence alignment. RESULTS: We report on our first computational experiences with this novel method and show that the program is able to find optimal solutions for real-world test examples.