Free Trial

Safari Books Online is a digital library providing on-demand subscription access to thousands of learning resources.

Share this Page URL

Chapter 3. Sequence Alignment > Variations - Pg. 51

Algorithmic Complexity The complexity of algorithms is often described in big-O notation, a shorthand for the asymptotic behavior of the algorithm. For example, searching for a name in a phone book by starting at the beginning and going name-by-name takes on average, n/2 op- erations, where n is the number of names in the phone book. Such a search has O(n) time complexity (constants are dropped from the notation). It scales linearly in time; a phone book twice as long takes twice as long to search. An approach that scales more efficiently successively splits the phone book in half based on the alphabetical order. This is called a binary search and has O(log 2 n) complexity. For example, a phone book eight times longer takes only three times longer to search. The alignment algorithms as described have O(nm) complexity in both time and memory, where n and m are the lengths of the sequences. It's also common to label such algorithms as O(n2) and speak of them as scaling quadratically. Global Versus Local Although global and local alignment are mechanistically similar, they have very differ- ent properties. Consider the alignment between the genomic sequence of two eukary- otic genes from distantly related organisms. You'd expect the exons to remain the same because their coding sequences are evolutionarily constrained, but the introns may no longer be recognizably similar, especially if they have acquired many insertions or de- letions. The problem is that exons may account for only 1 to 2 percent of the sequence. As a result, a global alignment between these sequences is an alignment of mostly ran- dom letters. In such a scenario, it's very likely (especially if introns change size, as they often do) that the exons will not align to one another because their score contribution is very small compared to the rest of the sequence. In contrast, local alignment can pick out conserved exons, but unfortunately, just the maximum scoring one. The short- comings of the standard alignment algorithms have been addressed by numerous var- iants. Variations The algorithms as described and implemented earlier are rarely used. Over the years, important innovations have made the general algorithms more applicable to aligning biological sequences and running efficiently in a computer. Gap Modifications Most alignment algorithms employ affine gaps. The previous description used a simple gap-scoring scheme in which every letter inserted or deleted cost the same amount. This isn't a very good model of biology because large gaps tend to occur fairly often. Variations | 51