The \eslmod{msaweight} module implements several \emph{ad hoc}
sequence weighting algorithms that compensate for overrepresentation
of similar sequences in multiple sequence alignments.

A multiple alignment often includes similar, or even identical, copies
of sequences: the same sequence is often deposited in the databases
more than once, and sequences from several closely related species are
usually available. Thus relying on raw residue frequencies observed in
multiple alignments is a flawed strategy, just as Wittgenstein
wouldn't trust a consensus of two copies of his morning paper.

The functions in the \eslmod{msaweight} API are summarized in
Table~\ref{tbl:msaweight_api}.

% TODO: Should implement more algorithms.

\begin{table}[hbp]
\begin{center}
{\small
\begin{tabular}{|ll|}\hline
\hyperlink{func:esl_msaweight_GSC()}{\ccode{esl\_msaweight\_GSC()}} & GSC weights.\\
\hyperlink{func:esl_msaweight_PB()}{\ccode{esl\_msaweight\_PB()}} & PB (position-based) weights.\\
\hyperlink{func:esl_msaweight_BLOSUM()}{\ccode{esl\_msaweight\_BLOSUM()}} & BLOSUM weights.\\
\hline
\end{tabular}
}
\end{center}
\caption{Functions in the \eslmod{msaweight} API. Requires the Easel core
and phylogeny modules.}
\label{tbl:msaweight_api}
\end{table}

\subsection{Example of using the msaweight API}

An example of reading in a multiple alignment and calculating weights
for its sequences using the GSC algorithm:

\input{cexcerpts/msaweight_example}

The new weights are stored internally in the \ccode{ESL\_MSA} object
and, as the example shows, can be accessed in its array
\ccode{msa->wgt[0..nseq-1]}.

\subsection{Pros and cons of different algorithms}

% TODO: Computational complexity

% TODO: Figures showing time, memory for varying N, L.

% TODO: Eventually, benchmarks on HMMER: are these methods actually
% different?