The \eslmod{msaweight} module implements several \emph{ad hoc}
sequence weighting algorithms that compensate for overrepresentation
in multiple sequence alignments.

A multiple alignment often includes similar or even identical copies
of a sequence: the same sequence is often deposited in the databases
more than once, and sequences from several closely related species
are usually available. Relying on raw residue frequencies observed in
multiple alignments is therefore a flawed strategy, just as
Wittgenstein wouldn't trust a consensus of two copies of his morning
paper.

The functions in the \eslmod{msaweight} API are summarized in
Table~\ref{tbl:msaweight_api}.

% TODO: Should implement more algorithms.

\begin{table}[hbp]
\begin{center}
{\small
\begin{tabular}{|ll|}\hline
\hyperlink{func:esl_msaweight_GSC()}{\ccode{esl\_msaweight\_GSC()}} & GSC weights.\\
\hyperlink{func:esl_msaweight_PB()}{\ccode{esl\_msaweight\_PB()}} & PB (position-based) weights.\\
\hyperlink{func:esl_msaweight_BLOSUM()}{\ccode{esl\_msaweight\_BLOSUM()}} & BLOSUM weights.\\
\hline
\end{tabular}
}
\end{center}
\caption{Functions in the \eslmod{msaweight} API. Requires the Easel core
and phylogeny modules.}
\label{tbl:msaweight_api}
\end{table}
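
As a concrete illustration of how weighting counters
overrepresentation, here is a minimal worked sketch of a
position-based scheme in the style of Henikoff and Henikoff (1994).
It is an assumption that this is representative of what
\ccode{esl\_msaweight\_PB()} computes; the symbols $w_i$, $k_c$, and
$n_c(a)$ are introduced here for illustration only, and gap handling
and final normalization are deliberately left out. For an ungapped
alignment of $N$ sequences and $L$ columns, let $x_{ic}$ be the
residue of sequence $i$ in column $c$, let $k_c$ be the number of
distinct residues in column $c$, and let $n_c(a)$ be the number of
sequences that have residue $a$ in column $c$. Each column
distributes one unit of weight equally among its residue types, and
then equally among the sequences that share each type:

\[
  w_i = \sum_{c=1}^{L} \frac{1}{k_c \, n_c(x_{ic})}.
\]

Consider three sequences of length two, the first two identical:

\begin{center}
\begin{tabular}{ll}
seq1 & \texttt{CD} \\
seq2 & \texttt{CD} \\
seq3 & \texttt{GE} \\
\end{tabular}
\end{center}

Both columns have $k_c = 2$ residue types, so

\[
  w_1 = w_2 = \frac{1}{2 \cdot 2} + \frac{1}{2 \cdot 2} = \frac{1}{2},
  \qquad
  w_3 = \frac{1}{2 \cdot 1} + \frac{1}{2 \cdot 1} = 1.
\]

The two identical copies together carry the same total weight as the
single distinct sequence, so the duplicate no longer doubles that
sequence's contribution to the observed residue frequencies. The raw
weights sum to $L = 2$ because each column contributes exactly one
unit of weight.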

\subsection{Example of using the msaweight API}

An example of reading in a multiple alignment and calculating weights
for its sequences using the GSC algorithm:

\input{cexcerpts/msaweight_example}

The new weights are stored internally in the \ccode{ESL\_MSA} object,
and (as the example shows) can be accessed in its array
\ccode{msa->wgt[0..nseq-1]}.

\subsection{Pros and cons of different algorithms}

% TODO: Computational complexity

% TODO: Figures showing time, memory for varying N, L.

% TODO: Eventually, benchmarks on HMMER: are these methods actually
% different?