| Name | Date | Size | #Lines | LOC |
| --- | --- | --- | --- | --- |
| .. | 14-Oct-2021 | - | | |
| ProteinGTRFit.bf | 14-Oct-2021 | 14.4 KiB | 290 | 212 |
| ProteinGTRFit_helper.ibf | 14-Oct-2021 | 24.3 KiB | 450 | 342 |
| README_relative_prot_rates.md | 14-Oct-2021 | 1.9 KiB | 39 | 25 |
| plusF_helper.ibf | 14-Oct-2021 | 1.8 KiB | 57 | 48 |
| relative_prot_rates.bf | 14-Oct-2021 | 15.6 KiB | 331 | 227 |

README_relative_prot_rates.md

## Site-wise relative rate estimator for protein multiple sequence alignments

_Written by Sergei L Kosakovsky Pond [spond@temple.edu] and Stephanie J Spielman_

> 2017-01-25: v0.1. Initial release.


### Motivation

This analysis performs a "non-parametric" estimation of site-level substitution rates in a multiple sequence alignment of protein sequences. This allows one to evaluate levels of substitutional rate heterogeneity and, by extension, conservation. The analysis is based on the ["Rate4Site" method](http://www.tau.ac.il/~itaymay/cp/rate4site.html).

### Analysis workflow

#### Input

1. A protein sequence alignment (**file**)
2. A phylogenetic tree

#### Output

1. **standard output**: a report in Markdown format (see sample.md)
2. **file.json**: a JSON object (see sample.json), which contains the following keys
    * `Relative site rate estimates`: for each site, a record like <pre>"1":{
     "LB":0.9712850593725352,
     "MLE":1.343044028469595,
     "UB":1.821044718637831
    }</pre> is provided. **MLE** is the point estimate of the relative rate at the site; **UB** and **LB** are the upper and lower bounds, respectively, of the profile likelihood confidence interval.
    * `alignment`: file path for the alignment used to infer the rates, e.g. _/Users/sergei/Dropbox/Work/Collaborations/rates4sites/sim178.fasta_
    * `analysis`: an object describing the version of the analysis run

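As a rough illustration of how the JSON output could be consumed, the sketch below parses a small inline document that mirrors the structure described above. The numbers for site `"1"` are copied from the sample record; the shortened `alignment` value, the `sample` string, and the helper name `site_rates` are hypothetical and not part of the analysis itself. The `analysis` key is omitted because its internal layout is not described here.

```python
import json

# Hypothetical excerpt shaped like file.json; only site "1" is shown and the
# alignment path is a placeholder, not a real file.
sample = """
{
  "Relative site rate estimates": {
    "1": {"LB": 0.9712850593725352, "MLE": 1.343044028469595, "UB": 1.821044718637831}
  },
  "alignment": "sim178.fasta"
}
"""

def site_rates(report):
    """Return (site, LB, MLE, UB) tuples sorted by numeric site index."""
    rates = report["Relative site rate estimates"]
    return [
        (int(site), rec["LB"], rec["MLE"], rec["UB"])
        for site, rec in sorted(rates.items(), key=lambda kv: int(kv[0]))
    ]

report = json.loads(sample)
for site, lb, mle, ub in site_rates(report):
    print(f"site {site}: rate {mle:.3f} (interval {lb:.3f} to {ub:.3f})")
```

Site keys are strings in the JSON, so the helper converts them to integers before sorting; otherwise `"10"` would sort before `"2"`.
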
#### Procedure

1. Fit a protein model of sequence evolution to the entire alignment to obtain branch lengths.
2. For each site, with all other model parameters held fixed, estimate a site-level scaler for branch lengths, **r<sub>i</sub>**. That is, for site **i** and each branch **b**, the following relationship holds: **length(b | data @ site i) = r<sub>i</sub> × length(b | entire alignment)**
3. Obtain the MLE for **r<sub>i</sub>**, along with a profile likelihood confidence interval.

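The idea behind steps 2 and 3 can be sketched on a deliberately simplified toy problem. This is not the analysis implementation: instead of a full phylogenetic likelihood on a tree, it assumes an equal-rates 20-state model and treats a handful of sequence pairs as independent, each with a fixed alignment-wide distance. All names (`site_loglik`, `estimate_site_rate`, `pairs`), the search limits, and the 95% level are assumptions made for illustration. The single free parameter is the scaler `r`, which multiplies every fixed distance; the interval is the set of `r` values whose log-likelihood lies within about 1.92 units of the maximum.

```python
import math

N_STATES = 20          # amino acid alphabet size
HALF_CHI2_95 = 1.9207  # 0.5 * chi-square (1 df) 95% quantile; the level is an assumption

def site_loglik(r, pairs):
    """Toy log-likelihood of one site for scaler r.

    pairs: (distance, identical) tuples, where distance is the fixed
    alignment-wide distance between two sequences and identical says
    whether their residues match at this site. Pairs are treated as
    independent, which a real tree-based likelihood would not do.
    """
    total = 0.0
    for dist, identical in pairs:
        decay = math.exp(-N_STATES / (N_STATES - 1.0) * r * dist)
        p_same = 1.0 / N_STATES + (1.0 - 1.0 / N_STATES) * decay
        p = p_same if identical else (1.0 - p_same) / (N_STATES - 1)
        total += math.log(p)
    return total

def golden_max(f, lo, hi, iters=200):
    """Golden-section search; assumes f has a single maximum on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0

def crossing(f, target, inside, outside, iters=200):
    """Bisect between inside (f >= target) and outside for f == target."""
    if f(outside) >= target:
        return outside  # no crossing within the search limits
    for _ in range(iters):
        mid = (inside + outside) / 2.0
        if f(mid) >= target:
            inside = mid
        else:
            outside = mid
    return (inside + outside) / 2.0

def estimate_site_rate(pairs, lo=1e-4, hi=50.0):
    """Return (MLE, LB, UB) for the site scaler r within [lo, hi]."""
    f = lambda r: site_loglik(r, pairs)
    mle = golden_max(f, lo, hi)
    target = f(mle) - HALF_CHI2_95
    return mle, crossing(f, target, mle, lo), crossing(f, target, mle, hi)

# Hypothetical site: three matching and three mismatching pairs.
pairs = [(0.1, True), (0.2, False), (0.3, True),
         (0.5, False), (0.4, True), (0.25, False)]
mle, lb, ub = estimate_site_rate(pairs)
print(f"MLE={mle:.3f} LB={lb:.3f} UB={ub:.3f}")
```

Because only `r` varies while everything else stays fixed, the optimization is one-dimensional, which is why a simple bracketing search is enough in this sketch.
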
#### Features

* MPI enabled