![Genetic associations identified in CFW mice using GEMMA (Parker et al,
Nat. Genet., 2016)](cfw.gif)

# GEMMA: Genome-wide Efficient Mixed Model Association

[![Build Status](https://travis-ci.org/genetics-statistics/GEMMA.svg?branch=master)](https://travis-ci.org/genetics-statistics/GEMMA) [![Anaconda-Server Badge](https://anaconda.org/bioconda/gemma/badges/installer/conda.svg)](https://anaconda.org/bioconda/gemma) [![DL](https://anaconda.org/bioconda/gemma/badges/downloads.svg)](https://anaconda.org/bioconda/gemma) [![BrewBadge](https://img.shields.io/badge/%F0%9F%8D%BAbrew-gemma--0.98-brightgreen.svg)](https://github.com/brewsci/homebrew-bio) [![GuixBadge](https://img.shields.io/badge/gnuguix-gemma-brightgreen.svg)](https://www.gnu.org/software/guix/packages/G/) [![DebianBadge](https://badges.debian.net/badges/debian/testing/gemma/version.svg)](https://packages.debian.org/testing/gemma)

GEMMA is a software toolkit for fast application of linear mixed
models (LMMs) and related models to genome-wide association studies
(GWAS) and other large-scale data sets.

Check out [RELEASE-NOTES.md](RELEASE-NOTES.md) to see what's new in each GEMMA release.

Please post feature requests or suspected bugs to
[GitHub issues](https://github.com/genetics-statistics/GEMMA/issues). For
questions or other discussion, please post to the
[GEMMA Google Group](https://groups.google.com/group/gemma-discussion). We
also encourage contributions, for example, by forking the repository,
making your changes to the code, and issuing a pull request.

Currently, GEMMA is supported on 64-bit Mac OS X and Linux
platforms. *Windows is not currently supported*, though you can
run GEMMA in a Linux VM or [container](https://docs.docker.com/docker-for-windows/). If you are interested
in helping to make GEMMA available on Windows platforms (e.g., by
providing installation instructions for Windows, or by contributing
Windows binaries) please post a note in the
[GitHub issues](https://github.com/genetics-statistics/GEMMA/issues).

*(The above image depicts physiological and behavioral trait
loci identified in CFW mice using GEMMA, from [Parker et al, Nature
Genetics, 2016](https://doi.org/10.1038/ng.3609).)*

* [Key features](#key-features)
* [Installation](#installation)
  * [Precompiled binaries](#precompiled-binaries)
* [Run GEMMA](#run-gemma)
  * [Debugging and optimization](#debugging-and-optimization)
* [Help](#help)
* [Citing GEMMA](#citing-gemma)
* [License](#license)
* [Optimizing performance](#optimizing-performance)
* [Building from source](#building-from-source)
* [Input data formats](#input-data-formats)
* [Reporting a GEMMA bug or issue](#reporting-a-gemma-bug-or-issue)
  * [Check list:](#check-list)
* [Code of conduct](#code-of-conduct)
* [Credits](#credits)

## Key features

1. Fast association tests implemented using the univariate linear mixed
model (LMM). In GWAS, this can correct for population structure and
sample nonexchangeability. It also provides estimates of the
proportion of variance in phenotypes explained by available genotypes
(PVE), often called "chip heritability" or "SNP heritability".

2. Fast association tests for multiple phenotypes implemented using a
multivariate linear mixed model (mvLMM). In GWAS, this can correct for
population structure and sample nonexchangeability jointly in multiple
complex phenotypes.

3. Bayesian sparse linear mixed model (BSLMM) for estimating PVE,
phenotype prediction, and multi-marker modeling in GWAS.

4. Estimation of variance components ("chip heritability") partitioned
by different SNP functional categories from raw (individual-level)
data or summary data. For raw data, HE regression or the REML AI
algorithm can be used to estimate variance components. For summary
data, GEMMA uses the MQS algorithm to estimate variance components.

## Installation

To install GEMMA you can:

1. Download the precompiled binaries (64-bit Linux and Mac only).

2. Use existing package managers, see [INSTALL.md](INSTALL.md).

3. Compile GEMMA from source, see [INSTALL.md](INSTALL.md).

Compiling from source takes more work, but can potentially boost
performance of GEMMA when using specialized C++ compilers and
numerical libraries.

### Precompiled binaries

1. Fetch the [latest stable release][latest_release] and download the
   file appropriate for your platform.

2. For .tar.bz2 files, unpack the tarball

        tar xvjf gemma-$version-installer.tar.bz2

    run the installer

        ./install.sh ~/gemma

    and run gemma

        ~/gemma/bin/gemma

3. For .gz files, run `gunzip` on the downloaded file (for example,
   `gunzip gemma.linux.gz`) to unpack it.

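The unpacking steps above can be combined in a small helper. This is a minimal sketch and not something shipped with GEMMA: the function name `unpack_gemma` is hypothetical, and the final `chmod +x` is an assumption (a gunzipped binary usually needs its execute bit set before it can be run).

```sh
# Hypothetical helper (not part of GEMMA): unpack a downloaded release
# file according to its extension.
unpack_gemma() {
    file=$1
    case "$file" in
        *.tar.bz2)
            # Installer tarball: unpack it, then run ./install.sh as described above.
            tar xjf "$file"
            ;;
        *.gz)
            # Single gzipped binary: decompress it and mark it executable (assumption).
            gunzip "$file" && chmod +x "${file%.gz}"
            ;;
        *)
            echo "unpack_gemma: unrecognized file type: $file" >&2
            return 1
            ;;
    esac
}

# Example usage:
#   unpack_gemma gemma.linux.gz && ./gemma.linux -h
```

Dispatching on the file extension keeps the two download types in one place; anything else is rejected rather than guessed at.
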
## Run GEMMA

GEMMA is run from the command line. To see the built-in help, run

```sh
gemma -h
```

A typical example would be

```sh
# compute kinship matrix
gemma -g ../example/mouse_hs1940.geno.txt.gz -p ../example/mouse_hs1940.pheno.txt \
    -gk -o mouse_hs1940
# run univariate LMM
gemma -g ../example/mouse_hs1940.geno.txt.gz \
    -p ../example/mouse_hs1940.pheno.txt -n 1 -a ../example/mouse_hs1940.anno.txt \
    -k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm
```

The example files above can be downloaded from
[GitHub](https://github.com/genetics-statistics/GEMMA/tree/master/example).

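The two steps of the example can be chained in a small script so that the association test only starts once the kinship matrix exists. The sketch below is an illustration, not part of GEMMA: the function name `run_lmm_pipeline` and the `GEMMA`, `DATA` and `PREFIX` variables are assumptions, while the options and file names are those of the example above.

```sh
# Sketch: chain the kinship and LMM steps from the example above.
# GEMMA, DATA and PREFIX are assumed variables; adjust them to your setup.
GEMMA=${GEMMA:-gemma}
DATA=${DATA:-../example}
PREFIX=mouse_hs1940

run_lmm_pipeline() {
    # Step 1: compute the kinship matrix.
    "$GEMMA" -g "$DATA/$PREFIX.geno.txt.gz" -p "$DATA/$PREFIX.pheno.txt" \
        -gk -o "$PREFIX" || return 1

    # The example above reads the matrix from ./output/<prefix>.cXX.txt.
    kinship=./output/$PREFIX.cXX.txt
    if [ ! -s "$kinship" ]; then
        echo "kinship matrix not found: $kinship" >&2
        return 1
    fi

    # Step 2: run the univariate LMM using that matrix.
    "$GEMMA" -g "$DATA/$PREFIX.geno.txt.gz" -p "$DATA/$PREFIX.pheno.txt" \
        -n 1 -a "$DATA/$PREFIX.anno.txt" -k "$kinship" \
        -lmm -o "${PREFIX}_CD8_lmm"
}

# Example usage:
#   run_lmm_pipeline
```

Checking for the kinship file between the steps means a failed first step stops the script instead of surfacing as a less obvious error in the second.
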
### Debugging and optimization

GEMMA has a wide range of debugging options, which can be viewed with

```
gemma -h 14

 DEBUG OPTIONS
 -check                   enable checks (slower)
 -no-fpe-check            disable hardware floating point checking
 -strict                  strict mode will stop when there is a problem
 -silence                 silent terminal display
 -debug                   debug output
 -debug-data              debug data output
 -legacy                  run gemma in legacy mode
```

Typically, when running GEMMA you should use `-debug`, which includes relevant
checks.

For performance, you may want to use the `-no-check` option
instead. Also check the build optimization notes in
[INSTALL.md](INSTALL.md).

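One way to switch between the two settings described above is to pick the flag from a run mode. This is only a sketch: the function name `gemma_mode_flag` and the mode names `debug` and `fast` are hypothetical, and only the `-debug` and `-no-check` flags come from the text above.

```sh
# Hypothetical helper (not part of GEMMA): map a run mode to the flag
# recommended above. Defaults to the debug mode.
gemma_mode_flag() {
    case "${1:-debug}" in
        debug) printf '%s\n' "-debug" ;;     # debug output with relevant checks
        fast)  printf '%s\n' "-no-check" ;;  # skip checks for performance
        *)
            echo "gemma_mode_flag: unknown mode: $1" >&2
            return 1
            ;;
    esac
}

# Example usage (remaining options as in the LMM example in "Run GEMMA"):
#   gemma "$(gemma_mode_flag fast)" -g ../example/mouse_hs1940.geno.txt.gz \
#       -p ../example/mouse_hs1940.pheno.txt -gk -o mouse_hs1940
```
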
## Help

+ [The GEMMA manual](doc/manual.pdf).

+ [Detailed example with HS mouse data](example/demo.txt).

+ [Tutorial on GEMMA for genome-wide association
analysis](https://github.com/rcc-uchicago/genetic-data-analysis-2).

## Citing GEMMA

If you use GEMMA for published work, please cite our paper:

+ Xiang Zhou and Matthew Stephens (2012). [Genome-wide efficient
mixed-model analysis for association studies.](http://doi.org/10.1038/ng.2310)
*Nature Genetics* **44**, 821–824.

If you use the multivariate linear mixed model (mvLMM) in your
research, please cite:

+ Xiang Zhou and Matthew Stephens (2014). [Efficient multivariate linear
mixed model algorithms for genome-wide association
studies.](http://doi.org/10.1038/nmeth.2848)
*Nature Methods* **11**, 407–409.

If you use the Bayesian sparse linear mixed model (BSLMM), please cite:

+ Xiang Zhou, Peter Carbonetto and Matthew Stephens (2013). [Polygenic
modeling with bayesian sparse linear mixed
models.](http://doi.org/10.1371/journal.pgen.1003264) *PLoS Genetics*
**9**, e1003264.

And if you use the variance component estimation using summary
statistics, please cite:

+ Xiang Zhou (2016). [A unified framework for variance component
estimation with summary statistics in genome-wide association
studies.](https://doi.org/10.1101/042846) *Annals of Applied Statistics*, in press.

## License

Copyright (C) 2012–2018, Xiang Zhou and team.

The *GEMMA* source code repository is free software: you can
redistribute it under the terms of the
[GNU General Public License](http://www.gnu.org/licenses/gpl.html). All
the files in this project are part of *GEMMA*. This project is
distributed in the hope that it will be useful, but **without any
warranty**; without even the implied warranty of **merchantability or
fitness for a particular purpose**. See file [LICENSE](LICENSE) for
the full text of the license.

The source code for both the
[gzstream zlib wrapper](http://www.cs.unc.edu/Research/compgeom/gzstream/)
and the [shUnit2](https://github.com/genenetwork/shunit2) unit testing
framework included in GEMMA is distributed under the
[GNU Lesser General Public License](contrib/shunit2-2.0.3/doc/LGPL-2.1),
either version 2.1 of the License, or (at your option) any later
revision.

The source code for the included [Catch](http://catch-lib.net) unit
testing framework is distributed under the
[Boost Software Licence version 1](https://github.com/philsquared/Catch/blob/master/LICENSE.txt).

## Optimizing performance

Precompiled binaries and libraries may not be optimal for your particular
hardware. See [INSTALL.md](INSTALL.md) for tips on speeding up GEMMA.

## Building from source

More information on source code, dependencies and installation can be
found in [INSTALL.md](INSTALL.md).

## Input data formats

Currently, GEMMA accepts two input formats:

1. BIMBAM format (preferred)
2. PLINK format

See this [example](./doc/example/data-munging.org) where we convert some
spreadsheets for use in GEMMA.

## Reporting a GEMMA bug or issue

For bugs, GEMMA has an
[issue tracker](https://github.com/genetics-statistics/GEMMA/issues)
on GitHub. For general support, GEMMA has a mailing list at
[gemma-discussion](https://groups.google.com/forum/#!forum/gemma-discussion).

Before posting an issue, search the issue tracker and mailing list
first. It is likely that someone has already encountered something
similar. Also try running the latest version of GEMMA to make sure the
problem has not been fixed already. Support and installation questions
should be sent to the mailing list, which is the best resource for answers.

The issue tracker is specifically meant for development issues around
the software itself. When reporting an issue, include the output of the
program and the contents of the .log.txt file in the output directory.

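Gathering the two pieces of information requested above can be scripted. The helper below is a hypothetical sketch, not part of GEMMA: the function name `collect_gemma_report`, the report file name `gemma-report.txt`, and the `./output` default (taken from the kinship path in the run example) are assumptions.

```sh
# Hypothetical helper (not part of GEMMA): gather what a bug report needs.
#   $1: file holding the captured terminal output of the GEMMA run
#   $2: GEMMA output directory (assumed default: ./output)
collect_gemma_report() {
    run_output=$1
    outdir=${2:-./output}
    report=gemma-report.txt

    if [ ! -r "$run_output" ]; then
        echo "collect_gemma_report: cannot read $run_output" >&2
        return 1
    fi

    {
        echo "== GEMMA terminal output =="
        cat "$run_output"
        for log in "$outdir"/*.log.txt; do
            [ -e "$log" ] || continue   # the glob matched nothing
            echo "== $log =="
            cat "$log"
        done
    } > "$report"

    # Print the name of the report file so callers can attach it.
    echo "$report"
}

# Example usage (capture the run output first, then bundle it with the logs):
#   gemma -g ../example/mouse_hs1940.geno.txt.gz \
#       -p ../example/mouse_hs1940.pheno.txt -gk -o mouse_hs1940 2>&1 | tee run-output.txt
#   collect_gemma_report run-output.txt ./output
```
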
### Check list:

1. [X] I have found an issue with GEMMA
2. [ ] I have searched for it on the [issue tracker](https://github.com/genetics-statistics/GEMMA/issues?q=is%3Aissue) (incl. closed issues)
3. [ ] I have searched for it on the [mailing list](https://groups.google.com/forum/#!forum/gemma-discussion)
4. [ ] I have tried the latest [release](https://github.com/genetics-statistics/GEMMA/releases) of GEMMA
5. [ ] I have read and agreed to the code of conduct below
6. [ ] If it is a support/install question I have posted it to the [mailing list](https://groups.google.com/forum/#!forum/gemma-discussion)
7. [ ] If it is software development related I have posted a new issue on the [issue tracker](https://github.com/genetics-statistics/GEMMA/issues) or added to an existing one
8. [ ] In the message I have included the output of my GEMMA run
9. [ ] In the message I have included the relevant .log.txt file in the output directory
10. [ ] I have made available the data to reproduce the problem (optional)

To find bugs, the GEMMA software developers may ask you to install a
development version of the software. They may also ask you for your
data and will treat it confidentially. Please always remember that
GEMMA is written and maintained by volunteers with good
intentions. Our time is valuable too. By helping us as much as
possible, you help us provide this tool for everyone to use.

## Code of conduct

By using GEMMA and communicating with its community you implicitly
agree to abide by the
[code of conduct](https://software-carpentry.org/conduct/) as
published by the Software Carpentry initiative.

## Credits

The *GEMMA* software was developed by:

[Xiang Zhou](http://www.xzlab.org)<br>
Dept. of Biostatistics<br>
University of Michigan<br>

Peter Carbonetto, Tim Flutre, Matthew Stephens,
[Pjotr Prins](http://thebird.nl/) and
[others](https://github.com/genetics-statistics/GEMMA/graphs/contributors)
have also contributed to the development of this software.

[latest_release]: https://github.com/genetics-statistics/GEMMA/releases "Most recent stable releases"