.. _annotate:

###############
*annotate*
###############

``bedtools annotate``, well, annotates one BED/VCF/GFF file with the coverage
and number of overlaps observed from multiple other BED/VCF/GFF files.
In this way, it allows one to ask to what degree one feature coincides with
multiple other feature types with a single command.

==========================================================================
Usage and option summary
==========================================================================
**Usage**:
::

  bedtools annotate [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn

**(or)**:
::

  annotateBed [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn


===========================      ========================================================================================================================================================================================================
Option                           Description
===========================      ========================================================================================================================================================================================================
**-names**                       A list of names (one per file) to describe each file in -files. These names will be printed as a header line.
**-counts**                      Report the count of features in each file that overlap -i. Default behavior is to report the fraction of -i covered by each file.
**-both**                        Report the count of features followed by the fraction covered for each annotation file. Default is to report solely the fraction of -i covered by each file.
**-s**                           Require same strandedness. That is, only count features in -files that overlap -i on the *same* strand. By default, overlaps are counted without respect to strand.
**-S**                           Require different strandedness. That is, only count features in -files that overlap -i on the *opposite* strand. By default, overlaps are counted without respect to strand.
===========================      ========================================================================================================================================================================================================

==========================================================================
Default behavior - annotate one file with coverage from others.
==========================================================================
By default, each feature in the file to be annotated is reported in full,
followed by one column per annotation file giving the fraction of that feature
covered by the file. Bases covered by more than one feature in the same
annotation file are counted once: below, geneA and geneB together cover 50 of
the 100 bases of ``nasty``, so the first fraction is 0.500000.

.. code-block:: bash

  $ cat variants.bed
  chr1 100  200   nasty 1  -
  chr2 500  1000  ugly  2  +
  chr3 1000 5000  big   3  -

  $ cat genes.bed
  chr1 150  200   geneA 1  +
  chr1 175  250   geneB 2  +
  chr3 0    10000 geneC 3  -

  $ cat conserve.bed
  chr1 0    10000 cons1 1  +
  chr2 700  10000 cons2 2  -
  chr3 4000 10000 cons3 3  +

  $ cat known_var.bed
  chr1 0    120   known1   -
  chr1 150  160   known2   -
  chr2 0    10000 known3   +

  $ bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	0.500000	1.000000	0.300000
  chr2	500	1000	ugly	2	+	0.000000	0.600000	1.000000
  chr3	1000	5000	big	3	-	1.000000	0.250000	0.000000
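
The arithmetic behind these fractions can be reproduced with a short Python
sketch. This is an illustration only, not how ``bedtools`` is implemented: the
function name ``fraction_covered`` and the tuple layout are assumptions made
for this example. It clips each overlapping annotation interval to the feature,
merges the clipped pieces, and divides the covered length by the feature length.

.. code-block:: python

  def fraction_covered(feature, annotations):
      """Fraction of `feature` covered by the union of `annotations`.

      Intervals are (chrom, start, end) tuples in BED coordinates
      (0-based start, end-exclusive). Hypothetical helper for illustration.
      """
      chrom, start, end = feature
      # Keep only overlapping annotations, clipped to the feature boundaries.
      clipped = sorted(
          (max(start, a_start), min(end, a_end))
          for a_chrom, a_start, a_end in annotations
          if a_chrom == chrom and a_start < end and a_end > start
      )
      covered = 0
      cursor = start  # right edge of the bases counted so far
      for c_start, c_end in clipped:
          if c_end > cursor:
              # Count only the bases not already covered by an earlier interval.
              covered += c_end - max(c_start, cursor)
              cursor = c_end
      length = end - start
      return covered / length if length else 0.0


  # Coordinates copied from the example files above.
  variants = [("chr1", 100, 200), ("chr2", 500, 1000), ("chr3", 1000, 5000)]
  genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
  conserve = [("chr1", 0, 10000), ("chr2", 700, 10000), ("chr3", 4000, 10000)]
  known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]

  for variant in variants:
      fractions = [fraction_covered(variant, f) for f in (genes, conserve, known_var)]
      print("\t".join(f"{x:.6f}" for x in fractions))

Run as written, the loop prints the same three fraction columns as the
``bedtools annotate`` output above.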


==========================================================================
``-counts`` Report the count of hits from the annotation files
==========================================================================

.. code-block:: bash

  $ bedtools annotate -counts -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	2	1	2
  chr2	500	1000	ugly	2	+	0	1	1
  chr3	1000	5000	big	3	-	1	1	0
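
The counts above can be checked by hand with a minimal Python sketch. The
helper name ``count_overlaps`` is hypothetical, and the overlap test assumes
BED-style half-open intervals; it is a sketch of the idea, not the tool's code.
Unlike the fraction, the count treats every overlapping annotation feature
separately, even when those features overlap each other.

.. code-block:: python

  def count_overlaps(feature, annotations):
      """Number of `annotations` sharing at least one base with `feature`.

      Intervals are (chrom, start, end) tuples in BED coordinates
      (0-based start, end-exclusive). Hypothetical helper for illustration.
      """
      chrom, start, end = feature
      return sum(
          1
          for a_chrom, a_start, a_end in annotations
          if a_chrom == chrom and a_start < end and a_end > start
      )


  # Coordinates copied from the example files above.
  variants = [("chr1", 100, 200), ("chr2", 500, 1000), ("chr3", 1000, 5000)]
  genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
  conserve = [("chr1", 0, 10000), ("chr2", 700, 10000), ("chr3", 4000, 10000)]
  known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]

  for variant in variants:
      counts = [count_overlaps(variant, f) for f in (genes, conserve, known_var)]
      print("\t".join(str(c) for c in counts))

For ``nasty``, both geneA and geneB are counted, which is why the first column
reads 2 even though the covered fraction is only 0.5.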



===========================================================================================
``-both`` Report both the count of hits and the fraction covered from the annotation files
===========================================================================================

.. code-block:: bash

  $ bedtools annotate -both -i variants.bed -files genes.bed conserve.bed known_var.bed
  #chr	start	end	name	score	+/-	cnt1	pct1	cnt2	pct2	cnt3	pct3
  chr1	100	200	nasty	1	-	2	0.500000	1	1.000000	2	0.300000
  chr2	500	1000	ugly	2	+	0	0.000000	1	0.600000	1	1.000000
  chr3	1000	5000	big	3	-	1	1.000000	1	0.250000	0	0.000000




==========================================================================
``-s`` Restrict the reporting to overlaps on the **same** strand.
==========================================================================

.. code-block:: bash

  $ bedtools annotate -s -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	0.000000	0.000000	0.000000
  chr2	500	1000	ugly	2	+	0.000000	0.000000	0.000000
  chr3	1000	5000	big	3	-	1.000000	0.000000	0.000000



==========================================================================
``-S`` Restrict the reporting to overlaps on the **opposite** strand.
==========================================================================

.. code-block:: bash

  $ bedtools annotate -S -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1	100	200	nasty	1	-	0.500000	1.000000	0.300000
  chr2	500	1000	ugly	2	+	0.000000	0.600000	1.000000
  chr3	1000	5000	big	3	-	0.000000	0.250000	0.000000
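
The effect of ``-s`` and ``-S`` on the first two annotation columns can be
sketched in Python as a strand filter applied before the coverage is computed.
The function name ``covered_fraction`` and its ``mode`` argument are
assumptions made for this illustration, not part of ``bedtools``. Only
``genes.bed`` and ``conserve.bed`` are modeled: ``known_var.bed`` as listed has
five columns, so its strand is not in the sixth column, and this sketch does not
attempt to reproduce how such records are handled.

.. code-block:: python

  def covered_fraction(feature, annotations, mode="any"):
      """Fraction of `feature` covered by annotations passing the strand filter.

      Records are (chrom, start, end, strand) tuples in BED coordinates.
      `mode` is "any" (default), "same" (like -s) or "opposite" (like -S).
      Hypothetical helper for illustration.
      """
      chrom, start, end, strand = feature
      clipped = []
      for a_chrom, a_start, a_end, a_strand in annotations:
          if a_chrom != chrom or a_start >= end or a_end <= start:
              continue  # no overlap with the feature
          if mode == "same" and a_strand != strand:
              continue  # -s keeps same-strand overlaps only
          if mode == "opposite" and a_strand == strand:
              continue  # -S keeps opposite-strand overlaps only
          clipped.append((max(start, a_start), min(end, a_end)))
      covered, cursor = 0, start
      for c_start, c_end in sorted(clipped):
          if c_end > cursor:
              # Count only bases not already covered by an earlier interval.
              covered += c_end - max(c_start, cursor)
              cursor = c_end
      return covered / (end - start)


  # Coordinates and strands copied from the example files above.
  variants = [("chr1", 100, 200, "-"), ("chr2", 500, 1000, "+"), ("chr3", 1000, 5000, "-")]
  genes = [("chr1", 150, 200, "+"), ("chr1", 175, 250, "+"), ("chr3", 0, 10000, "-")]
  conserve = [("chr1", 0, 10000, "+"), ("chr2", 700, 10000, "-"), ("chr3", 4000, 10000, "+")]

  for mode in ("same", "opposite"):
      print(mode)
      for variant in variants:
          fractions = [covered_fraction(variant, f, mode) for f in (genes, conserve)]
          print("\t".join(f"{x:.6f}" for x in fractions))

Only ``big`` and geneC share a strand, so under ``same`` that is the single
non-zero value; under ``opposite`` that value drops to zero while the others
match the default output.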