.. _annotate:

###############
*annotate*
###############
``bedtools annotate`` annotates one BED/VCF/GFF file with the coverage and
number of overlaps observed from multiple other BED/VCF/GFF files.
This makes it possible to ask, with a single command, to what degree each
feature coincides with several other feature types.

==========================================================================
Usage and option summary
==========================================================================
**Usage**:
::

  bedtools annotate [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn

**(or)**:
::

  annotateBed [OPTIONS] -i <BED/GFF/VCF> -files FILE1 FILE2 FILE3 ... FILEn


=============  ==========================================================================================================================================================================================================
Option         Description
=============  ==========================================================================================================================================================================================================
**-names**     A list of names (one per file) describing each file given to ``-files``. These names are printed as a header line.
**-counts**    Report the count of features in each annotation file that overlap each ``-i`` feature. By default, the fraction of each ``-i`` feature covered by each file is reported.
**-both**      Report the count of features followed by the fraction covered, for each annotation file. By default, only the fraction of each ``-i`` feature covered by each file is reported.
**-s**         Require same strandedness. That is, only consider features in the annotation files that overlap an ``-i`` feature on the same strand. By default, overlaps are considered without respect to strand.
**-S**         Require different strandedness. That is, only consider features in the annotation files that overlap an ``-i`` feature on the *opposite* strand. By default, overlaps are considered without respect to strand.
=============  ==========================================================================================================================================================================================================

==========================================================================
Default behavior - annotate one file with coverage from others.
==========================================================================
By default, the fraction of each feature covered by each annotation file is
reported after the complete feature in the file to be annotated. One column
is appended per annotation file, in the order the files are listed after
``-files``.

.. code-block:: bash

  $ cat variants.bed
  chr1  100   200   nasty  1  -
  chr2  500   1000  ugly   2  +
  chr3  1000  5000  big    3  -

  $ cat genes.bed
  chr1  150  200    geneA  1  +
  chr1  175  250    geneB  2  +
  chr3  0    10000  geneC  3  -

  $ cat conserve.bed
  chr1  0     10000  cons1  1  +
  chr2  700   10000  cons2  2  -
  chr3  4000  10000  cons3  3  +

  $ cat known_var.bed
  chr1  0    120    known1  -
  chr1  150  160    known2  -
  chr2  0    10000  known3  +

  $ bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1  100   200   nasty  1  -  0.500000  1.000000  0.300000
  chr2  500   1000  ugly   2  +  0.000000  0.600000  1.000000
  chr3  1000  5000  big    3  -  1.000000  0.250000  0.000000


==========================================================================
``-counts`` Report the count of hits from the annotation files
==========================================================================

.. code-block:: bash

  $ bedtools annotate -counts -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1  100   200   nasty  1  -  2  1  2
  chr2  500   1000  ugly   2  +  0  1  1
  chr3  1000  5000  big    3  -  1  1  0



===========================================================================================
``-both`` Report both the count of hits and the fraction covered from the annotation files
===========================================================================================

.. code-block:: bash

  $ bedtools annotate -both -i variants.bed -files genes.bed conserve.bed known_var.bed
  #chr  start  end   name   score  +/-  cnt1  pct1      cnt2  pct2      cnt3  pct3
  chr1  100    200   nasty  1      -    2     0.500000  1     1.000000  2     0.300000
  chr2  500    1000  ugly   2      +    0     0.000000  1     0.600000  1     1.000000
  chr3  1000   5000  big    3      -    1     1.000000  1     0.250000  0     0.000000



==========================================================================
``-s`` Restrict the reporting to overlaps on the **same** strand.
==========================================================================

.. code-block:: bash

  $ bedtools annotate -s -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1  100   200   nasty  1  -  0.000000  0.000000  0.000000
  chr2  500   1000  ugly   2  +  0.000000  0.000000  0.000000
  chr3  1000  5000  big    3  -  1.000000  0.000000  0.000000



==========================================================================
``-S`` Restrict the reporting to overlaps on the **opposite** strand.
==========================================================================

.. code-block:: bash

  $ bedtools annotate -S -i variants.bed -files genes.bed conserve.bed known_var.bed
  chr1  100   200   nasty  1  -  0.500000  1.000000  0.300000
  chr2  500   1000  ugly   2  +  0.000000  0.600000  1.000000
  chr3  1000  5000  big    3  -  0.000000  0.250000  0.000000
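

==========================================================================
Sketch: how the fraction and count columns can be derived
==========================================================================
To make the arithmetic behind the default and ``-counts`` columns concrete,
the following is a minimal Python sketch that reproduces the numbers shown
for the ``nasty`` and ``ugly`` features in the examples above. It is an
illustration only, not how bedtools is implemented: the function name
``annotate`` and the tuple layout are hypothetical, strand is ignored, and
features are assumed to have ``end > start``. The key point is that
overlapping annotation intervals are merged before measuring coverage, so
bases hit by two annotations are counted once.

.. code-block:: python

  def annotate(feature, annotations):
      """Return (count, fraction) for one feature against one annotation set.

      Intervals are (chrom, start, end) tuples in 0-based, half-open BED
      coordinates. Hypothetical helper, for illustration only.
      """
      chrom, start, end = feature
      # Keep only overlapping annotations, clipped to the feature's bounds.
      clipped = sorted(
          (max(s, start), min(e, end))
          for c, s, e in annotations
          if c == chrom and s < end and e > start
      )
      covered = 0
      reach = start  # rightmost position already counted as covered
      for s, e in clipped:
          if e > reach:
              # Count only the part of this interval not yet covered.
              covered += e - max(s, reach)
              reach = e
      return len(clipped), covered / (end - start)


  # Data copied from the example files above (strand and score omitted).
  genes = [("chr1", 150, 200), ("chr1", 175, 250), ("chr3", 0, 10000)]
  known_var = [("chr1", 0, 120), ("chr1", 150, 160), ("chr2", 0, 10000)]
  conserve = [("chr1", 0, 10000), ("chr2", 700, 10000), ("chr3", 4000, 10000)]

  nasty = ("chr1", 100, 200)
  ugly = ("chr2", 500, 1000)

  print(annotate(nasty, genes))      # (2, 0.5)
  print(annotate(nasty, known_var))  # (2, 0.3)
  print(annotate(ugly, conserve))    # (1, 0.6)

For ``nasty``, ``geneA`` and ``geneB`` both overlap the feature (count 2),
but together they cover only bases 150 to 200, which is 50 of 100 bases and
hence the 0.500000 reported above.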