###############
Example usage
###############

Below are several examples of basic bedtools usage. Example BED files are
provided in the ``/data`` directory of the bedtools distribution.


==========================================================================
bedtools intersect
==========================================================================

Report the base-pair overlap between sequence alignments and genes.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed


Report each alignment that overlaps one or more genes. Alignments with no
overlap are not reported.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed -u


Report those alignments that overlap NO genes. This behaves like ``grep -v``.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed -v


Report the number of genes that each alignment overlaps.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed -c


Report the entire, original alignment entry for each overlap with a gene.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed -wa


Report the entire, original gene entry for each overlap with an alignment.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed -wb


Report the entire, original alignment and gene entries for each overlap.

.. code-block:: bash

  bedtools intersect -a reads.bed -b genes.bed -wa -wb


Only report an overlap with a repeat if it spans at least 50% of the exon.

.. code-block:: bash

  bedtools intersect -a exons.bed -b repeatMasker.bed -f 0.50


Only report an overlap if it comprises at least 50% of the structural variant
and at least 50% of the segmental duplication. That is, the overlap must be
reciprocally at least 50%.

.. code-block:: bash

  bedtools intersect -a SV.bed -b segmentalDups.bed -f 0.50 -r


Read BED A from stdin. For example, find genes that overlap LINEs but not SINEs.

.. code-block:: bash

  bedtools intersect -a genes.bed -b LINES.bed | bedtools intersect -a stdin -b SINEs.bed -v


Retain only single-end BAM alignments that overlap exons.

.. code-block:: bash

  bedtools intersect -abam reads.bam -b exons.bed > reads.touchingExons.bam


Retain only single-end BAM alignments that do not overlap simple sequence
repeats.

.. code-block:: bash

  bedtools intersect -abam reads.bam -b SSRs.bed -v > reads.noSSRs.bam


==========================================================================
bedtools bamtobed
==========================================================================

Convert BAM alignments to BED format.

.. code-block:: bash

  bedtools bamtobed -i reads.bam > reads.bed


Convert BAM alignments to BED format using the BAM edit distance (NM) as the
BED "score".

.. code-block:: bash

  bedtools bamtobed -i reads.bam -ed > reads.bed


Convert BAM alignments to BEDPE format.

.. code-block:: bash

  bedtools bamtobed -i reads.bam -bedpe > reads.bedpe


==========================================================================
bedtools window
==========================================================================

Report all genes that are within 10000 bp upstream or downstream of CNVs.

.. code-block:: bash

  bedtools window -a CNVs.bed -b genes.bed -w 10000


Report all genes that are within 10000 bp upstream or 5000 bp downstream of
CNVs.

.. code-block:: bash

  bedtools window -a CNVs.bed -b genes.bed -l 10000 -r 5000


Report all SNPs that are within 5000 bp upstream or 1000 bp downstream of genes.
Define upstream and downstream based on strand.

.. code-block:: bash

  bedtools window -a genes.bed -b snps.bed -l 5000 -r 1000 -sw


==========================================================================
bedtools closest
==========================================================================

.. note::

  By default, if there is a tie for closest, all ties will be reported.
  ``bedtools closest`` allows overlapping features to be the closest.

Find the closest ALU to each gene.

.. code-block:: bash

  bedtools closest -a genes.bed -b ALUs.bed


Find the closest ALU to each gene, choosing the first ALU in the file if there
is a tie.

.. code-block:: bash

  bedtools closest -a genes.bed -b ALUs.bed -t first


Find the closest ALU to each gene, choosing the last ALU in the file if there
is a tie.

.. code-block:: bash

  bedtools closest -a genes.bed -b ALUs.bed -t last


==========================================================================
bedtools subtract
==========================================================================

.. note::

  If a feature in A is entirely "spanned" by any feature in B, it will not be reported.

Remove introns from gene features. The remaining exons should be reported.

.. code-block:: bash

  bedtools subtract -a genes.bed -b introns.bed


==========================================================================
bedtools merge
==========================================================================

.. note::

  ``merge`` requires that the input is sorted by chromosome and then by start
  coordinate. For example, for BED files, one would first sort the input
  as follows: ``sort -k1,1 -k2,2n input.bed > input.sorted.bed``

Merge overlapping repetitive elements into a single entry.

.. code-block:: bash

  bedtools merge -i repeatMasker.bed


Merge overlapping repetitive elements into a single entry, returning the number
of entries merged.

.. code-block:: bash

  bedtools merge -i repeatMasker.bed -n


Merge nearby (within 1000 bp) repetitive elements into a single entry.

.. code-block:: bash

  bedtools merge -i repeatMasker.bed -d 1000


==========================================================================
bedtools coverage
==========================================================================

Compute the coverage of aligned sequences on 10 kilobase "windows" spanning the
genome.

.. code-block:: bash

  bedtools coverage -a reads.bed -b windows10kb.bed | head
  chr1 0     10000 0  10000 0.00
  chr1 10001 20000 33 10000 0.21
  chr1 20001 30000 42 10000 0.29
  chr1 30001 40000 71 10000 0.36


Compute the coverage of aligned sequences on 10 kilobase "windows" spanning the
genome and create a BEDGRAPH of the number of aligned reads in each window for
display on the UCSC browser.

.. code-block:: bash

  bedtools coverage -a reads.bed -b windows10kb.bed | cut -f 1-4 > windows10kb.cov.bedg


Compute the coverage of aligned sequences on 10 kilobase "windows" spanning the
genome and create a BEDGRAPH of the fraction of each window covered by at least
one aligned read for display on the UCSC browser.

.. code-block:: bash

  bedtools coverage -a reads.bed -b windows10kb.bed | \
  awk '{OFS="\t"; print $1,$2,$3,$6}' \
  > windows10kb.pctcov.bedg


==========================================================================
bedtools complement
==========================================================================

Report all intervals in the human genome that are not covered by repetitive
elements.

.. code-block:: bash

  bedtools complement -i repeatMasker.bed -g hg18.genome


==========================================================================
bedtools shuffle
==========================================================================

Randomly place all discovered variants in the genome. However, prevent them
from being placed in known genome gaps.

.. code-block:: bash

  bedtools shuffle -i variants.bed -g hg18.genome -excl genome_gaps.bed


Randomly place all discovered variants in the genome. However, prevent them
from being placed in known genome gaps and require that the variants be randomly
placed on the same chromosome.

.. code-block:: bash

  bedtools shuffle -i variants.bed -g hg18.genome -excl genome_gaps.bed -chrom
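
The ``complement`` and ``shuffle`` examples above rely on a genome file, which
is a tab-separated list of chromosome names and lengths. The following is a
minimal sketch for trying ``shuffle`` on toy data rather than on hg18. The file
``toy.genome``, all coordinates, and the seed value are hypothetical and exist
only for illustration. It also assumes a bedtools version that provides the
``-seed`` option, which makes the random placement repeatable.

.. code-block:: bash

  # Work in a scratch directory so no real files are overwritten.
  workdir=$(mktemp -d)
  cd "$workdir"

  # Genome file: chromosome name <TAB> chromosome length (toy values).
  printf 'chr1\t1000\nchr2\t500\n' > toy.genome

  # Three made-up variants in BED format (0-based start, end exclusive).
  printf 'chr1\t100\t150\nchr1\t400\t420\nchr2\t50\t80\n' > variants.bed

  # One made-up gap region that shuffled variants must avoid.
  printf 'chr1\t0\t50\n' > genome_gaps.bed

  # Only run bedtools if it is installed; otherwise just report and move on.
  if command -v bedtools >/dev/null 2>&1; then
    bedtools shuffle -i variants.bed -g toy.genome \
      -excl genome_gaps.bed -chrom -seed 42 > shuffled.bed
    # Shuffling moves intervals; it should still emit one line per input line.
    wc -l < shuffled.bed
  else
    echo "bedtools not found; skipping shuffle" >&2
  fi

Because ``-chrom`` is given, each shuffled variant should stay on its original
chromosome. Rerunning with the same seed should reproduce the same placement.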