.. _bamtobed:

###############
*bamtobed*
###############
``bedtools bamtobed`` converts sequence alignments in BAM format into BED,
BED12, and/or BEDPE records.

==========================================================================
Usage and option summary
==========================================================================
**Usage**:
::

  bedtools bamtobed [OPTIONS] -i <BAM>

**(or)**:
::

    bamToBed [OPTIONS] -i <BAM>



.. tabularcolumns:: |p{4.5cm}|p{8.5cm}|

=============   ================================================================
Option          Description
=============   ================================================================
**-bedpe**      Write BAM alignments in BEDPE format. Only one alignment from
                paired-end reads will be reported. Specifically, if each mate
                is aligned to the same chromosome, the BAM alignment reported
                will be the one where the BAM insert size is greater than zero.
                When the mate alignments are interchromosomal, the
                lexicographically lower chromosome will be reported first.
                Lastly, when an end is unmapped, the chromosome and strand will
                be set to "." and the start and end coordinates will be set
                to -1. *By default, this is disabled and the output will be
                reported in BED format*.
**-mate1**      When writing BEDPE (``-bedpe``) format,
                always report mate one as the first BEDPE "block".
**-bed12**      Write "blocked" BED (a.k.a. BED12) format. This will convert
                "spliced" BAM alignments (denoted by the "N" CIGAR operation)
                to BED12. Forces ``-split``.
**-split**      Report each portion of a "split" BAM alignment (i.e., one
                having an "N" CIGAR operation) as a distinct BED interval.
**-splitD**     Report each portion of a "split" BAM alignment while obeying
                both "N" and "D" CIGAR operations. Forces ``-split``.
**-ed**         Use the "edit distance" tag (NM) for the BED score field.
                Default for BED is to use mapping quality. Default for BEDPE is
                to use the *minimum* of the two mapping qualities for the pair.
                When ``-ed`` is used with ``-bedpe``, the total edit distance
                from the two mates is reported.
**-tag**        Use another *numeric* BAM alignment tag for the BED score.
                Default for BED is to use mapping quality. Disallowed with
                BEDPE output.
**-color**      An R,G,B string for the color used with BED12 format. Default
                is (255,0,0).
**-cigar**      Add the CIGAR string to the BED entry as a 7th column.
=============   ================================================================


==========================================================================
Default behavior
==========================================================================
By default, each alignment in the BAM file is converted to a 6-column BED
record. The BED "name" field is taken from the QNAME (read name) field of the
BAM alignment. If mate information is available, the mate (e.g., "/1" or "/2")
is appended to the name.

.. code-block:: bash

  $ bedtools bamtobed -i reads.bam | head -3
  chr7   118970079   118970129   TUPAC_0001:3:1:0:1452#0/1   37   -
  chr7   118965072   118965122   TUPAC_0001:3:1:0:1452#0/2   37   +
  chr11  46769934    46769984    TUPAC_0001:3:1:0:1472#0/1   37   -
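
Because the output is plain tab-delimited text, it can be post-processed with
standard UNIX tools. As a minimal sketch, the hypothetical helper
``filter_by_score`` below (the name and the cutoff of 30 are assumptions chosen
for illustration, not part of bedtools) keeps only BED6 records whose score
column, which holds the mapping quality by default, meets a cutoff. The two
sample records are copied from examples on this page.

.. code-block:: bash

  # Hypothetical helper (not a bedtools command): keep BED6 records whose
  # score (column 5) is greater than or equal to the cutoff given as $1.
  filter_by_score() {
    awk -F'\t' -v min="$1" '$5 + 0 >= min + 0'
  }

  # Two sample BED6 records (tab-delimited); only the first has a score >= 30.
  printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
    chr7 118970079 118970129 'TUPAC_0001:3:1:0:1452#0/1' 37 - \
    chrX 9380413 9380463 'TUPAC_0001:3:1:1:285#0/1' 0 - |
    filter_by_score 30

On real data the same helper could sit at the end of a pipe, for example
``bedtools bamtobed -i reads.bam | filter_by_score 30``.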


==========================================================================
``-tag`` Set the score field based on BAM tags
==========================================================================
One can override the default use of the BAM `MAPQ` as the resulting BED
record's `score` field by using the ``-tag`` option. In the example below, we
use the ``-tag`` option to select the BAM edit distance (the `NM` tag) as the
score column in the resulting BED records.

.. code-block:: bash

  $ bedtools bamtobed -i reads.bam -tag NM | head -3
  chr7   118970079   118970129   TUPAC_0001:3:1:0:1452#0/1   1    -
  chr7   118965072   118965122   TUPAC_0001:3:1:0:1452#0/2   3    +
  chr11  46769934    46769984    TUPAC_0001:3:1:0:1472#0/1   1    -


==========================================================================
``-bedpe`` Report paired-end alignments in BEDPE format
==========================================================================
The ``-bedpe`` option converts BAM alignments to BEDPE format, thus allowing
the two ends of a paired-end alignment to be reported on a single text line.
Specifically, if each mate is aligned to the same chromosome,
the BAM alignment reported will be the one where the BAM insert size is greater
than zero. When the mate alignments are interchromosomal, the lexicographically
lower chromosome will be reported first. Lastly, when an end is unmapped, the
chromosome and strand will be set to "." and the start and end coordinates will
be set to -1.

.. note::

    When using this option, the BAM file must be sorted/grouped
    by read name. This allows bamToBed to extract correct
    alignment coordinates for each end based on their respective
    CIGAR strings. It also assumes that the alignments for a
    given pair come in groups of two. There is not yet a standard
    method for reporting multiple alignments using BAM. bamToBed
    will fail if an aligner does not report alignments in pairs.

.. code-block:: bash

  $ bedtools bamtobed -i reads.bam -bedpe | head -3
  chr7   118965072   118965122   chr7   118970079   118970129 TUPAC_0001:3:1:0:1452#0 37     +     -
  chr11  46765606    46765656    chr11  46769934    46769984 TUPAC_0001:3:1:0:1472#0 37     +     -
  chr20  54704674    54704724    chr20  54708987    54709037 TUPAC_0001:3:1:1:1833#0 37     +     -
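
Having both ends on one line makes simple per-pair summaries straightforward.
The following is only a sketch: ``pair_span`` is a hypothetical helper name
(not part of bedtools) that prints the read-pair name and the outer distance
covered by the two ends, skipping pairs with an unmapped end or with ends on
different chromosomes. The sample record is the first BEDPE line shown above.

.. code-block:: bash

  # Hypothetical helper (not a bedtools command): for each BEDPE record whose
  # two ends map to the same chromosome, print the pair name and the distance
  # from the leftmost start to the rightmost end.
  pair_span() {
    awk '
      # Skip pairs with an unmapped end (chromosome ".") or with ends on
      # different chromosomes.
      $1 == "." || $4 == "." || $1 != $4 { next }
      {
        left  = ($2 < $5) ? $2 : $5   # leftmost start of the two ends
        right = ($3 > $6) ? $3 : $6   # rightmost end of the two ends
        print $7 "\t" (right - left)
      }'
  }

  # One sample BEDPE record taken from the output above.
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    chr7 118965072 118965122 chr7 118970079 118970129 \
    'TUPAC_0001:3:1:0:1452#0' 37 + - |
    pair_span

For the sample record this reports a span of 5057 bases
(118970129 - 118965072). On a real file, the BAM would first need to be
grouped by read name, as described in the note above, for example with
``samtools sort -n``.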


One can easily use samtools and bamToBed together as part of a UNIX pipe. In
this example, we convert only properly paired reads (those with the ``0x2``
FLAG bit set) to BED format.

.. code-block:: bash

  $ samtools view -bf 0x2 reads.bam | bedtools bamtobed -i stdin | head
  chr7   118970079   118970129   TUPAC_0001:3:1:0:1452#0/1   37   -
  chr7   118965072   118965122   TUPAC_0001:3:1:0:1452#0/2   37   +
  chr11  46769934    46769984    TUPAC_0001:3:1:0:1472#0/1   37   -
  chr11  46765606    46765656    TUPAC_0001:3:1:0:1472#0/2   37   +
  chr20  54704674    54704724    TUPAC_0001:3:1:1:1833#0/1   37   +
  chr20  54708987    54709037    TUPAC_0001:3:1:1:1833#0/2   37   -
  chrX   9380413     9380463     TUPAC_0001:3:1:1:285#0/1    0    -
  chrX   9375861     9375911     TUPAC_0001:3:1:1:285#0/2    0    +
  chrX   131756978   131757028   TUPAC_0001:3:1:2:523#0/1    37   +
  chrX   131761790   131761840   TUPAC_0001:3:1:2:523#0/2    37   -
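
Note that ``samtools view -f 0x2`` keeps alignments whose FLAG has the ``0x2``
bit set; it does not require the FLAG to equal 2. A small sketch of that bit
test in shell arithmetic is shown here. The helper name
``is_properly_paired`` and the FLAG values in the loop are illustrative
assumptions, not taken from ``reads.bam``.

.. code-block:: bash

  # Hypothetical helper: succeed when the "properly paired" bit (0x2) is set
  # in the SAM FLAG value given as $1.
  is_properly_paired() {
    [ $(( $1 & 0x2 )) -ne 0 ]
  }

  # Illustrative FLAG values: 99 and 147 have the 0x2 bit set, 65 and 4 do not.
  for flag in 99 147 65 4; do
    if is_properly_paired "$flag"; then
      echo "$flag: properly paired"
    else
      echo "$flag: not properly paired"
    fi
  done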


==================================================================
``-split`` Creating BED features from "spliced" BAM entries.
==================================================================
``bedtools bamtobed`` will, by default, create a BED6 feature that represents
the entire span of a spliced/split BAM alignment. However, when using the
``-split`` option, a separate BED feature is reported for each aligned portion
of the sequencing read. To report the portions as blocks of a single BED12
feature instead, use ``-bed12``.

::

  Chromosome  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Exons       ***************                                    **********

  BED/BAM A      ^^^^^^^^^^^^....................................^^^^

  Result         ============                                    ====
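
To make the splitting concrete, here is a rough sketch of how a CIGAR string
maps to BED intervals when only "N" operations introduce breaks. It is an
illustration written in awk, not the bedtools implementation; the helper name
``cigar_blocks``, the chromosome, the start coordinate, and the CIGAR string
are all made up for the example.

.. code-block:: bash

  # Hypothetical helper: cigar_blocks CHROM START CIGAR
  # START is the 0-based BED start of the alignment. Prints one BED3 interval
  # per aligned block, breaking only at "N" (skipped region) operations.
  cigar_blocks() {
    awk -v chrom="$1" -v start="$2" -v cigar="$3" 'BEGIN {
      pos = start; block_start = start
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        len = substr(cigar, 1, RLENGTH - 1) + 0
        op  = substr(cigar, RLENGTH, 1)
        cigar = substr(cigar, RLENGTH + 1)
        if (op == "N") {
          # Close the current block, then jump over the skipped region.
          if (pos > block_start) print chrom "\t" block_start "\t" pos
          pos += len
          block_start = pos
        } else if (op == "M" || op == "D" || op == "=" || op == "X") {
          # These operations consume reference bases and extend the block.
          pos += len
        }
        # I, S, H and P do not consume reference bases, so pos is unchanged.
      }
      if (pos > block_start) print chrom "\t" block_start "\t" pos
    }'
  }

  # A made-up alignment: 12 aligned bases, a 36-base skip, 4 aligned bases.
  cigar_blocks chr1 100 12M36N4M

For this made-up alignment the sketch prints two intervals, chr1:100-112 and
chr1:148-152, mirroring the two "Result" blocks in the diagram above.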