.. _bamtofastq:

###############
*bamtofastq*
###############
``bedtools bamtofastq`` is a conversion utility for extracting FASTQ records
from sequence alignments in BAM format.


.. note::

    If you are using CRAM as input, you must specify the *full path* to the
    relevant reference genome in FASTA format via the ``CRAM_REFERENCE``
    environment variable. For example:

    ``export CRAM_REFERENCE=/path/to/ref/g1k_v37_decoy.fa``


==========================================================================
Usage and option summary
==========================================================================
**Usage**:
::

  bedtools bamtofastq [OPTIONS] -i <BAM> -fq <FASTQ>

**(or)**:
::

  bamToFastq [OPTIONS] -i <BAM> -fq <FASTQ>



.. tabularcolumns:: |p{4.5cm}|p{8.5cm}|

=============  ================================================================
Option         Description
=============  ================================================================
**-fq2**       FASTQ for second end. Used if BAM contains paired-end data.
               BAM should be sorted by query name
               (``samtools sort -n aln.bam aln.qsort``) if creating
               paired FASTQ with this option.
**-tags**      Create FASTQ based on the mate info in the BAM R2 and Q2 tags.
=============  ================================================================


==========================================================================
Default behavior
==========================================================================
By default, each alignment in the BAM file is converted to a FASTQ record
in the ``-fq`` file. The order of the records in the resulting FASTQ exactly
follows the order of the records in the BAM input file.

.. code-block:: bash

  $ bedtools bamtofastq -i NA18152.bam -fq NA18152.fq

  $ head -8 NA18152.fq
  @NA18152-SRR007381.35051
  GGAGACATATCATATAAGTAATGCTAGGGTGAGTGGTAGGAAGTTTTTTCATAGGAGGTGTATGAGTTGGTCGTAGCGGAATCGGGGGTATGCTGTTCGAATTCATAAGAACAGGGAGGTTAGAAGTAGGGTCTTGGTGACAAAATATGTTGTATAGAGTTCAGGGGAGAGTGCGTCATATGTTGTTCCTAGGAAGATTGTAGTGGTGAGGGTGTTTATTATAATAATGTTTGTGTATTCGGCTATGAAGAATAGGGCGAAGGGGCCTGCGGCGTATTCGATGTTGAAGCCTGAGACTAGTTCGGACTCCCCTTCGGCAAGGTCGAA
  +
  <<<;;<;<;;<;;;;;;;;;;;;<<<:;;;;;;;;;;;;;;;;::::::;;;;<<;;;;;;;;;;;;;;;;;;;;;;;;;;;;<<<<<;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;<<;;;;;:;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;<<<;;;;;;;;;;<<<<<<<<;;;;;;;;;:;;;;;;;;;;;;;;;;;;;:;;;;8;;8888;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;8966689666666299866669:899
  @NA18152-SRR007381.637219
  AATGCTAGGGTGAGTGGTAGGAAGTTTTTTCATAGGAGGTGTATGAGTTGGTCGTAGCGGAATCGGGGGTATGCTGTTCGAATTCATAAGAACAGGGAGGTTAGAAGTAGGGTCTTGGTGACAAAATATGTTGTATAGAGTTCAGGGGAGAGTGCGTCATATGTTGTTCCTAGGAAGATTGTAGTGGTGAGGGTGTTTATTATAATAATGTTTGTGTATTCGGCTATGAAGAATAGGGCGAAGGGGCCTGCGGCGTATTCGATGTTGAAGCCTGAGACTAGTTCGGACTCCCCTTCCGGCAAGGTCGAA
  +
  <<<<<<<<<<;;<;<;;;;<<;<888888899<;;;;;;<;;;;;;;;;;;;;;;;;;;;;;;;<<<<<;;;;;;;;;<;<<<<<;;;;;;;;;;;;;<<<<;;;;;;;:::;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;<<<<;;;;;;;;;;;;;;;;;;;;;;;<;;;;;;;;;;;;;;;;;;;;;;<888<;<<;;;;<<<<<<;;;;;<<<<<<<<;;;;;;;;;:;;;;888888899:::;;8;;;;;;;;;;;;;;;;;;;99;;99666896666966666600;96666669966



==========================================================================
``-fq2`` Creating two FASTQ files for paired-end sequences
==========================================================================
If your BAM alignments are from paired-end sequence data, you can use the
``-fq2`` option to create two distinct FASTQ output files, one for
end 1 and one for end 2.

.. note::

    When using this option, the BAM file must be sorted or grouped by
    read name. This keeps the resulting records in the two output FASTQ
    files in the same order. You can sort the BAM file by query name with
    ``samtools sort -n aln.bam aln.qsort`` (recent samtools releases use
    ``samtools sort -n -o aln.qsort.bam aln.bam`` instead).


.. code-block:: bash

  $ samtools sort -n aln.bam aln.qsort

  $ bedtools bamtofastq -i aln.qsort.bam \
                        -fq aln.end1.fq \
                        -fq2 aln.end2.fq

  $ head -8 aln.end1.fq
  @SRR069529.2276/1
  CAGGGAGAAGGAGGTAGGAAAGAGAAAGGACCAGGGAGGGGCGCATACACAGGACGCTCCGTGCGGTGATAGCAGCACCACACTGTGTTCAGTCGTCTGGC
  +
  =;@>==###############################################################################################
  @SRR069529.2406/1
  GCTGGGAAAAGGATTCAGGATGTTGGTTTCTATCTTTGAGTTGCTGCTGTGCGGCTGTCCCTACACTCGCAGTACCCCTCGGACACCGTCTACTGTGGAGG
  +
  =5@><<:?<?

  $ head -8 aln.end2.fq
  @SRR069529.2276/2
  AGACCCAGAGAGGGACAGGATCTGTCCCAGATCATAAAATAGGGGGAGTGCTCCGTAGAGGCGTGCGCGGTGGCACCGTGCAGTAGTACGGGTGAGCGGGG
  +
  #####################################################################################################
  @SRR069529.2406/2
  TTCCCTACCCCTGGGGTCAGGGACTACAGCCAAGGGGAGAACTTTAGCAAGTAGACGTTAGTTATTTTGATTCCAGTGGGGACGCGCGTGTAGCGAGTTGT
  +
  @>=AABB?AAACABBA>@?AAAA>B@@AB@AA:B@AA@??#############################################################
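
Because the ``-fq2`` output is only useful when both files list the same reads in the same order, it can be worth checking this after conversion. The sketch below is one possible check, not part of bedtools: ``fastq_pairs_in_sync`` is a hypothetical helper name, and the code assumes four-line FASTQ records whose read names end in ``/1`` and ``/2``, as in the example output above.

.. code-block:: bash

  # Hypothetical helper (not a bedtools command): succeed only if two
  # FASTQ files contain the same read names in the same order.
  # Assumes 4-line FASTQ records and "/1" or "/2" mate suffixes.
  fastq_pairs_in_sync() {
      # paste joins line N of both files with a tab; header lines are
      # lines 1, 5, 9, ... of each file.
      paste "$1" "$2" | awk -F '\t' '
          NR % 4 == 1 {
              a = $1; b = $2
              sub(/\/[12]$/, "", a)   # drop mate suffix from end 1
              sub(/\/[12]$/, "", b)   # drop mate suffix from end 2
              if (a != b) { bad = 1; exit }
          }
          END { exit bad + 0 }'
  }

  # Usage, with the file names from the example above:
  #   fastq_pairs_in_sync aln.end1.fq aln.end2.fq && echo "in sync"

If the check fails, the most likely cause is that the BAM input was not sorted or grouped by read name before conversion.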