[Biopython] Read Groups for BWA

Mic mictadlo at gmail.com
Mon Feb 18 08:19:04 UTC 2019


Hi all,
To determine the read groups from FASTQ files for BWA, I used to do:

# Get read group information:
# Source: https://www.biostars.org/p/280837/#310132
header=$(zcat "$r1" | head -n 1)
id=$(echo "$header" | cut -f 1-4 -d":" | sed 's/@//' | sed 's/:/_/g')
sm=$(echo "$header" | grep -Eo "[ATGCN]+$")
echo "Read Group @RG\tID:$id\tSM:$id"_"$sm\tLB:$id"_"$sm\tPL:ILLUMINA"
...
bwa mem \
$2 $r1 $r2 \
-t 12 \
-R "$(echo "@RG\tID:$id\tSM:$id"_"$sm\tLB:$id"_"$sm\tPL:ILLUMINA")" |
samblaster -r |
samtools view -@ 12 -bSh -f 0x2 -F 2316 - |
samtools fixmate - - |
samtools sort -@ 12 - -o ${3}/${output}.sorted.dedup.bam
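
For comparison, the header parsing above could be sketched in plain Python roughly as follows. This is only a sketch using the standard library, assuming an Illumina-style first header line (`@instrument:run:flowcell:lane:... 1:N:0:BARCODE`). The names `read_group_from_header` and `first_header` are made up for illustration and are not Biopython functions.

```python
import gzip
import re


def read_group_from_header(header):
    """Build a bwa mem -R string from one FASTQ header line (hypothetical helper)."""
    header = header.strip()
    # First four colon-separated fields without the leading "@", joined by "_"
    # (mirrors: cut -f 1-4 -d":" | sed 's/@//' | sed 's/:/_/g').
    rg_id = "_".join(header.lstrip("@").split(":")[:4])
    # Trailing index sequence (mirrors: grep -Eo "[ATGCN]+$"); empty if absent.
    match = re.search(r"[ATGCN]+$", header)
    barcode = match.group(0) if match else ""
    sample = f"{rg_id}_{barcode}"
    # Fields are joined with a literal backslash-t, as in the shell version,
    # on the assumption that bwa mem expands "\t" in its -R argument.
    return "\\t".join(
        ["@RG", f"ID:{rg_id}", f"SM:{sample}", f"LB:{sample}", "PL:ILLUMINA"]
    )


def first_header(fastq_gz):
    """Return the first line of a gzipped FASTQ file (mirrors: zcat | head -n 1)."""
    with gzip.open(fastq_gz, "rt") as handle:
        return handle.readline()
```

Something like `read_group_from_header(first_header(r1))` would then give the string to pass to `bwa mem -R`, where `r1` is the path to the gzipped forward reads.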

I just wonder whether Biopython has a function to determine the read groups?

Thank you in advance,

Best wishes,

Michal