[Bioperl-l] Fwd: Re: How to Obtain Nucleotide Sequence from SeqIO::fastq

Roy Chaudhuri roy.chaudhuri at gmail.com
Tue Jul 27 15:35:49 UTC 2010


Forgot to cc the list:

-------- Original Message --------
Subject: Re: [Bioperl-l] How to Obtain Nucleotide Sequence from SeqIO::fastq
Date: Tue, 27 Jul 2010 16:03:53 +0100
From: Roy Chaudhuri <roy.chaudhuri at gmail.com>
To: Alan Twaddle <twaddlac at gmail.com>

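That doesn't look like Fastq format as defined here:
http://nar.oxfordjournals.org/cgi/content/full/gkp1137v1
http://en.wikipedia.org/wiki/FASTQ_format

Standard Fastq normally has four lines per record: the "@" title line,
the sequence, a line beginning with "+" (optionally repeating the
title), and the quality string. Your records have only three lines,
with a "+" prefixed to both the sequence and the quality string.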

It shouldn't be difficult to write a script to parse it or convert to
standard Fastq, though.
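
As a rough sketch of such a conversion (in Python here, though the
logic ports directly to Perl), the function below assumes the layout
seen in the two example records quoted further down: three lines per
record, with a single "+" prefixed to both the sequence line and the
quality line. The name convert_records and the toy record in the demo
are made up for illustration, and real files may contain variations
that this does not handle.

```python
def convert_records(lines):
    """Yield standard four-line Fastq records from the three-line layout.

    Assumed input layout (inferred from the example records only):
        @title
        +SEQUENCE
        +QUALITY
    """
    # Strip line endings and skip blank lines so records group in threes.
    rows = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    if len(rows) % 3 != 0:
        raise ValueError("expected three non-blank lines per record")
    for i in range(0, len(rows), 3):
        title, seq, qual = rows[i:i + 3]
        n = i // 3 + 1  # 1-based record number for error messages
        if not title.startswith("@"):
            raise ValueError(f"record {n}: title line does not start with '@'")
        if not (seq.startswith("+") and qual.startswith("+")):
            raise ValueError(f"record {n}: missing '+' prefix")
        # Remove exactly one leading '+'; any later '+' characters may be
        # legitimate quality scores and must be kept.
        seq, qual = seq[1:], qual[1:]
        if len(seq) != len(qual):
            raise ValueError(f"record {n}: sequence/quality length mismatch")
        yield f"{title}\n{seq}\n+\n{qual}\n"


# Toy record (made up, much shorter than a real read).
demo = ["@read1 region=2 tag=H\n", "+ACGT\n", "+IIHH\n"]
print("".join(convert_records(demo)), end="")
```

To convert a whole file, pass an open file handle to convert_records
and write each yielded record to the output file. The result should
then be readable with Bio::SeqIO using -format => 'fastq'.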

  > p.s. how do I find out what version of bioperl I'm using?
http://www.bioperl.org/wiki/FAQ#How_can_I_tell_what_version_of_BioPerl_is_installed.3F


On 27/07/2010 15:55, Alan Twaddle wrote:
> I'm not certain that I'm using the latest BioPerl, but I can check. In
> the meantime, I'll send you the example data!
>
> @FQ4HLCS02EO12Q region=2 tag=H
> +GCGAAGAACCTTACCTACTCTTGACATCCAGAGAATTCGCTAGAGATAGCTTAGTGCCTTCGGGAACTCTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGAGTAATGTCGGGAACTCAAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAACTCATCATGCCCCTTGCTGATAC
> +IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIID666IIIIHHHIIIIIIHHHIIIIIIIIIIDCCHHHIIIIIIGGGIIIIIIIIIIIHHHIIIHH???HHIII???IIIIIIIIIIIIIIIIHHHIIIIIIIIIIIICC@@HIIIIIIIIIIGEGGGGG?4444CCIIIGGGII
> @FQ4HLCS02EO12T region=2 tag=B
> +GATGAATTGACGTCATCCCCACCTTCCTCCGGTTTATTACCGGCAGTCTCGCTAGAGTGCCCAACTAAATGATGGCAACTAACAATAGGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCACCACCTGTCACTTTGTCCCCGAAGGGAACTTCTATCTCTAGAAGGGTCAAAGGATGTCAAGATTTGGTAAGGTTCTTCGCGTTGCAATTCGATGTCGAGC
> +DDDGDGGGIIIIIIII@@@@IIIIIIIIIIIIHHB985DFI<<;==DDDGDBDADBG=644466AGB==GDEEFHHEEEHHHHHHHH====GE=D<<<;DBGED;;;;:9955000247<;;;;DGGGGHHHHHBBBHGGGHGGBBB@@@@>;;666EGGEEBBDD<97550//4--,,.62426468=ADD>>6666BBDDEEAEGEG;;;B>@@B;266;;GGDBA?:::995/////9>>9989=9:5
>
>
> On Tue, Jul 27, 2010 at 10:50 AM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>> Hi Alan,
>>
>> It sounds like there's a problem with reading in your Fastq file. Are you
>> using the latest BioPerl version? There have been many bug fixes to
>> Bio::SeqIO::fastq over the last couple of years. If you are using an
>> up-to-date BioPerl, please could you send an example Fastq entry which gives
>> the error messages?
>>
>> Roy.
>>
>> On 27/07/2010 15:37, Alan Twaddle wrote:
>>>
>>> Whenever I try to access the sequence I get the following error message:
>>>
>>>
>>> --------------------- WARNING ---------------------
>>> MSG: Seq/Qual descriptions don't match; using sequence description
>>>
>>> ---------------------------------------------------
>>>
>>> --------------------- WARNING ---------------------
>>> MSG: Fastq sequence/quality data length mismatch error
>>>
>>> ---------------------------------------------------
>>>
>>> --------------------- WARNING ---------------------
>>> MSG: seq doesn't validate with [0-9A-Za-z\*\-\.=~\\/\?], mismatch is +
>>> ---------------------------------------------------
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: Attempting to set the sequence to
>>>
>>> [+GCGAAGAACCTTACCTACTCTTGACATCCAGAGAATTCGCTAGAGATAGCTTAGTGCCTTCGGGAACTCTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGAGTAATGTCGGGAACTCAAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAACTCATCATGCCCCTTGCTGATAC]
>>> which does not look healthy
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:357
>>> STACK: Bio::PrimarySeq::seq
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/PrimarySeq.pm:270
>>> STACK: Bio::PrimarySeq::new
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/PrimarySeq.pm:221
>>> STACK: Bio::LocatableSeq::new
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/LocatableSeq.pm:109
>>> STACK: Bio::Seq::Meta::Array::new
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Seq/Meta/Array.pm:167
>>> STACK: Bio::Seq::Quality::new
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Seq/Quality.pm:191
>>> STACK: Bio::Seq::SeqFactory::create
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Seq/SeqFactory.pm:116
>>> STACK: Bio::SeqIO::fastq::next_seq
>>> /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/fastq.pm:151
>>> STACK: fastQParser.pl:12
>>>
>>>
>>>
>>> On Tue, Jul 27, 2010 at 10:16 AM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>>>>
>>>> Hi Alan,
>>>>
>>>> Another case for the deobfuscator:
>>>> http://bioperl.org/cgi-bin/deob_interface.cgi
>>>>
>>>> A Bio::Seq::Quality object is a Bio::PrimarySeq, so you can just say
>>>> $seqqual->seq to get the sequence as a string.
>>>>
>>>> Cheers.
>>>> Roy.
>>>>
>>>> On 27/07/2010 15:10, Alan Twaddle wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>>       I am curious as to how I am supposed to use SeqIO::fastq to read
>>>>> in a fastq file and then obtain the nucleotide sequence from that. I
>>>>> noticed that SeqIO::fastq returns a Seq::Quality object but I haven't
>>>>> seen a method within that module that returns the nucleotide sequence.
>>>>> Please, let me know if you have any suggestions!
>>>>>
>>>>> Thank you very much!!
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>



