[Bioperl-l] Fwd: Q: batched extraction of sub-sequences and their reverse-complements ?

Dave Messina David.Messina at sbc.su.se
Tue Apr 12 15:23:39 UTC 2011

---------- Forwarded message ----------
From: wadim kapulkin <wadim_kapulkin at yahoo.co.uk>
Date: Tue, Apr 12, 2011 at 17:13
Subject: Re: [Bioperl-l] Q: batched extraction of sub-sequences and their
reverse-complements ?
To: Dave Messina <David.Messina at sbc.su.se>

Hello Dave

Thank you very much for your response. Indeed, my question can be split as
you did :)

So first:
Your suggestion below to use Bio::DB::Fasta should do the trick. Thanks
very much!

As for the second part: I probably did not explain properly what I had in
mind. However, the link you included below seems to address this matter.
To quote an excerpt: 'Although coordinate conversion sounds pretty trivial
it can get fairly tricky when one includes the possibilities of switching to
coordinates on negative (i.e. Crick) strands and/or having a coordinate
system terminate because you have reached the end of a clone or contig.' The
issue is indeed in the coordinate conversion. In the specific example I
have been concerned with, I used the C. briggsae chromosomal set to run an
external program and found that the output sometimes depends on strand
polarity... (This gets even more complicated when using other assemblies/db
freezes, which offer sequences differing in length.) I will need a bit more
time to describe this specific example.
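
To make the conversion concrete, here is a minimal sketch in plain Python. It is not BioPerl, and it is not the external program mentioned above. It assumes 1-based inclusive coordinates, under which a forward-strand interval [start, end] on a sequence of length L corresponds to [L - end + 1, L - start + 1] on the reverse complement. All function names and the toy sequence are hypothetical, chosen only for illustration.

```python
# Hypothetical helpers; coordinates are 1-based and inclusive throughout.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def subseq(seq, start, end):
    """Extract the 1-based inclusive interval [start, end]."""
    return seq[start - 1:end]


def to_reverse_coords(start, end, length):
    """Map a forward-strand interval onto reverse-complement coordinates."""
    return length - end + 1, length - start + 1


def windows(length, size):
    """Yield consecutive (start, end) windows; the last one may be shorter."""
    for start in range(1, length + 1, size):
        yield start, min(start + size - 1, length)


chrom = "ACGTTGCAAG"  # toy stand-in for a chromosome-sized sequence
rc = reverse_complement(chrom)
for start, end in windows(len(chrom), 4):
    r_start, r_end = to_reverse_coords(start, end, len(chrom))
    # The window cut from the reverse complement at the converted
    # coordinates equals the reverse complement of the forward window.
    assert subseq(rc, r_start, r_end) == reverse_complement(subseq(chrom, start, end))
```

Note that the conversion depends on the total sequence length, so the same forward interval maps to different reverse coordinates when assemblies differ in length.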

Thanks very much again.


*From:* Dave Messina <David.Messina at sbc.su.se>
*To:* wadim kapulkin <wadim_kapulkin at yahoo.co.uk>
*Cc:* bioperl-l at lists.open-bio.org
*Sent:* Sat, 9 April, 2011 4:47:34
*Subject:* Re: [Bioperl-l] Q: batched extraction of sub-sequences and their
reverse-complements ?

Hi Wadim,

> I would like to extract a batch of subsequences (as FASTA), based on a
> list of coordinates (i.e. 1-1000, 1001-2000, 2001-3000 etc.) from a given
> 'large sequence' (i.e. chromosome sized, >10MB)

Take a look at Bio::DB::Fasta.

> and then, ideally, I would be keen to know how to extract the converse
> set [i.e.: extract the 'same' (I mean corresponding) batch of sequences,
> based on a list of converse coordinates from the reverse-complement of
> the given 'large sequence'].

I don't totally understand this part of your question, but this may help:


Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
