[Bioperl-l] How to merge multiple genbank records into one record

Haiming Wang hwang at uga.edu
Mon Apr 24 20:50:10 UTC 2006


The locations of the features refer to the individual 1000000 bp 
sub-sequences. For example, in the second genbank record 
'scaffold:FUGU4:scaffold_1:1000001:2000000:1', the location of a gene is 
1760..4580. It is supposed to be 1001760..1004580 on the chromosome.
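
To make the arithmetic concrete, here is a minimal sketch in plain Python (not BioPerl, and not an existing tool). It assumes the fourth colon-separated field of the ACCESSION line is the 1-based start of the chunk on the scaffold, as the two example records suggest; the function names chunk_offset and shift_location are made up for illustration.

```python
def chunk_offset(accession: str) -> int:
    """Return the amount to add to chunk-local coordinates.

    Assumes an accession of the form
    'scaffold:FUGU4:scaffold_1:<start>:<end>:<strand>',
    where <start> is the 1-based chunk start on the scaffold.
    """
    fields = accession.split(":")
    chunk_start = int(fields[3])
    # A chunk starting at position 1 needs no shift, hence the -1.
    return chunk_start - 1


def shift_location(start: int, end: int, offset: int) -> tuple[int, int]:
    """Move a chunk-local feature location onto scaffold coordinates."""
    return start + offset, end + offset


offset = chunk_offset("scaffold:FUGU4:scaffold_1:1000001:2000000:1")
print(shift_location(1760, 4580, offset))  # (1001760, 1004580)
```

The same offset would have to be applied to every feature in a record before the records are concatenated, and the first record (start 1) gets an offset of 0.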

Thanks
-Haiming


Brian Osborne wrote:
> Haiming,
>
> Do the locations of the features refer to the individual 1000000 bp
> sub-sequences or are they actually locations on the merged sequence, the
> "chromosome"?
>
> Brian O.
>
>
> On 4/24/06 3:02 PM, "Haiming Wang" <hwang at uga.edu> wrote:
>
>   
>> Hi,
>>
>> I am wondering if there is a script or tool that can merge several genbank
>> records into one record with all features' coordinates updated
>> accordingly. For example, I have multiple Fugu scaffold_1 genbank files
>> which were arbitrarily cut into 1000000 bp pieces. I'd like to merge them
>> into one big scaffold_1 genbank file.
>>
>> Thanks in advance!
>>
>> -Haiming
>>
>> p.s. example data
>> genbank record 1:
>> LOCUS   scaffold_1 1000000 bp DNA HTG 8-FEB-2006
>> DEFINITION  Fugu rubripes scaffold scaffold_1 FUGU4 partial sequence
>> 1..1000000  reannotated via EnsEMBL
>> ACCESSION   scaffold:FUGU4:scaffold_1:1:1000000:1
>> ......
>> //
>>
>> genbank record 2:
>> LOCUS  scaffold_1 1000000 bp DNA HTG 8-FEB-2006
>> DEFINITION  Fugu rubripes scaffold scaffold_1 FUGU4 partial sequence
>> 1000001..2000000 reannotated via EnsEMBL
>> ACCESSION   scaffold:FUGU4:scaffold_1:1000001:2000000:1
>> ......
>> //
>>
>
>
>   



