[Bioperl-l] Read/write round-tripping Was: Re: New Bioperl dependency? Sort::Naturally

Florent Angly florent.angly at gmail.com
Sun May 9 07:26:19 UTC 2010


Chris,

I've thought some more on the problem and I now agree with you that 
round-tripping at the object-level is more powerful.

One problem remains: some objects are given IDs dynamically at run time, 
so two objects built from identical input files do not compare as identical.

> is_deeply( $obj_out , $obj_in , 'deep compare' );

> not ok 1 - deep compare
> #   Failed test 'deep compare'
> #   at ./test_roundtrip.pl line 33.
> #     Structures begin differing at:
> #     ${     $got->{_contigs}{Contig35}{_sfc}{_btree}} = '56438592'
> #     ${$expected->{_contigs}{Contig35}{_sfc}{_btree}} = '54980512'
> 1..1
> # Looks like you failed 1 test of 1.


And when I run it again:

> not ok 1 - deep compare
> #   Failed test 'deep compare'
> #   at ./test_roundtrip.pl line 33.
> #     Structures begin differing at:
> #     ${     $got->{_contigs}{Contig35}{_sfc}{_btree}} = '47763264'
> #     ${$expected->{_contigs}{Contig35}{_sfc}{_btree}} = '46305184'
> 1..1
> # Looks like you failed 1 test of 1.

Note how the value of _btree changes every time.
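
One way around this without a new dependency might be to drop the volatile 
fields from both structures before comparing them. Below is a minimal sketch 
using only core modules. strip_volatile() is a hypothetical helper, and the 
two hashes are made-up stand-ins for the real assembly objects: only the 
_contigs/Contig35/_sfc/_btree path comes from the output above, the _id key 
is invented for illustration.

```perl
use strict;
use warnings;
use Test::More tests => 1;
use Scalar::Util qw(reftype);

# Hypothetical helper: return a copy of a nested hash/array structure with
# every hash key listed in %$volatile removed. Blessing is not preserved,
# which does not matter to is_deeply().
sub strip_volatile {
    my ($data, $volatile) = @_;
    my $type = reftype($data) || '';
    if ($type eq 'HASH') {
        my %copy;
        for my $key (keys %$data) {
            next if $volatile->{$key};
            $copy{$key} = strip_volatile($data->{$key}, $volatile);
        }
        return \%copy;
    }
    if ($type eq 'ARRAY') {
        return [ map { strip_volatile($_, $volatile) } @$data ];
    }
    return $data;    # plain scalars and other reference types pass through
}

# Stand-in data: same content, different _btree value in each object.
my $obj_in  = { _contigs => { Contig35 => { _id  => 'Contig35',
                                            _sfc => { _btree => \54980512 } } } };
my $obj_out = { _contigs => { Contig35 => { _id  => 'Contig35',
                                            _sfc => { _btree => \56438592 } } } };

my %volatile = ( _btree => 1 );    # assumption: _btree is the only such field
is_deeply( strip_volatile($obj_out, \%volatile),
           strip_volatile($obj_in,  \%volatile),
           'deep compare, ignoring _btree' );
```

The drawback is that the list of volatile keys has to be kept up to date by 
hand, and a key name is ignored wherever it appears in the structure.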

Maybe using Test::Deep would be a good approach 
(http://search.cpan.org/~fdaly/Test-Deep-0.106/lib/Test/Deep.pod):
> Where it becomes more interesting is in allowing you to do something 
> besides simple exact comparisons. With strings, the |eq| operator 
> checks that 2 strings are exactly equal but sometimes that's not what 
> you want. When you don't know exactly what the string should be but 
> you do know some things about how it should look, |eq| is no good and 
> you must use pattern matching instead. Test::Deep provides pattern 
> matching for complex data structures
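
As a rough idea of what that could look like here, the sketch below uses 
Test::Deep's cmp_deeply() and ignore() to skip the _btree slot. The data is a 
made-up stand-in for the real assembly object (the _id key is invented), and 
the expected structure is written out by hand purely for illustration; a real 
test would presumably have to derive it from the input object.

```perl
use strict;
use warnings;
use Test::More tests => 1;
use Test::Deep;

# Stand-in for the object read back after the round trip.
my $obj_out = {
    _contigs => {
        Contig35 => {
            _id  => 'Contig35',
            _sfc => { _btree => \56438592 },
        },
    },
};

# Expected structure: ignore() matches any value, so the run-dependent
# _btree value no longer causes a failure, while _id is still checked.
my $expected = {
    _contigs => {
        Contig35 => {
            _id  => 'Contig35',
            _sfc => { _btree => ignore() },
        },
    },
};

cmp_deeply( $obj_out, $expected, 'deep compare, _btree ignored' );
```

Unlike is_deeply, this only relaxes the comparison at the exact position where 
ignore() is placed, so everything else still has to match.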

Florent




On 09/05/10 10:02, Chris Fields wrote:
> Should clarify that: round-tripping to generate the same data structure/object is good and what we want.  Round-tripping to generate the exact same output is not our highest priority.
>
> chris
>
> On May 8, 2010, at 6:47 PM, Chris Fields wrote:
>
>    
>> To tell the truth, I'm more worried about getting data from various formats into Bio::* objects than getting the output 100% correct and identical to the original input.  None of the SeqIO modules make that specific promise, simply because it's a nearly impossible thing to maintain, with very little payback.  Round-tripping is fine and all, just not our first priority.
>>
>> chris
>>
>> On May 8, 2010, at 6:34 AM, Florent Angly wrote:
>>
>>      
>>> Same question about the CPAN module Test::Files (http://search.cpan.org/~philcrow/Test-Files-0.14/Files.pm). I could see myself using it in the BioPerl unit tests to make sure that the assembly files written match the input assembly files.
>>>
>>> It looks like the Bio::SeqIO module tests could use it as well.
>>>
>>> Cheers,
>>>
>>> Florent
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>        
>>



