[Bioperl-l] Porting Entrez Gene parser to Biojava, Biopython, Biophp, even C++
Mingyi Liu
mingyi.liu at gpc-biotech.com
Sun Mar 13 22:26:35 EST 2005
Mingyi Liu wrote:
> This has nothing to do with ASN. It is all about how uniform the data
> structure could be. In fact, consider when NCBI decides to do
> {
> tag id 12345,
> tag str "whatever"
> }
Oops, I really meant:
{
tag id 12345,
tag str "whatever",
tag id 34567
}
I switched to str just as an example, but forgot that this made my
example incorrect. So now the structure has to become:
'tag' => [
    {
        'id' => '12345',
        'str' => 'whatever'
    },
    {
        'id' => 34567
    }
]
or, in a form that makes more sense:
'tag' => [
    {
        'id' => '12345'
    },
    {
        'str' => 'whatever'
    },
    {
        'id' => 34567
    }
]
which is my approach. Again, your approach would require users to test
the reference type before dealing with the content, and to design two
ways of handling it. In my approach users always deal with the content
as an array: just one design, and no reference testing needed. If you
read my comments on the data structure trimming function, you'll see
some more considerations on this point. It's still not perfect, but I
hope that's not too surprising and doesn't become a reason to dismiss
my parser altogether. ;-)
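To make the trade-off concrete, here is a minimal sketch of what a consumer
might look like under each layout. The hashes %mixed and %uniform and the
subs ids_mixed and ids_uniform are hypothetical names made up for this
illustration; they are not part of the parser, and the real parser output
may be shaped differently.

```perl
use strict;
use warnings;

# Hypothetical records for illustration only, not real parser output.
# Mixed layout: 'tag' is a hash ref when it occurs once and an array ref
# when it occurs more than once.
my %mixed = (
    single   => { tag => { id => 12345 } },
    repeated => { tag => [ { id => 12345 }, { str => 'whatever' }, { id => 34567 } ] },
);

# Uniform layout: 'tag' is always an array ref, even with one element.
my %uniform = (
    single   => { tag => [ { id => 12345 } ] },
    repeated => { tag => [ { id => 12345 }, { str => 'whatever' }, { id => 34567 } ] },
);

# Consumer for the mixed layout: has to test the reference type first.
sub ids_mixed {
    my ($record) = @_;
    my $tag   = $record->{tag};
    my @items = ref($tag) eq 'ARRAY' ? @$tag : ($tag);
    return map { $_->{id} } grep { exists $_->{id} } @items;
}

# Consumer for the uniform layout: one code path, no ref() test.
sub ids_uniform {
    my ($record) = @_;
    return map { $_->{id} } grep { exists $_->{id} } @{ $record->{tag} };
}

print join(',', ids_uniform($uniform{repeated})), "\n";
```

Both subs return the same ids for the same data; the difference is only
that the mixed layout pushes the ref() check onto every consumer.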
Regards,
Mingyi