[Bioperl-l] Hierarchical location parsing

Mark Hoebeke Mark.Hoebeke at jouy.inra.fr
Thu Mar 24 23:52:27 EST 2005


Sorry, I messed up the example I gave, but an "in-nature" hierarchical
location can be found in the complete genome of Streptococcus pyogenes
strain MGAS315 (GenBank accession number AE014074):


source  join(1..749107,join(788646..977266,join(1018339..1137553,
                     join(1171973..1230114,join(1271911..1313193,
                     join(1351400..1410541,1450556..1900521))))))

In this case, it seems likely that the joins could be flattened out.
However, when feeding GenBank entries into a database in bulk, it could
be impractical to re-parse location strings to determine whether (1) they
contain nested joins and (2) they can or cannot be flattened out.
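
To make the two checks concrete, here is a minimal sketch in Python (not
BioPerl, and not the patch under discussion). The function names
has_nested_join and flatten_joins are hypothetical. It assumes that a join
sitting directly inside another join can be merged into its parent, and it
leaves a join wrapped in any other operator (for example complement)
untouched.

```python
import re

# Matches an opening parenthesis with its optional operator name, or a
# closing parenthesis. Bare "(" (no operator name) is tracked as well so
# that the parentheses stay balanced.
TOKEN = re.compile(r"([A-Za-z_]*)\(|\)")


def has_nested_join(location):
    """Return True if a join occurs anywhere inside another join."""
    stack = []
    for m in TOKEN.finditer(location):
        if m.group(0) == ")":
            if stack:
                stack.pop()
        else:
            op = m.group(1)
            if op == "join" and "join" in stack:
                return True
            stack.append(op)
    return False


def flatten_joins(location):
    """Merge every join whose direct parent is a join into that parent.

    Assumption: join(a,join(b,c)) is equivalent to join(a,b,c). Joins
    nested under any other operator are kept as they are.
    """
    location = "".join(location.split())  # drop whitespace and line breaks
    out = []
    stack = []  # entries are (operator, dropped) pairs
    pos = 0
    for m in TOKEN.finditer(location):
        out.append(location[pos:m.start()])
        pos = m.end()
        if m.group(0) == ")":
            if not stack:
                raise ValueError("unbalanced parentheses in location")
            _, dropped = stack.pop()
            if not dropped:
                out.append(")")
        else:
            op = m.group(1)
            dropped = op == "join" and bool(stack) and stack[-1][0] == "join"
            stack.append((op, dropped))
            if not dropped:
                out.append(m.group(0))
    if stack:
        raise ValueError("unbalanced parentheses in location")
    out.append(location[pos:])
    return "".join(out)


if __name__ == "__main__":
    source = """join(1..749107,join(788646..977266,join(1018339..1137553,
                 join(1171973..1230114,join(1271911..1313193,
                 join(1351400..1410541,1450556..1900521))))))"""
    print(has_nested_join(source))
    print(flatten_joins(source))
```

A location would count as "flattenable" in this sketch when
has_nested_join(flatten_joins(loc)) is false. Both functions make a single
pass over the string, so the extra cost per entry is linear in the length
of the location.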


I don't know to what extent the FTLocationFactory is tested when running
'make test' on a bioperl-live tree, but it yields the same results on
both patched and unpatched trees.

Mark

On Thursday, 24 March 2005 at 17:55 -0800, Jason Stajich wrote:
> Is there a real example where these types of locations exist - why  
> can't it be flattened without the nested joins?  At any rate - I don't  
> really care to parse these if they never exist "in-nature".  If your  
> bugfix soln works and doesn't slow things down we can use it I guess,  
> although I prefer a regexp.  I don't really have time to patch or test  
> in the near future so it will have to wait for someone to volunteer to  
> get to it.
> 
> -jason
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/

--------------------------Mark.Hoebeke at jouy.inra.fr----------------------
Unité Statistique & Génome                                     Unité MIG
+33 (0)1 60 87 38 03                  Tél.          +33 (0)1 34 65 28 85
+33 (0)1 60 87 38 09                  Fax.          +33 (0)1 34 65 29 01
Tour Evry 2, 523 pl. des Terrasses             INRA - Domaine de Vilvert
F - 91000 Evry                             F - 78352 Jouy-en-Josas CEDEX
