[Biojava-l] RichSequence annotations...

Richard Holland richard.holland at ebi.ac.uk
Fri Mar 24 13:16:49 UTC 2006


Notes in a RichAnnotation are ranked. getProperty(term) searches only for
a Note with that term and a rank of zero, so it throws a
NoSuchElementException if the matching Note has any other rank.

If you don't know the ranks, use the

    public Note[] getProperties(Object key);

method on the RichAnnotation object instead. It returns an array of all
Note objects matching the given term, regardless of rank.
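
To illustrate the difference, here is a minimal, self-contained sketch. It
does not use the BioJavaX classes: RankedAnnotationDemo and DemoNote are
hypothetical stand-ins that only model the rank behaviour described above,
and the ranks used (3 and 6) are assumed from the "(#n)" prefixes in the
quoted output below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Simplified stand-in for a ranked annotation; NOT the BioJavaX classes.
class RankedAnnotationDemo {

    // Hypothetical stand-in for a note: a term, a value and a rank.
    static final class DemoNote {
        final String term;
        final String value;
        final int rank;

        DemoNote(String term, String value, int rank) {
            this.term = term;
            this.value = value;
            this.rank = rank;
        }
    }

    private final List<DemoNote> notes = new ArrayList<>();

    void addNote(String term, String value, int rank) {
        notes.add(new DemoNote(term, value, rank));
    }

    // Models the single-value lookup: only a note at rank 0 matches.
    String getProperty(String term) {
        for (DemoNote n : notes) {
            if (n.term.equals(term) && n.rank == 0) {
                return n.value;
            }
        }
        throw new NoSuchElementException("No such property: " + term + ", rank 0");
    }

    // Models the multi-value lookup: every note with the term, whatever its rank.
    DemoNote[] getProperties(String term) {
        List<DemoNote> matches = new ArrayList<>();
        for (DemoNote n : notes) {
            if (n.term.equals(term)) {
                matches.add(n);
            }
        }
        return matches.toArray(new DemoNote[0]);
    }

    public static void main(String[] args) {
        RankedAnnotationDemo ra = new RankedAnnotationDemo();
        // Ranks assumed from the "(#n)" prefixes in the quoted output.
        ra.addNote("biojavax:codon_start", "1", 3);
        ra.addNote("biojavax:gene", "dJ155G6.1", 6);

        try {
            ra.getProperty("biojavax:gene");
        } catch (NoSuchElementException e) {
            System.out.println("Rank-zero lookup failed: " + e.getMessage());
        }

        for (DemoNote n : ra.getProperties("biojavax:gene")) {
            System.out.println("Gene: " + n.value + " (rank " + n.rank + ")");
        }
    }
}
```

In your own code the corresponding step would presumably be to call
ra.getProperties(gene) with the ComparableTerm you already created, then
read getValue() from each returned Note.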

cheers,
Richard

On Fri, 2006-03-24 at 11:26 +0000, Jolyon Holdstock wrote:
> Hi,
> 
> I use the following code to extract all the genes from a sequence file.
> I load the sequence, then filter for CDS features only; iterating through
> these lets me get the gene annotation for each feature.
> 
> //===============================================================================
> // "hash" is presumably a Map declared elsewhere (not shown in this snippet)
> Sequence seq = null; // stays null if reading fails
> String fileName = "C:/Scripts/Java/BioJava/BioJavaX/biojava-live/demos/seq/AL121903.embl";
> try {
>   seq = SeqIOTools.readEmbl(new BufferedReader(new FileReader(fileName))).nextSequence();
> }
> catch (IOException IOE) {
>   System.out.println("IOException " + IOE);
> }
> catch (BioException BIOE) {
>   System.out.println("BioException " + BIOE);
> }
> 
> //Create a feature filter for CDS features only
> FeatureFilter ff = new FeatureFilter.ByType("CDS");
> 
> //Get the filtered Features
> FeatureHolder fh = seq.filter(ff);
> 
> //Iterate over the Features in fh
> for (Iterator i = fh.features(); i.hasNext(); ) {
>   Feature f = (Feature)i.next();
>   Annotation annotation = f.getAnnotation();
>   Object key = "gene";
>   hash.put(annotation.getProperty(key), f);
> }
> //===============================================================================
> 
> I am now using the new BioJavaX classes, which I cannot get to work. Does
> anyone have any pointers for this?
> 
> I use the sequence data, so I have to use a RichSequence rather than a
> BioEntry.
> 
> //===============================================================================
> RichSequence richSeq = null; // stays null if reading fails
> String fileName = "C:/Scripts/Java/BioJava/BioJavaX/biojava-live/demos/seq/AL121903.embl";
> try {
>   richSeq = RichSequence.IOTools.readEMBLDNA(new BufferedReader(new FileReader(fileName)), null).nextRichSequence();
> }
> catch (IOException IOE) {
>   System.out.println("IOException " + IOE);
> }
> catch (BioException BIOE) {
>   System.out.println("BioException " + BIOE);
> }
> 
> //Create a feature filter for CDS features only
> FeatureFilter ff = new FeatureFilter.ByType("CDS");
> 
> //Get the filtered Features
> FeatureHolder fh = richSeq.filter(ff);
> 
> //Iterate through the features
> for (Iterator i = fh.features(); i.hasNext(); ) {
>   RichFeature rf = (RichFeature) i.next();
>   System.out.println("RichFeature: " + rf.toString());
>   RichAnnotation ra = (RichAnnotation) rf.getAnnotation();
>   System.out.println("RichAnnotation: " + ra.toString());
> }
> //===============================================================================
> 
> The output shows that CDS features have been filtered successfully and
> that the gene name is in the annotation:
> 
> RichFeature: (#1) lcl:HSDJ155G6/AL121903.13:CDS,EMBL(biojavax:join:[<5642..5793,10804..10976,12496..12656,14136..14266])
> RichAnnotation: [(#2) biojavax:clone_lib: RPCI-1"
> 14403..14532,16852..16987,17821..17959,18068..18122,
> 19456..19570,23623..23753,25885..26053,29102..29240,
> 32621..32738,33595..33771],[(#3) biojavax:codon_start: 1],[(#4) biojavax:evidence: NOT_EXPERIMENTAL],[(#5) biojavax:note: match: proteins: Tr:Q9Y6D5 Tr:O46382 Tr:Q9Y6D6],[(#6) biojavax:gene: dJ155G6.1],[(#7) biojavax:product: dJ155G6.1 (brefeldin A-inhibited guanine
> nucleotide-exchange protein 2)],[(#8) biojavax:protein_id: CAB86643.1]
> 
> If I add the following, I can see which keys are in the annotation:
> 
> //===============================================================================
> Set keySet = ra.keys();
> for (Iterator it = keySet.iterator(); it.hasNext(); ) {
>   String key = it.next().toString();
>   System.out.println("Key: " + key);
> }
> //===============================================================================
> 
> The output shows that there is a gene key:
> 
> Key: biojavax:clone_lib
> Key: biojavax:codon_start
> Key: biojavax:evidence
> Key: biojavax:gene
> Key: biojavax:note
> Key: biojavax:product
> Key: biojavax:protein_id
> 
> My understanding is that I need to use a ComparableTerm to access the
> value, but when I create one and pass it to getProperty I get a
> NoSuchElementException:
> 
> ComparableTerm gene = RichObjectFactory.getDefaultOntology().getOrCreateTerm("gene");
> System.out.println("Gene: " + ra.getProperty(gene));
> 
> java.util.NoSuchElementException: No such property: biojavax:gene, rank 0
> 
> cheers,
> 
> Jolyon
> 
> Jolyon Holdstock Ph.D.
> Senior Computational Biologist,
> Oxford Gene Technology (Ops) Ltd.
> Begbroke Business and Science Park
> Sandy Lane, Yarnton
> Oxford, OX5 1PF
> 
> Tel: 01865 309699
> Fax: 01865 842116
> 
-- 
Richard Holland
European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SD, UK
Tel: +44-(0)1223-494416