[Biojava-dev] Biojava - svn migration was : bioperl like blastparser

Andreas Prlic ap3 at sanger.ac.uk
Wed Dec 26 23:29:56 UTC 2007


> You just need to put the repositor(ies) in
> /home/svn-repositories/biojava

Thanks for the info.  I now have the new biojava svn repository for  
developers running and it is possible to check out (and do commits) via

svn co svn+ssh://dev.open-bio.org/home/svn-repositories/biojava/biojava-svn/biojava-live/trunk/ ./biojava-svn

I am just running final tests to see if all is fine. Access should  
work for other biojava developers as well.


For the anonymous access - who will set this up? I assume there will  
be a commit hook in the developers repository which will do a svnsync  
with the anonymous repository?

Andreas





>
> anyone in the biojava group can write there.
> you'll want to delete the existing biojava-live that is in there.
>
> I'm traveling most of 26th and will be on vacation most of the  
> week, but will check in when I have a chance.
>
> -jason
>
> On Dec 25, 2007, at 3:42 PM, Andreas Prlic wrote:
>
>> Hi Mark,
>>
>> Unfortunately the biojava svn repository is not ready yet.
>>
>> George has converted our CVS to an initial svn dump, which I
>> tested and in which I fixed some details.
>> This dump has been ready since December 17th (see
>> dev.open-bio.org:~andreas/biojava-final.svndump.bz2).
>> The next step is to load this into the public open-bio repository,  
>> after which (and some more testing)  the new biojava repository  
>> would be ready for new commits.
>>
>> At present I am waiting for somebody who has admin rights on
>> the open-bio servers to do these final steps
>> (or to delegate and give permissions to somebody else).
>>
>> I tried to contact support at open-bio, root-l, as well as mailing
>> several people directly,
>> but so far I have not received a response. It could be that the holiday
>> season is slowing response times down...
>>
>> Andreas
>>
>>
>>
>> On 25 Dec 2007, at 21:44, Mark Schreiber wrote:
>>
>>> Hi -
>>>
>>> When will the subversion system be ready for checkin?
>>>
>>> - Mark
>>>
>>> On Dec 24, 2007 4:29 PM, Michael Gang <michaelgang at gmail.com> wrote:
>>>> OK,
>>>> I made four changes.
>>>> In the package org.biojava.bio.program.sax, in class BlastSaxParser:
>>>> 1) At line 86 I added the variable
>>>> private String oQueryLength;
>>>> 2) In the method private void interpret(String poLine) throws SAXException,
>>>> inside the block "if (iState == IN_HEADER) {",
>>>> at line 209 I added
>>>>
>>>> if (poLine.startsWith("(", 9) && poLine.endsWith("letters)") ) {
>>>>                 StringTokenizer st = new StringTokenizer(poLine);
>>>>                 oQueryLength = st.nextToken().substring(1);
>>>>            }
>>>> 3) In the method private void emitHeaderIds() throws SAXException {
>>>> at line 564 I added
>>>>  oAttQName.setQName("queryLength");
>>>>        oAtts.addAttribute(oAttQName.getURI(),
>>>>                           oAttQName.getLocalName(),
>>>>                           oAttQName.getQName(),
>>>>                           "CDATA", oQueryLength);
>>>>
>>>> In the package org.biojava.bio.program.ssbind, in HeaderStAXHandler.java:
>>>> 4) In the private class QueryIDStAXHandler, at line 95, I changed the
>>>> method startElement
>>>>
>>>>        public void startElement(String            uri,
>>>>                                 String            localName,
>>>>                                 String            qName,
>>>>                                 Attributes        attr,
>>>>                                 DelegationManager dm)
>>>>        throws SAXException
>>>>        {
>>>>            ssContext.getSearchContentHandler().setQueryID(attr.getValue("id"));
>>>>            if (attr.getValue("queryLength") != null)
>>>>            {
>>>>                ssContext.getSearchContentHandler().addSearchProperty(
>>>>                        "queryLength", attr.getValue("queryLength"));
>>>>            }
>>>>        }
>>>>    }
>>>>
>>>> Now the query length is a property of the annotation of a blast result.
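
The header check from change 2 can be tried in isolation. The following is a minimal sketch, not BioJava code: the class name QueryLengthSketch and the method extractQueryLength are hypothetical, and it assumes the header line has nine leading spaces before the opening parenthesis, as the startsWith("(", 9) test in the patch implies.

```java
import java.util.StringTokenizer;

// Hypothetical stand-alone class for illustration; not part of BioJava.
final class QueryLengthSketch {

    // Returns the text between "(" and the first whitespace for a header
    // line such as "         (1234 letters)", or null if the line does
    // not match the pattern used in the patch above.
    static String extractQueryLength(String line) {
        if (line.startsWith("(", 9) && line.endsWith("letters)")) {
            StringTokenizer st = new StringTokenizer(line);
            // The first token is "(1234"; drop the leading parenthesis.
            return st.nextToken().substring(1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractQueryLength("         (1234 letters)")); // prints 1234
        System.out.println(extractQueryLength("Query= some_sequence"));    // prints null
    }
}
```

The value stays a String, as in the patch. If a report printed the length with thousands separators, they would be kept in the returned string, so a caller that needs a number would have to clean the value before parsing it.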
>>>> It is really fun to participate in the biojava project.
>>>>
>>>> Best regards,
>>>> Michael
>>>>
>>>>
>>>> On Dec 24, 2007 2:32 AM, Mark Schreiber  
>>>> <markjschreiber at gmail.com> wrote:
>>>>> Hi -
>>>>>
>>>>> We are currently merging the code base into subversion (from CVS);
>>>>> after this it will be possible to check in code again.  For small
>>>>> additions it is usually easier to post the code to the dev list  
>>>>> (in
>>>>> the body of the email as the list doesn't like attachments) or  
>>>>> send it
>>>>> to one of the regular committers and get them to add it.
>>>>>
>>>>> The JUnit tests are the standard test package. If you have  
>>>>> added new
>>>>> functionality it would be a good idea to add another test  
>>>>> method in
>>>>> the appropriate JUnit test to make sure it works (and continues to
>>>>> work in the future).
>>>>>
>>>>> - Mark
>>>>>
>>>>>
>>>>> On Dec 23, 2007 11:22 PM, Michael Gang <michaelgang at gmail.com>  
>>>>> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I've now added the extraction of the query length.
>>>>>> Can someone explain to me the procedure for checking in code to
>>>>>> biojava? I ran the unit tests in the biojava distribution. Are there
>>>>>> additional tests available?
>>>>>>
>>>>>> Best regards,
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>> On Dec 21, 2007 9:59 AM, Mark Schreiber  
>>>>>> <markjschreiber at gmail.com> wrote:
>>>>>>> Hi -
>>>>>>>
>>>>>>> It is not required that you turn all Blast results into objects.
>>>>>>> Because it is an event-based parser, you can do what you want with the
>>>>>>> events, including turning them into objects or echoing them to STDOUT.
>>>>>>> Take a look at the examples in the cookbook.
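
As a rough illustration of the "echo the events" option, here is a sketch that uses only the JDK's standard SAX classes. It does not use the BioJava parser: the events come from a small hand-written XML string, and the element and attribute names (Header, QueryId, queryLength) as well as the class EchoEventsSketch are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the XML below only stands in for parser events.
final class EchoEventsSketch {

    // Records one line of text per start-element event instead of building objects.
    static final class EchoHandler extends DefaultHandler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            StringBuilder sb = new StringBuilder(qName);
            for (int i = 0; i < attrs.getLength(); i++) {
                sb.append(' ').append(attrs.getQName(i))
                  .append('=').append(attrs.getValue(i));
            }
            lines.add(sb.toString());
        }
    }

    // Feeds the XML through a SAX parser and returns the echoed events.
    static List<String> echo(String xml) {
        EchoHandler handler = new EchoHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse demo input", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<Header><QueryId id=\"q1\" queryLength=\"1234\"/></Header>";
        for (String line : echo(xml)) {
            System.out.println(line);
        }
    }
}
```

Nothing is kept here beyond the echoed lines, so memory use does not grow with an object model of the report; a handler that builds objects would differ only in what startElement does with each event.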
>>>>>>>
>>>>>>> It may be that the query length is actually parsed but is not  
>>>>>>> passed
>>>>>>> onto the object model by the event listeners.
>>>>>>>
>>>>>>> - Mark
>>>>>>>
>>>>>>>
>>>>>>> On Dec 21, 2007 12:15 AM, Andreas Prlic <ap3 at sanger.ac.uk>  
>>>>>>> wrote:
>>>>>>>> Hi Michael,
>>>>>>>>
>>>>>>>> The blast parser (BlastLikeSaxParser) in BioJava has been  
>>>>>>>> around for
>>>>>>>> a while and is frequently being used to parse a variety
>>>>>>>> of different blast outputs. Still, it is not complete and cannot
>>>>>>>> parse PSI blast. We have had a number of requests about it lately,
>>>>>>>> so I suppose it needs a little maintenance now.
>>>>>>>>
>>>>>>>> To write a new blast parser from scratch will involve a  
>>>>>>>> significant
>>>>>>>> amount of time. It will take time to fix all the bugs, add  
>>>>>>>> support
>>>>>>>> for the different blast versions and write documentation.  
>>>>>>>> Much of
>>>>>>>> this is already available in BioJava, so I would prefer it if you could
>>>>>>>> submit patches for the current blast parser.  Would you also be
>>>>>>>> interested in collaborating in this direction?
>>>>>>>> Another feature that would be nice to add support for is the
>>>>>>>> possibility to send off blast searches to webservices...
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 20 Dec 2007, at 12:54, Michael Gang wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I used the interface of the java blast parser.
>>>>>>>>> I had mainly two problems with it:
>>>>>>>>> 1) The blast parser does not parse all the information (for  
>>>>>>>>> example
>>>>>>>>> query length)
>>>>>>>>> 2) The blast parser parses the whole blast report into a  
>>>>>>>>> list which
>>>>>>>>> eats a lot of memory.
>>>>>>>>>
>>>>>>>>> I would be interested to write and contribute a blast  
>>>>>>>>> parser which
>>>>>>>>> parses all the information of the blast and parses the blast
>>>>>>>>> iteratively.
>>>>>>>>> Something like the following code in bioperl (just in Java).
>>>>>>>>>   use Bio::SearchIO;
>>>>>>>>>     # format can be 'fasta', 'blast'
>>>>>>>>>     my $searchio = new Bio::SearchIO( -format => 'blastxml',
>>>>>>>>>                                       -file   => 'blastout.xml' );
>>>>>>>>>     while ( my $result = $searchio->next_result() ) {
>>>>>>>>>        while( my $hit = $result->next_hit ) {
>>>>>>>>>         # process the Bio::Search::Hit::HitI object
>>>>>>>>>            while( my $hsp = $hit->next_hsp ) {
>>>>>>>>>             # process the Bio::Search::HSP::HSPI object
>>>>>>>>>            }
>>>>>>>>>        }
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> Would you be interested in such a contribution ?
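
One possible shape for such an iterative reader in Java is sketched below. It is only an illustration of the pull-style idea, not an existing BioJava API: the class ResultIteratorSketch is hypothetical, it returns each result as raw text rather than as a parsed object, and it assumes that every result in a plain-text report starts with a line beginning with "Query=".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical pull-style reader: holds one result in memory at a time.
final class ResultIteratorSketch implements Iterator<String> {
    private final BufferedReader in;
    private String pendingHeader; // "Query=" line that opens the next result

    ResultIteratorSketch(BufferedReader in) {
        this.in = in;
        this.pendingHeader = readUntilHeader(null);
    }

    // Reads lines until the next "Query=" line or end of input.
    // Lines before that header are appended to 'record' when it is non-null.
    private String readUntilHeader(StringBuilder record) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("Query=")) {
                    return line;
                }
                if (record != null) {
                    record.append('\n').append(line);
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return pendingHeader != null;
    }

    @Override
    public String next() {
        if (pendingHeader == null) {
            throw new NoSuchElementException();
        }
        StringBuilder record = new StringBuilder(pendingHeader);
        pendingHeader = readUntilHeader(record);
        return record.toString();
    }

    public static void main(String[] args) {
        String report = "Query= a\nhit1\nQuery= b\nhit2\nhit3\n";
        Iterator<String> results =
                new ResultIteratorSketch(new BufferedReader(new StringReader(report)));
        while (results.hasNext()) {
            System.out.println("--- result ---");
            System.out.println(results.next());
        }
    }
}
```

The same loop structure as the bioperl snippet would follow if next() returned a result object with its own hit iterator; the point of the sketch is only that the caller pulls one result at a time instead of receiving the whole report as a list.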
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Michael
>>>>>>>>> _______________________________________________
>>>>>>>>> biojava-dev mailing list
>>>>>>>>> biojava-dev at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>>>>>>
>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Andreas Prlic      Wellcome Trust Sanger Institute
>>>>>>>>                               Hinxton, Cambridge CB10 1SA, UK
>>>>>>>>                               +44 (0) 1223 49 6891
>>>>>>>>
>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>  The Wellcome Trust Sanger Institute is operated by Genome  
>>>>>>>> Research
>>>>>>>>  Limited, a charity registered in England with number  
>>>>>>>> 1021457 and a
>>>>>>>>  company registered in England with number 2742969, whose  
>>>>>>>> registered
>>>>>>>>  office is 215 Euston Road, London, NW1 2BE.
>>>>>>>>
>>
>
