[Biojava-l] How do I get set of protein interactors for a specific protein from my code

jitesh dundas jbdundas at gmail.com
Tue Dec 1 13:39:56 UTC 2009


Dear Sir/Madam,

Many thanks for your help in solving my previous problem (the XML parsing error).

I am writing a program to fetch details regarding any protein from the NCBI
database.  However, I do not know how to fetch the details of
Protein-Protein interactors for a specific protein.

For example, if I click on protein p53, it should give me a list of protein
interactors for p53, such as the one shown at
http://www.hprd.org/interactions?hprd_id=01859&isoform_id=01859_1&isoform_name=

Can someone please tell me how I can get this data from NCBI or any other
source? Which database should I consider, and which parameters are involved?


I am using Java for this.

Thanks in advance.

Regards,
Jitesh Dundas

On Tue, Nov 24, 2009 at 8:21 PM, Richard Holland
<holland at eaglegenomics.com> wrote:

> Jitesh - I forwarded your response to the list so that everyone can get the
> chance to reply.
>
> cheers,
> Richard
>
> Begin forwarded message:
>
> *From: *jitesh dundas <jbdundas at gmail.com>
> *Date: *24 November 2009 14:47:00 GMT
> *To: *Richard Holland <holland at eaglegenomics.com>
> *Subject: **Re: [Biojava-l] Java Error:- XML Parsing Error: XML or text
> declaration not at start of entity*
>
> Dear Sir,
>
> Thank you for your reply. I figured this problem out by fetching records in
> small sets, e.g. 20 records per page.
>
> It works like pagination: for each new page, we need to hit the URL
> again.
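
The paging described above can be sketched as follows. This is only an illustration, not the code from the thread: the class and method names are hypothetical, and it assumes the ESearch `retstart` and `retmax` parameters (which match the `<RetStart>` and `<RetMax>` elements in the response quoted further down) are used to select each page. It only builds the URLs; fetching them is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of BioJava): builds one ESearch URL per page of results. */
class EsearchPager {
    // Base URL as quoted in this thread.
    static final String BASE = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

    /** URL for the page that starts at record index retStart (0-based). */
    static String pageUrl(String db, String term, int retStart, int retMax) {
        try {
            return BASE + "?db=" + URLEncoder.encode(db, "UTF-8")
                    + "&term=" + URLEncoder.encode(term, "UTF-8")
                    + "&retstart=" + retStart
                    + "&retmax=" + retMax;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** All page URLs needed to cover totalCount records, pageSize records at a time. */
    static List<String> allPageUrls(String db, String term, int totalCount, int pageSize) {
        List<String> urls = new ArrayList<>();
        for (int start = 0; start < totalCount; start += pageSize) {
            urls.add(pageUrl(db, term, start, pageSize));
        }
        return urls;
    }

    public static void main(String[] args) {
        // 2034 is the <Count> value from the ESearch response quoted in this thread.
        List<String> urls = allPageUrls("pubmed", "cancer", 2034, 20);
        System.out.println(urls.size() + " pages; first: " + urls.get(0));
    }
}
```

The total number of records would come from the `<Count>` element of the first response, so the first request has to be made before the remaining page URLs can be computed.
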
>
> My functionality is working fine. I will be happy to share my code with you
> (and anyone else) who needs it.
>
> I simply fetch data from the URL and write to an XML file. Next I just read
> the XML file and show them in the web page to the user.
>
> Again, I need to know how to fetch records from the protein database. I
> suspect two requests are needed.
>
> First we use the ESearch utility to find the record, and then the EFetch
> utility to get the data for the specific protein.
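
A minimal sketch of that two-step flow, under stated assumptions: the class and method names are hypothetical, the EFetch URL uses the `db`, `id` and `retmode` parameters, and the ESearch XML has the shape quoted later in this thread (a list of `<Id>` elements). The network calls themselves are omitted so that the logic can be tried on a saved response.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: turns an ESearch response into the matching EFetch URL. */
class EsearchToEfetch {
    static final String EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi";

    /** Step 1: collect the text of every <Id> element in an ESearch response. */
    static List<String> extractIds(String esearchXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve the external DTD to an empty document so parsing needs no network access.
            builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(esearchXml)));
            NodeList nodes = doc.getElementsByTagName("Id");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getTextContent().trim());
            }
            return ids;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a parseable ESearch response", e);
        }
    }

    /** Step 2: build the EFetch URL that requests the full records for those IDs. */
    static String efetchUrl(String db, List<String> ids) {
        return EFETCH + "?db=" + db + "&id=" + String.join(",", ids) + "&retmode=xml";
    }

    public static void main(String[] args) {
        String sample = "<eSearchResult><IdList><Id>19877350</Id><Id>19877304</Id></IdList></eSearchResult>";
        System.out.println(efetchUrl("pubmed", extractIds(sample)));
    }
}
```

For the protein case the same two calls would be made with `db=protein`; which return format is most useful there is not settled in this thread.
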
>
> I welcome any suggestions on this!
>
> Thank you everyone for your help.
>
> Regards,
> Jitesh Dundas
>
> On 11/24/09, Richard Holland <holland at eaglegenomics.com> wrote:
>>
>> Your program takes an input 'txtURLString' - could you give an example of
>> the value that this usually contains? I suspect that this URL is where your
>> problem lies but without seeing an example value I couldn't say for sure.
>>
>> thanks,
>> Richard
>>
>> On 8 Nov 2009, at 10:22, jitesh dundas wrote:
>>
>> > Dear Sir,
>> >
>> > My program is working fine and can send me an XML file with 20
>> > records. However, it does not let me fetch larger numbers of
>> > records.
>> >
>> > For example, if I enter "cancer" it returns only 20 records.
>> >
>> > Can you please tell me what I should do next to get all those records?
>> > Thank you in advance.
>> >
>> > Regards,
>> > Jitesh Dundas
>> >
>> > On Sun, Nov 1, 2009 at 9:36 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
>> >>
>> >> Hi Jitesh,
>> >>
>> >> It is hard to read your code with all the formatting off, probably due
>> >> to email, and with many commented lines that don't seem to get used. Can
>> >> you provide the stack trace, so we can see what part of BioJava is
>> >> affected?
>> >>
>> >> Probably a good strategy to write and debug this is to simplify the
>> >> problem into smaller steps. Try to first download the files you want to
>> >> parse and write the code to parse them from the local file. That will
>> >> avoid any issues you might encounter with networking and server/client
>> >> communication. Once the parsing is working you could take it to the next
>> >> step and add the server communication...
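
One way to follow that advice, as a rough sketch only: save one ESearch response to a local file and develop the parsing against that file. The class and method names below are hypothetical, and the element names (`Count`, `RetMax`, `RetStart`) are taken from the response quoted further down in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper: works on a saved copy of an ESearch response, with no network access. */
class LocalEsearchFile {

    /** Step 1 (done once): store the downloaded XML text in a temporary local file. */
    static Path saveToTempFile(String xml) {
        try {
            Path file = Files.createTempFile("esearch", ".xml");
            Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 2: parse the local file and return {Count, RetMax, RetStart}. */
    static int[] readCounts(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve the external DTD to an empty document so no network access is needed.
            builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(in);
            return new int[] { first(doc, "Count"), first(doc, "RetMax"), first(doc, "RetStart") };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not parse " + file, e);
        }
    }

    /** The first element in document order; for <Count> that is the overall total. */
    private static int first(Document doc, String tag) {
        return Integer.parseInt(doc.getElementsByTagName(tag).item(0).getTextContent().trim());
    }

    public static void main(String[] args) {
        Path file = saveToTempFile("<eSearchResult><Count>2034</Count><RetMax>20</RetMax>"
                + "<RetStart>0</RetStart></eSearchResult>");
        int[] counts = readCounts(file);
        System.out.println("Count=" + counts[0] + " RetMax=" + counts[1] + " RetStart=" + counts[2]);
    }
}
```

Once this works on the saved file, the same parsing method can be pointed at a stream from the server without further changes to the parsing logic.
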
>> >>
>> >> Andreas
>> >>
>> >>
>> >>
>> >>
>> >> On Sun, Nov 1, 2009 at 7:41 AM, jitesh dundas <jbdundas at gmail.com> wrote:
>> >>>
>> >>> Hi friends,
>> >>>
>> >>> I am getting this error on doing a POST (using the code below) to this
>> >>> URL:
>> >>>
>> >>> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=cancer&reldate=10
>> >>>
>> >>> I have written this code in a .jsp file. Later I will change it into a
>> >>> servlet.
>> >>>
>> >>> Error:-
>> >>> XML Parsing Error: XML or text declaration not at start of entity
>> >>> Location:
>> >>> http://localhost:8080/ProteomDb/ImportFromPubmed2.jsp?txtDbName=pubmed&txtTerm=cancer&txtreldate=10&comSDay=01&comSMonth=01&txtSYear=&comEDay=01&comEMonth=01&txtEYear=&txtURLString=http%3A%2F%2Feutils.ncbi.nlm.nih.gov%2Fentrez%2Feutils%2Fesearch.fcgi%3Fdb%3Dpubmed%26term%3Dcancer%26reldate%3D10&txtsubmit=Fetch+Data+From+NCBI
>> >>> Line Number 11, Column 1:
>> >>> <?xml version="1.0" ?>
>> >>> <!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN"
>> >>>   "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
>> >>> <eSearchResult><Count>2034</Count><RetMax>20</RetMax><RetStart>0</RetStart>
>> >>> <IdList>
>> >>>   <Id>19877350</Id>        <Id>19877304</Id>        <Id>19877297</Id>
>> >>>   <Id>19877284</Id>        <Id>19877271</Id>        <Id>19877265</Id>
>> >>>   <Id>19877250</Id>        <Id>19877245</Id>        <Id>19877226</Id>
>> >>>   <Id>19877210</Id>        <Id>19877179</Id>        <Id>19877175</Id>
>> >>>   <Id>19877161</Id>        <Id>19877159</Id>        <Id>19877158</Id>
>> >>>   <Id>19877123</Id>        <Id>19877122</Id>        <Id>19877120</Id>
>> >>>   <Id>19877119</Id>        <Id>19877118</Id>
>> >>> </IdList>
>> >>> <TranslationSet><Translation>
>> >>>   <From>cancer</From>
>> >>>   <To>"neoplasms"[MeSH Terms] OR "neoplasms"[All Fields] OR "cancer"[All Fields]</To>
>> >>> </Translation></TranslationSet>
>> >>> <TranslationStack>
>> >>>   <TermSet> <Term>"neoplasms"[MeSH Terms]</Term> <Field>MeSH Terms</Field> <Count>2082133</Count> <Explode>Y</Explode> </TermSet>
>> >>>   <TermSet> <Term>"neoplasms"[All Fields]</Term> <Field>All Fields</Field> <Count>1634731</Count> <Explode>Y</Explode> </TermSet>
>> >>>   <OP>OR</OP>
>> >>>   <TermSet> <Term>"cancer"[All Fields]</Term> <Field>All Fields</Field> <Count>902537</Count> <Explode>Y</Explode> </TermSet>
>> >>>   <OP>OR</OP>
>> >>>   <OP>GROUP</OP>
>> >>>   <TermSet> <Term>2009/10/22[EDAT]</Term> <Field>EDAT</Field> <Count>0</Count> <Explode>Y</Explode> </TermSet>
>> >>>   <TermSet> <Term>2009/11/01[EDAT]</Term> <Field>EDAT</Field> <Count>0</Count> <Explode>Y</Explode> </TermSet>
>> >>>   <OP>RANGE</OP>
>> >>>   <OP>AND</OP>
>> >>> </TranslationStack>
>> >>> <QueryTranslation>("neoplasms"[MeSH Terms] OR "neoplasms"[All Fields] OR "cancer"[All Fields]) AND 2009/10/22[EDAT] : 2009/11/01[EDAT]</QueryTranslation>
>> >>> </eSearchResult>
>> >>> ^
>> >>>
>> >>> As you can see, the XML output comes back fine, but the above error does
>> >>> not go away. The output of this program should be the same as opening
>> >>> the above URL manually in the browser.
>> >>> The browser is Mozilla Firefox.
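
The error position (line 11, column 1) suggests, although nothing in this thread confirms it, that the directive lines and blank lines at the top of the JSP are written to the response as newlines, so the XML declaration is no longer the first thing in the output. The sketch below is not the JSP itself; it is a small standalone class with hypothetical names that shows the underlying XML rule using the JDK parser: the same document is rejected when newlines come before the declaration and accepted once they are removed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical demo: an XML declaration is only legal at the very start of a document. */
class XmlDeclarationPosition {

    /** Returns true if the JDK's default XML parser accepts the text as well-formed. */
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String body = "<?xml version=\"1.0\" ?><eSearchResult><Count>2034</Count></eSearchResult>";
        // Simulates newlines emitted before the declaration (an assumption about the JSP output).
        String withLeadingNewlines = "\n\n\n" + body;

        System.out.println("with leading newlines: " + parses(withLeadingNewlines));
        System.out.println("after trimming:        " + parses(withLeadingNewlines.trim()));
    }
}
```

If that is indeed the cause, one option in a JSP 2.1 container is the page directive attribute `trimDirectiveWhitespaces="true"`; another is to keep all directives and the opening scriptlet tag on a single line so that nothing is emitted before the declaration.
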
>> >>>
>> >>> Code:-
>> >>>
>> >>> <%@ page language = "java" %>
>> >>> <%@ page import = "java.sql.*" %>
>> >>> <%@ page import = "java.util.*" %>
>> >>> <%@ page import = "java.io.*" %>
>> >>> <%@ page import="java.lang.*" %>
>> >>> <%@ page import="java.net.*" %>
>> >>> <%@ page import="java.nio.*" %>
>> >>> <%@ page contentType="text/xml; charset=utf-8" pageEncoding="UTF-8" %>
>> >>>
>> >>>
>> >>> <%
>> >>>
>> >>> try
>> >>> {
>> >>>    //String str = "<?xml version='1.0' ?>";
>> >>>    //out.println("<?xml version='1.0' encoding='utf-8' ?>");
>> >>>
>> >>>    Properties systemSettings = System.getProperties();
>> >>>    systemSettings.put("http.proxyHost", "********");
>> >>>    systemSettings.put("http.proxyPort", "******");
>> >>>    systemSettings.put("sun.net.client.defaultConnectTimeout", "10000");
>> >>>    systemSettings.put("sun.net.client.defaultReadTimeout", "10000");
>> >>>
>> >>>     //out.println("Properties Set");
>> >>>    Authenticator.setDefault(new Authenticator()
>> >>>    {
>> >>>          protected PasswordAuthentication getPasswordAuthentication()
>> >>>          {
>> >>>                  // specify your user name and password for the iitb login
>> >>>                  return new PasswordAuthentication("**", "******".toCharArray());
>> >>>          }
>> >>>    });
>> >>>
>> >>>
>> >>>   System.setProperties(systemSettings);
>> >>>   //out.println("After Authentication & Properties Settings");
>> >>>
>> >>>   //create xml file.
>> >>>   //the input to google api
>> >>>   //String textAreaContent = request.getParameter("text");
>> >>>   String textAreaContent = "This si a tst";
>> >>>
>> >>>   String str = "<?xml version='1.0' encoding='utf-8' ?>";
>> >>>
>> >>>   //xml file generation ends here..
>> >>>   //FetchDataFromNCBI_URLString.jsp
>> >>>   String URLString = request.getParameter("txtURLString").trim();
>> >>>
>> >>>   //URL url = new URL("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=protein&term=BAA20519");
>> >>>   URL url = new URL(URLString); //url string taken from user input.
>> >>>   HttpURLConnection connection = null;
>> >>>
>> >>>   connection = (HttpURLConnection) url.openConnection();
>> >>>   System.out.println("After open connection");
>> >>>   connection.setRequestMethod("POST");
>> >>>   connection.setDoInput(true);
>> >>>   connection.setDoOutput(true);
>> >>>
>> >>>   connection.setUseCaches(false);
>> >>>   connection.setAllowUserInteraction(false);
>> >>>   //connection.setFollowRedirects(true);
>> >>>   //connection.setInstanceFollowRedirects(true);
>> >>>   //System.out.println("Before-------------------");
>> >>>   connection.setRequestProperty("Content-Type", "text/xml; charset=\"utf-8\"");
>> >>>   //System.out.println("After-------------------");
>> >>>
>> >>>   //System.out.println(""+ connection.getOutputStream());
>> >>>
>> >>>   //System.out.println("After dataoutputstream..Line No-65");
>> >>>
>> >>>   //System.out.println("Response Code="+ connection.getResponseCode);
>> >>>
>> >>>   OutputStreamWriter dosout = new OutputStreamWriter(connection.getOutputStream());
>> >>>   //System.out.println("After dosout object..Line No-63");
>> >>>   //dosout.write(str);
>> >>>   dosout.close ();
>> >>>
>> >>>   BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
>> >>>
>> >>>   String decodedString;
>> >>>   String tempstr = "";
>> >>>
>> >>>
>> >>>   while ((decodedString = in.readLine()) != null)
>> >>>   {
>> >>>       tempstr = tempstr + decodedString;
>> >>>       //out.println(decodedString);
>> >>>   }
>> >>>   out.println(tempstr);
>> >>>   in.close();
>> >>> }
>> >>> catch(Exception ex)
>> >>> {
>> >>> out.println("Exception->"+ex);
>> >>> PrintWriter pw = response.getWriter();
>> >>> ex.printStackTrace(pw);
>> >>> }
>> >>>
>> >>>
>> >>> %>
>> >>>
>> >>> Thanks in advance..
>> >>>
>> >>> Regards,
>> >>> JItesh Dundas
>> >>>
>> >>> _______________________________________________
>> >>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> >>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >>
>> >>
>> > <ImportFromPubmed3.jsp>
>>
>> --
>> Richard Holland, BSc MBCS
>> Operations and Delivery Director, Eagle Genomics Ltd
>> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
>> http://www.eaglegenomics.com/
>>
>>
>
> --
> Richard Holland, BSc MBCS
> Operations and Delivery Director, Eagle Genomics Ltd
> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/
>
>



