[Bioperl-l] PSI-BLAST Matrix Parser?

James Thompson tex at biosysadmin.com
Tue Sep 14 02:04:34 EDT 2004


Stefan,

That does make sense. I'll get started on it this weekend. I think I'll try
my hand at adding Bio::Matrix::PSM::ProtMatrix, since a lot of the DNA stuff
seems hardcoded into the SiteMatrix object.
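
For illustration only, here is a minimal sketch of the per-position data a
protein matrix class would have to hold. It assumes each data row of the
matrix file carries a position, a residue, 20 scores, 20 weighted
percentages, the information per position, and the relative weight, in that
order. The helper name parse_pssm_line, the hash keys, the residue column
order, and the sample row are all assumptions made up for this example;
none of it is existing Bioperl API.

```perl
use strict;
use warnings;

# Assumed order of the residue columns in the matrix header line.
my @ALPHABET = qw(A R N D C Q E G H I L K M F P S T W Y V);

# Hypothetical helper: turn one data row into a hash reference.
# Returns nothing (undef in scalar context) for header, blank, and
# trailing statistics lines, which do not have the expected shape.
sub parse_pssm_line {
    my ($line) = @_;
    my @f = split ' ', $line;

    # position + residue + 20 scores + 20 percentages + information + weight
    return unless @f == 2 + 2 * @ALPHABET + 2;
    return unless $f[0] =~ /^\d+$/ && $f[1] =~ /^[A-Za-z]$/;

    my %row = (position => $f[0], residue => $f[1]);
    @{ $row{scores} }{@ALPHABET}  = @f[2 .. 21];    # position-specific scores
    @{ $row{percent} }{@ALPHABET} = @f[22 .. 41];   # weighted observed percentages
    @row{qw(information weight)}  = @f[42, 43];
    return \%row;
}

# Made-up sample row, only to show the shape of the result.
my $sample = '    1 M    -1 -2 -2 -3 -2 -1 -2 -3 -2  1  2 -2  6  0 -3 -2 -1 -2 -1  1'
           . '    0   0   0   0   0   0   0   0   0   0   0   0 100   0   0   0   0   0   0   0'
           . '  0.55 0.01';
my $row = parse_pssm_line($sample);
printf "pos %d (%s): score for M = %d, information = %s\n",
    $row->{position}, $row->{residue}, $row->{scores}{M}, $row->{information};
```

The 20-letter alphabet is the main reason a DNA-only SiteMatrix does not fit:
every per-position vector is keyed by amino acid rather than by A, C, G, T.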

Thanks a lot for the help. :)

James Thompson



On Wed, 8 Sep 2004, Stefan Kirov wrote:

> This seems reasonable to me. The one thing you need to consider is the
> structure that should contain the matrix. The current design of
> Bio::Matrix::PSM::Psm and Bio::Matrix::PSM::SiteMatrix does not allow this,
> as SiteMatrix is a DNA-only object.
> There are two ways to go: either change SiteMatrix to accept protein matrix
> data, or add a protein matrix class to Bio::Matrix::PSM (say
> Bio::Matrix::PSM::ProtMatrix) to hold the data, and make
> Bio::Matrix::PSM::Psm inherit from that class and be able to contain the
> object (as it is actually a container right now).
> So you will have something like:
>   # This will call the actual parser (Bio::Matrix::PSM::IO::psiblast)
>   my $psmIO = Bio::Matrix::PSM::IO->new(-file => $file, -format => 'psi-blast');
>   my $header = $psmIO->....  # I guess there will be some header data
> 
>   while (my $psm = $psmIO->next_psm) {
>     my $psimatrix = $psm->protmatrix;  # This will be a Bio::Matrix::PSM::ProtMatrix object
>     $psimatrix->.....;  # Now process the data parsed into this object through its methods...
>   }
> 
> If you do this maybe you should get an account and commit it yourself?
> Does this make sense to you?
> Stefan
> 
> James Thompson wrote:
> 
> >Stefan,
> >
> >Thanks for the response. For reading in the actual alignment I would use
> >Bio::AlignIO to read the PSI-BLAST output as it's just another alignment file,
> >but the matrix file that I'm talking about is slightly different. Now that
> >I've perused CVS more and learned more about how the Bio::Matrix::PSM modules
> >work, I think I have a clearer picture of what I'd like to do.
> >
> >If you run PSI-BLAST with the -Q option, it will take the matrix that it
> >used for the position-specific search and output it to a file. I've put
> >one of my matrix files up here if you'd like to look at it:
> >
> >http://bioinformatics.rit.edu/~tex/atp1.matrix
> >
> >Basically I'd like to make some Bio::Matrix::PSM::Psm objects (or at least
> >a PsmI-compliant object), and I think that the correct way to do this would
> >be to add a file format parser to Bio::Matrix::PSM::IO. Currently in Bioperl
> >there are three format parsers:
> >   - mast
> >   - meme
> >   - transfac
> >
> >None of these work with the PSI-BLAST matrix files.  I'd like to write a new
> >matrix file parser (perhaps called psi-blast?) in the spirit of the three other
> >parsers.
> >
> >If I were to write this, could someone commit it for me? 
> >
> >James Thompson
> >
> >On Tue, 7 Sep 2004, Stefan A Kirov wrote:
> >
> >  
> >
> >>I am not sure what object you are going to store your data in... Are you
> >>going to develop your own class to hold the data or use an existing one?
> >>Also is there any reason not to use Bio::AlignIO (it reads PSI-Blast as
> >>far as I know)?
> >>Stefan
> >>
> >>
> >>On Tue, 7 Sep 2004, James Thompson wrote:
> >>
> >>    
> >>
> >>>Dear Bioperl-ers,
> >>>
> >>>I'd like to parse the output of a PSI-BLAST matrix, and I was wondering if
> >>>there was a Bioperl way of parsing these files. If not, I'd like to make my
> >>>code general enough to be committed, and I'd like some advice on where exactly
> >>>to put such a module. From my cursory knowledge of Bioperl, I think that adding
> >>>another format parser to Bio::Matrix::PSM::IO would be a good way to go.
> >>>
> >>>I have a couple of questions:
> >>>- Does anyone know what the PSI-BLAST matrix format is called?
> >>>- Is this the correct place in which to put code for parsing this type of file?
> >>>
> >>>The file format represents a position-specific scoring matrix with some added
> >>>statistical information. Here's a general overview of the information available
> >>>from the matrix file:
> >>>
> >>>Last position-specific scoring matrix computed, weighted observed percentages
> >>>rounded down, information per position, and relative weight of gapless real
> >>>matches to pseudocounts.
> >>>
> >>>Any help is greatly appreciated.
> >>>
> >>>James Thompson
> >>>
> >>>_______________________________________________
> >>>Bioperl-l mailing list
> >>>Bioperl-l at portal.open-bio.org
> >>>http://portal.open-bio.org/mailman/listinfo/bioperl-l


