[Bioperl-l] Extract field from Medline

Barry Moore bmoore at genetics.utah.edu
Wed Dec 7 09:13:15 EST 2005


Andrej-

It doesn't really sound like you need Bioperl for this one - just some
loops and regular expressions.  I can't offer much help without seeing
your file formats, but a boilerplate might look like this:

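#!/usr/bin/perl

use strict;
use warnings;

# Usage: script.pl terms_file medline_file
my $file_terms = shift;
my $file_medline = shift;
open (my $term_fh, '<', $file_terms) or die "Can't open $file_terms: $!";
open (my $medl_fh, '<', $file_medline) or die "Can't open $file_medline: $!";

# Assumes one term per line; strip newlines and drop empty lines
chomp(my @terms = <$term_fh>);
@terms = grep { length } @terms;

# Assumes one record per line: PMID, title and abstract, tab-separated
while (my $line = <$medl_fh>) {
	chomp $line;
	my ($pmid, $ti, $ab) = split /\t/, $line;
	next unless defined $ti;
	$ab = '' unless defined $ab;
	for my $term (@terms) {
		# \Q...\E makes the term match literally, not as a regex
		if (grep { /\Q$term\E/ } $ti, $ab) {
			print "$pmid\t$ti\t$ab\n";
			last;	# print each record only once
		}
	}
}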
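
Andrej's message mentions one line per field rather than one line per
record.  If the file still carries the standard MEDLINE tags (PMID-,
TI  -, AB  -), something along these lines might be closer.  This is only
a sketch: the helper name matching_records and the exact tag layout are
my assumptions, and it has not been run against real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group "TAG - value" lines into records and return
# those whose title or abstract contains any of the terms.
sub matching_records {
    my ($lines, $terms) = @_;
    my (@hits, %rec);

    # Check the record collected so far, then start a new one
    my $flush = sub {
        if (defined $rec{PMID}) {
            my @fields = map { defined $_ ? $_ : '' } @rec{qw(PMID TI AB)};
            my $text   = join ' ', @fields[1, 2];
            push @hits, \@fields if grep { $text =~ /\Q$_\E/ } @$terms;
        }
        %rec = ();
    };

    for my $line (@$lines) {
        # Assumed layout: tag, optional padding, "-", then the value
        my ($tag, $value) = $line =~ /^(PMID|TI|AB)\s*-\s*(.*?)\s*$/
            or next;
        $flush->() if $tag eq 'PMID';    # a PMID line opens a new record
        $rec{$tag} = $value;
    }
    $flush->();                          # do not forget the last record
    return @hits;
}

# Run as a script: perl script.pl terms_file medline_file
if (@ARGV == 2) {
    my ($file_terms, $file_medline) = @ARGV;
    open(my $term_fh, '<', $file_terms)   or die "Can't open $file_terms: $!";
    open(my $medl_fh, '<', $file_medline) or die "Can't open $file_medline: $!";
    chomp(my @terms = <$term_fh>);
    @terms = grep { length } @terms;
    my @lines = <$medl_fh>;
    print join("\t", @$_), "\n" for matching_records(\@lines, \@terms);
}
```

Keeping the matching in a sub that takes plain lists makes it easy to try
on a few hand-written lines before pointing it at the full file.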

-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Andrej
Kastrin
Sent: Wednesday, December 07, 2005 5:40 AM
To: bioperl-l at portal.open-bio.org
Subject: [Bioperl-l] Extract field from Medline

Hello all,

Big problem for me, small for you (as I'm a noob in Perl). I have a
list of terms (i.e. genes, gene products) in row data format. Now I have
to parse Medline (standard Medline format) and extract the PMID, TI and AB
(ID number, Title and Abstract) fields which involve any term in my term
list. I have already transformed the Medline "multiline" format to
"single" line, so there is only one line per field.

How to start? Thanks for any suggestion.
Best, Andrej

_______________________________________________
Bioperl-l mailing list
Bioperl-l at portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l


