Some useful links about bioinformatics
Quosa
http://www.quosa.com/downloads.html
NCBI is here:
http://www.ncbi.nlm.nih.gov/
PubMed is here:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
The E-utilities, for automated extraction of metadata and many other things, are here:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
xpdf, for text extraction from PDFs:
http://www.foolabs.com/xpdf/download.html
The XML marked-up full-text corpus is here:
http://www.biomedcentral.com/info/about/datamining/
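
As a rough illustration of what "automated extraction of metadata" with the E-utilities can look like, here is a minimal Python sketch that only builds the request URLs for ESearch (find record IDs for a query) and EFetch (download the records as XML). The base address, the helper names build_esearch_url and build_efetch_url, the default values, and the example IDs are assumptions made for illustration and are not taken from the links above; check the E-utilities help page for the current parameters and usage limits before sending real requests.

```python
from urllib.parse import urlencode

# Assumed base address of the E-utilities endpoints; verify against the help page.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="pubmed", retmax=20):
    """Build an ESearch URL that asks for the IDs of records matching `term`.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(ids, db="pubmed"):
    """Build an EFetch URL that asks for the given record IDs as XML.

    Hypothetical helper: `ids` is any iterable of ID strings or integers.
    """
    params = {"db": db, "id": ",".join(str(i) for i in ids), "retmode": "xml"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # The query and the IDs below are placeholders, not real search results.
    print(build_esearch_url("text mining"))
    print(build_efetch_url([111, 222]))
```

The URLs can then be fetched with any HTTP client, for example urllib.request.urlopen, and the returned XML parsed for the metadata of interest.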