TABLE 1 | Examples of text-mining tools and resources


Text-mining solutions for biomedical research: enabling integrative biology

Dietrich Rebholz-Schuhmann, Anika Oellrich & Robert Hoehndorf

Nature Reviews Genetics 13, 829-839 (December 2012)


Name | Content | Input | Description | URL

Information retrieval
PubMed | Abstracts | Standard query | Retrieves abstracts of scientific publications according to a user query. Results are provided as a list and can be further filtered with Medical Subject Headings (MeSH) terms and an advanced search functionality | http://www.ncbi.nlm.nih.gov/pubmed
GoPubMed | Abstracts | Standard query | Retrieves publications from MEDLINE and classifies them according to Gene Ontology concepts to allow improved screening of results | http://www.gopubmed.com/web/gopubmed
RefMED | Any text | Standard query | Allows users to submit relevance feedback and learns from that feedback how to search PubMed for relevant articles | http://dm.postech.ac.kr/refmed
UK PubMed Central (UKPMC) | Full text | Standard query | Retrieves full-text documents from PubMed and mines the documents for mentions of genes, drugs and Gene Ontology concepts using the Whatizit infrastructure | http://ukpmc.ac.uk
PolySearch | Abstracts, databases | Standard query | Retrieves information (such as documents and database entries) according to particular patterns of queries. Supports 50 different classes of queries | http://wishart.biology.ualberta.ca/polysearch/index.htm

Information extraction
Textpresso | Full text | Standard query | Provides extracted statements containing entities of interest from a subset of full-text articles. The subset is determined by Textpresso itself: for example, only mouse- or worm-specific articles | http://www.textpresso.org
CoPub | Abstracts | Concepts or identifiers | Retrieves co-occurring biomedical concepts from MEDLINE abstracts. The user specifies a list of concepts or identification numbers and receives an overview of co-occurring concepts divided into categories | http://services.nbic.nl/copub/portal
iHOP | Abstracts | Standard query | Processes MEDLINE abstracts and generates a hyperlinked set of data for protein interactions. iHOP provides interactive functionality for searching genes and related information | http://www.ihop-net.org/UniPub/iHOP
Reflect | Any text | Proteins | Processes documents to highlight proteins and small molecules and to link each entity to reference data resources | http://reflect.embl.de
Open Biomedical Annotator | Any text | Ontologies and configuration parameters | Processes documents to annotate text spans with ontology concepts. Covers all ontologies provided by the BioPortal web page | http://bioportal.bioontology.org/annotator
Side Effect Resource (SIDER) | | | Holds information about the side effects of drugs extracted from drug leaflets and scientific literature | http://sideeffects.embl.de
PharmGKB | | | Provides information about the influences of genetic variation on drug responses. Information is extracted from scientific literature and is partially curated | http://www.pharmgkb.org
BioCaster | | | Retrieves disease-relevant information from Twitter posts and shows current hotspots of disease outbreaks on an interactive map | http://born.nii.ac.jp
Transcript Based Isoform Interaction Database (TBIID) | | | Provides information on human protein isoforms and their differential interactions | http://tbiid.emu.edu.tr
STITCH | | | Holds known and predicted interactions of small molecules and proteins, partially derived from scientific literature | http://stitch.embl.de

The table gives an overview of data resources and tools that are available to the public. For each category, a selection has been chosen to demonstrate the purpose of that category.
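
To make the idea of concept co-occurrence concrete (the kind of output the table attributes to CoPub), the following is a minimal sketch and not the implementation of any tool listed above. It assumes a tiny hand-written concept dictionary and three invented sentences standing in for MEDLINE abstracts; the names `CONCEPTS`, `ABSTRACTS`, `find_concepts` and `cooccurrence_counts` are hypothetical. Matching is a case-insensitive whole-word lookup, which production systems would replace with more robust entity recognition.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy dictionary: concept label -> surface forms.
# This is an assumption for illustration, not taken from any real resource.
CONCEPTS = {
    "BRCA1": ["brca1"],
    "breast cancer": ["breast cancer", "breast carcinoma"],
    "tamoxifen": ["tamoxifen"],
}


def find_concepts(text, concepts=CONCEPTS):
    """Return the set of concept labels whose surface forms occur in text."""
    lowered = text.lower()
    found = set()
    for label, forms in concepts.items():
        for form in forms:
            # Word boundaries stop a form from matching inside a longer token.
            if re.search(r"\b" + re.escape(form) + r"\b", lowered):
                found.add(label)
                break
    return found


def cooccurrence_counts(abstracts, concepts=CONCEPTS):
    """Count, for each unordered concept pair, the abstracts mentioning both."""
    counts = Counter()
    for abstract in abstracts:
        # Sorting gives every pair one canonical order before counting.
        labels = sorted(find_concepts(abstract, concepts))
        counts.update(combinations(labels, 2))
    return counts


# Invented example texts standing in for MEDLINE abstracts.
ABSTRACTS = [
    "BRCA1 mutations increase the risk of breast cancer.",
    "Tamoxifen is used in the treatment of breast carcinoma.",
    "BRCA1 carriers with breast cancer may receive tamoxifen.",
]

if __name__ == "__main__":
    for pair, n in cooccurrence_counts(ABSTRACTS).most_common():
        print(pair, n)
```

Raw document-level counts like these favour frequent concepts; a real system would typically normalise them with a statistical association measure before ranking pairs.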