TABLE 1 | Examples of text-mining tools and resources
FROM THE FOLLOWING ARTICLE:
Text-mining solutions for biomedical research: enabling integrative biology
Dietrich Rebholz-Schuhmann, Anika Oellrich & Robert Hoehndorf
Nature Reviews Genetics 13, 829-839 (December 2012)
doi:10.1038/nrg3337
Name | Content | Input | Description | URL |
---|---|---|---|---|
Information retrieval | ||||
PubMed | Abstracts | Standard query | Retrieves abstracts of scientific publications according to a user query. Results are provided as a list and can be further filtered with Medical Subject Headings (MeSH) terms and advanced search functionality | http://www.ncbi.nlm.nih.gov/pubmed |
GoPubMed | Abstracts | Standard query | Retrieves publications from MEDLINE and provides additional functionality by classifying publications according to Gene Ontology concepts, which allows improved screening of results | http://www.gopubmed.com/web/gopubmed |
RefMED | Any text | Standard query | Allows the user to submit feedback and learns from that feedback how to search PubMed for relevant articles | http://dm.postech.ac.kr/refmed |
UK PubMed Central (UKPMC) | Full text | Standard query | Retrieves full-text documents from PubMed and mines the documents for mentions of genes, drugs and Gene Ontology concepts using the Whatizit infrastructure | http://ukpmc.ac.uk |
PolySearch | Abstracts, databases | Standard query | Retrieves information (such as documents and database entries) according to particular patterns of queries. Supports 50 different classes of queries | http://wishart.biology.ualberta.ca/polysearch/index.htm |
Information extraction | ||||
Textpresso | Full text | Standard query | Provides extracted statements containing entities of interest from a subset of full-text articles. The subset is determined by Textpresso itself: for example, only mouse- or worm-specific articles | http://www.textpresso.org |
CoPub | Abstracts | Concepts or identifiers | Retrieves co-occurring biomedical concepts from MEDLINE abstracts. The user specifies a list of concepts or identification numbers and receives an overview of co-occurring concepts, divided into categories | http://services.nbic.nl/copub/portal |
iHOP | Abstracts | Standard query | Processes MEDLINE abstracts and generates a hyperlinked set of data for protein interactions. iHOP provides interactive functionality for searching genes and related information | http://www.ihop-net.org/UniPub/iHOP |
Reflect | Any text | Proteins | Processes documents to highlight proteins and small molecules and to link each entity to reference data resources | http://reflect.embl.de |
Open Biomedical Annotator | Any text | Ontologies and configuration parameters | Processes documents to annotate text spans with ontology concepts. Covers all ontologies provided by the BioPortal Web page | http://bioportal.bioontology.org/annotator |
Database | ||||
Side Effect Resource (SIDER) | | | Holds information about the side effects of drugs extracted from drug leaflets and scientific literature | http://sideeffects.embl.de |
PharmGKB | | | Provides information about the influences of genetic variation on drug responses. Information is extracted from scientific literature and is partially curated | http://www.pharmgkb.org |
BioCaster | | | Retrieves disease-relevant information from Twitter messages and shows current hotspots of disease outbreaks on an interactive map | http://born.nii.ac.jp |
Transcript Based Isoform Interaction Database (TBIID) | | | Provides information on human protein isoforms and their differential interactions | http://tbiid.emu.edu.tr |
STITCH | | | Holds known and predicted interactions of small molecules and proteins, partially derived from scientific literature | http://stitch.embl.de |
The table gives an overview of data resources and tools that are available to the public. For each category, a selection of examples has been chosen to illustrate the purpose of that category.
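
The co-occurrence idea behind a tool such as CoPub can be illustrated with a minimal Python sketch. This is not CoPub's actual implementation: the function name `cooccurrence_counts`, the toy abstracts and the naive case-insensitive substring matching are all assumptions made for illustration, and a production system would typically use curated vocabularies rather than plain substring matching.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, concepts):
    """Count how many abstracts mention each pair of concepts together.

    Hypothetical helper: matching is a naive, case-insensitive substring
    test, which is an assumption made to keep the sketch short.
    """
    lowered = [c.lower() for c in concepts]
    counts = Counter()
    for text in abstracts:
        haystack = text.lower()
        # Concepts found in this abstract, de-duplicated and sorted so that
        # each pair always appears in the same order.
        present = sorted({c for c in lowered if c in haystack})
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Toy abstracts invented for this example (not real MEDLINE records).
abstracts = [
    "BRCA1 mutations are associated with breast cancer risk.",
    "Tamoxifen is used to treat breast cancer.",
    "BRCA1 interacts with BARD1 in DNA repair.",
    "Loss of BRCA1 function is common in hereditary breast cancer.",
]
concepts = ["BRCA1", "BARD1", "breast cancer", "tamoxifen"]

counts = cooccurrence_counts(abstracts, concepts)
for pair, n in counts.most_common():
    print(pair, n)
```

Counting at the level of whole abstracts keeps the sketch simple; sentence-level windows or statistical scoring of pairs would be natural refinements, but they are outside what the table describes.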