Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application
We describe the WhiteText project and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the more than 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/.
- University of Toronto Canada
- University of British Columbia Canada
- University of British Columbia Canada
connectome, corpus, Neurosciences. Biological psychiatry. Neuropsychiatry, text mining, Neuroanatomy, neuroanatomical data mining, information retrieval, natural language processing, RC321-571, Neuroscience
