Text Mining and Knowledge Engineering




One of the biggest challenges in any science is that we don’t know what we know. Blue Brain’s open source text and data mining tools now allow anyone to navigate the knowledge contained in vast numbers of papers in an intuitive and interactive manner, and discover facts and non-trivial relations between terms for further analysis and investigation.


Prof. Henry Markram, Founder and Director of the Blue Brain Project


Drawing on its expertise in Machine Learning and in Data and Knowledge Engineering, Blue Brain has built an open source, two-component framework for Knowledge Graph guided literature review: Blue Brain Search and Blue Graph.



Blue Brain Search

Blue Brain Search is a text mining toolbox for semantic literature search and for extracting structured information from text sources.


Blue Graph

Blue Graph is a Python framework for graph analytics and co-occurrence analysis that consolidates capabilities from different graph processing backends under a unified API. Using Blue Graph, users can gain insights, uncover hidden patterns, infer implicit knowledge and build recommendation engines from graph-shaped data. Blue Graph is already being used successfully to perform Knowledge Graph guided literature reviews, which involve building, exploring and analyzing knowledge graphs from text.
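To make the idea of co-occurrence analysis concrete, here is a minimal sketch in plain Python. It does not use Blue Graph or its API; it relies only on the standard library, and the function names and the toy list of papers are hypothetical, chosen purely for illustration. It counts how often two terms are mentioned in the same paper and then ranks terms by the total weight of their connections.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents):
    """Count how many documents mention each unordered pair of terms."""
    counts = Counter()
    for terms in documents:
        # Deduplicate and sort so that each pair is counted once per
        # document and always appears in the same order.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


def weighted_degree(edges):
    """Sum the co-occurrence weights of the edges touching each term."""
    degree = Counter()
    for (term_a, term_b), weight in edges.items():
        degree[term_a] += weight
        degree[term_b] += weight
    return degree


# Hypothetical toy input: the terms extracted from three papers.
papers = [
    ["glucose", "SARS-CoV-2", "inflammation"],
    ["glucose", "SARS-CoV-2"],
    ["glucose", "inflammation", "diabetes"],
]

edges = cooccurrence_edges(papers)    # weighted edges of the graph
degree = weighted_degree(edges)       # a simple centrality measure

for term, score in degree.most_common():
    print(term, score)
```

In this toy graph the edge weights are document-level co-occurrence counts, and weighted degree serves as a simple stand-in for the richer graph metrics that a dedicated graph analytics framework would typically offer.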


Knowledge Graphs are often built from heterogeneous data and knowledge (i.e. data models such as ontologies and schemas) that come from different sources and often in different formats, structured or unstructured. Nexus Forge enables data scientists and data and knowledge engineers to address these challenges by combining, under a consistent and generic Python framework, all the components needed to build and search a Knowledge Graph.


Using the Text and Data Mining tools to reveal how glucose helps the SARS-CoV-2 virus



In response to the COVID-19 pandemic, the COVID-19 Open Research Dataset (CORD-19), the most extensive coronavirus literature collection, was made open access and therefore available for data mining.


Blue Brain Search and Blue Graph enabled Blue Brain scientists to read, analyze and synthesize the knowledge contained in over 240,000 open access scientific papers available in the CORD-19 dataset v47.


The resulting study, "A machine-generated view of the role of Blood Glucose Levels in the severity of COVID-19", was published in Frontiers in Public Health.




This Knowledge Graph guided review of the COVID-19-related literature enabled Blue Brain scientists to reveal how glucose helps the SARS-CoV-2 virus.