13/10/2021 – AI3SD Autumn Seminar Series I: Linked Data, Ontologies & Deep Learning
13th October 2021 @ 2:00 pm - 3:45 pm (Free)
Eventbrite Link: https://ai3sd-autumn-series-131021.eventbrite.co.uk
This seminar forms part of the AI3SD Online Seminar Series, which runs across the autumn from October 2021 to December 2021. The seminar will be held via Zoom. When you register on Eventbrite, you will receive a Zoom registration email alongside your standard Eventbrite registration email. Where speakers have given permission to be recorded, their talks will be made available on our AI3SD YouTube Channel. The theme for this seminar is Linked Data, Ontologies & Deep Learning.
- 14:00-14:45: Automated Chemical Ontology Expansion using Deep Learning – Dr Janna Hastings (UCL)
- 14:45-15:00: Coffee Break
- 15:00-15:45: Towards Biological Plausibility Using Linked Open Data – Dr Egon Willighagen (Maastricht University)
Abstracts & Speaker Bios
- Automated Chemical Ontology Expansion using Deep Learning – Dr Janna Hastings: Ontologies provide a shared vocabulary and semantic resource for a domain. Manual construction enables them to achieve high quality and capture subtle semantic nuances, which is essential for wide acceptance and applicability across a community. However, the manual curation process does not scale for large domains. I will present a methodology for automatic ontology extension based on deep learning using ontology annotations, and show how we apply this methodology to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We use a Transformer-based deep learning architecture trained on the chemical structures of ontology leaf nodes; the system learns to predict membership in multiple mid-level ontology classes, framed as a multi-class classification task. Additionally, I will illustrate how visualizing the model’s attention weights can help to explain the results by providing insight into how the model made its decisions.
Bio: I am a computer scientist interested in developing artificial intelligence-based computational systems to support research across the biological and social sciences. I am particularly interested in the interface between data science, i.e. algorithms for deriving inferences and predictions based on structured and unstructured data, and knowledge science, i.e. research that amasses, integrates and harnesses what we already know and channels that back towards efforts to make novel discoveries, towards a genuinely cumulative discovery frontier. To this end I have actively contributed to research in computational knowledge representation and reasoning, to community-wide knowledge integration via building semantic standards, and to scientific discovery research using computational approaches across a range of domains.
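To make the training setup in the abstract above more concrete, here is a minimal sketch of how membership labels for leaf nodes could be derived from an is-a hierarchy. It is not the speaker's implementation: the toy hierarchy, the target class list, and the function names (`ancestors`, `leaf_nodes`, `label_vector`) are all assumptions made for illustration, and the Transformer model itself is out of scope here.

```python
from typing import Dict, List, Set

# Toy is-a hierarchy (child -> parents). Illustrative only, not real ChEBI content.
PARENTS: Dict[str, List[str]] = {
    "ethanol": ["primary alcohol"],
    "primary alcohol": ["alcohol"],
    "alcohol": ["organic compound"],
    "acetic acid": ["carboxylic acid"],
    "carboxylic acid": ["organic compound"],
    "organic compound": [],
}

# Hypothetical mid-level classes the model would learn to predict.
TARGETS: List[str] = ["alcohol", "carboxylic acid", "organic compound"]


def ancestors(node: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every class reachable from `node` by following is-a links upward."""
    seen: Set[str] = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def leaf_nodes(parents: Dict[str, List[str]]) -> List[str]:
    """Leaves are classes that never appear as the parent of another class."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(n for n in parents if n not in has_child)


def label_vector(leaf: str, targets: List[str],
                 parents: Dict[str, List[str]]) -> List[int]:
    """One 0/1 entry per target class: 1 if the leaf is a member of that class."""
    anc = ancestors(leaf, parents)
    return [1 if t in anc else 0 for t in targets]


# Training labels: each leaf maps to its membership vector over TARGETS.
TRAINING_LABELS = {
    leaf: label_vector(leaf, TARGETS, PARENTS) for leaf in leaf_nodes(PARENTS)
}
```

In a real pipeline, each leaf would also carry a chemical structure representation as model input, and vectors like these would serve as the prediction targets.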
- Towards Biological Plausibility Using Linked Open Data – Dr Egon Willighagen: Behind risk assessment is experimental evidence. Behind biological knowledge is primary literature. However, because the amount of knowledge keeps growing and our experimental technologies keep advancing and becoming more complex, even experts can no longer keep up with progress in mechanistic understanding outside their own increasingly specialized domain. At the same time, the number of biological questions with a simple answer keeps dropping, and many modern questions have complex answers. Access to the right facts at the right time needs a change of thinking. The idea of linking facts and data at a large scale was envisioned long ago, but only recently became viable with the introduction of the semantic web and linked open data. These technologies make it possible to easily link remote knowledge, taking advantage of globally unique identifiers and of exact meaning defined by ontologies [1,2]. This presentation outlines how we applied these ideas to the life sciences in general, with applications to toxicology in particular. Using eNanoMapper [3], WikiPathways [4], and Wikidata [5], it will show how semantic web approaches can answer questions that are much harder to answer with older approaches. Examples will show (1) how we can use SPARQL to return all assay experiments for all types of metal oxides, (2) how biological pathway knowledge can be combined with knowledge from chemical databases, and (3) how we can find research about, and scholars who study, particular genes, proteins, or toxicants.
[1] Samwald, M. et al. Linked open drug data for pharmaceutical research and development. Journal of Cheminformatics 3, 19 (2011)
[2] Willighagen, E.L. et al. The ChEMBL database as linked open data. Journal of Cheminformatics 5, 23 (2013)
[3] Hastings, J. et al. eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. Journal of Biomedical Semantics 6 (2015)
[4] Waagmeester, A. et al. Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources. PLOS Computational Biology 12, e1004989 (2016)
[5] Waagmeester, A. et al. Wikidata as a knowledge graph for the life sciences. eLife 9, e52614 (2020)
Bio: I study the role of machine representation of knowledge and hypotheses in the life sciences, metabolomics, drug discovery, and toxicology, involving cheminformatics, chemometrics and semantic web technologies. In the past, I have also applied this research to QSAR and crystallography. Open source programming and open science are also my main hobbies, resulting in participation in, amongst many others, the Chemistry Development Kit, WikiPathways, Bioclipse, and BridgeDb.
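As a rough illustration of the third example in the abstract above (finding research about a topic and the scholars who study it), the sketch below builds a SPARQL query for the Wikidata Query Service and flattens a response in the standard SPARQL JSON results format. It is not the speaker's actual query. It assumes Wikidata's `P921` (main subject) and `P50` (author) properties, and the function names `build_topic_query` and `rows_from_sparql_json` are hypothetical. No network request is made; the caller supplies the item identifier for the gene, protein, or toxicant of interest.

```python
from typing import Dict, List

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_topic_query(topic_qid: str, limit: int = 10) -> str:
    """Build SPARQL for works whose main subject (P921) is the given item,
    together with their authors (P50)."""
    # Only accept identifiers shaped like Q123 so nothing else is interpolated.
    if not (topic_qid.startswith("Q") and topic_qid[1:].isdigit()):
        raise ValueError("expected a Wikidata item identifier such as 'Q123'")
    return f"""
SELECT ?work ?workLabel ?author ?authorLabel WHERE {{
  ?work wdt:P921 wd:{topic_qid} ;
        wdt:P50 ?author .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def rows_from_sparql_json(payload: Dict) -> List[Dict[str, str]]:
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]
```

The query string could be sent to `WIKIDATA_ENDPOINT` with any HTTP client, requesting `application/sparql-results+json`, and the decoded response passed to `rows_from_sparql_json`.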