17/11/2021 – AI3SD Autumn Seminar VI – Medicinal Chemistry : AI 4 Scientific Discovery

This event was the sixth in the AI3SD Autumn Seminar Series, which ran from October 2021 to December 2021. The seminar was hosted online as a Zoom webinar on the theme of Medicinal Chemistry and consisted of two talks on the subject. Below are the videos of the talks and the speaker biographies. The full playlist of this seminar can be found here.

A vision of Medicinal Chemistry for the future – Dr Lewis Vidler

Lewis Vidler is a principal research scientist in computational medicinal chemistry at UCB, based in the UK. He completed an MChem Chemistry degree at the University of Oxford, specializing in synthetic organic chemistry, followed by a PhD in applied computational chemistry at the Institute of Cancer Research (ICR). After finishing his PhD he joined Eli Lilly and Company, initially as a contractor before progressing to Senior Research Scientist, and worked primarily on active drug discovery projects in that time. For the past 17 months he has been working at UCB, continuing to apply computational methods to drive drug discovery projects forward. Outside of direct project support he has a significant interest in automating aspects of drug discovery and empowering others with data analytics tools.

Q & A

Q1. What would you tell your younger self to go off and learn, and train up in, if you could do it again?

My background is an interesting and unusual one, and it involved taking a risk. I don’t think it’s necessarily what I would tell myself, but in some ways I’m proud of some of those decisions and proud of taking a risk to go and do something else. Rather than going down a very specialised, narrow focus on the thing you’re currently doing, go and do something completely different. Go and build a mixed skill set that enables you to operate at the interface between those two disciplines, taking the positives of each one and combining them. Then you see the things that those who only have one of those skill sets might not necessarily see. It’s partly by chance and partly by slight design that I’ve ended up with this kind of mixed computational medicinal chemistry skill set. I’ve certainly seen that the way I operate on projects is different to some of my more computational colleagues or medicinal chemistry colleagues, and I would like to think that there are some aspects of that in what the future could look like.

Q2. So, I recall when you launched your assistant at Eli Lilly, and I remember when we chatted about this before you were saying some people were actually replying to the chat box going “thank you very much”, which was great. I was just wondering, in the complete cycle of doing all the tasks (and I will come on to talk about this), was there a point where chemists went “no, stop now, that’s too much, that’s too much automation”? So, did you really encounter resistance?

I don’t think so. We never ran a huge number of those cycles, because the complexity of the molecules being made on a project often means that the computational design ideas and those that could be easily executed on the robot didn’t intersect. So often the mode was: the assistant creates those ideas, ranks them and scores them, and then they are sent to the team, and the team acts upon them if they want to. So, they went into the funnel of designs being considered, and there are examples where team members would go “oh yeah, we thought about doing that, but we’d parked it” for whatever reason, but seeing the predictions stacked up alongside it was enough to tip them over the edge to try. I think whenever you try to change things and do something new, you’ll always come up against some resistance, and that’s why I put ‘support from management’ on this slide, because you need that vision to say we are going to do things differently and we’re going to see if they’re better. You also need to be dissatisfied with the status quo, which I think most drug discovery organisations are in many ways; in anything you do, you should always be unsatisfied with how you’re doing today and think about how you could be better in the future. But to try something you always carry some risk, and so you’ll always come up against some resistance to change: people who’ll say, “what we do is fine”. So, if you want to drive improvement, which management often say is what they would like to do, then organisationally you think about how that all plays together nicely and move in that direction.

What a Medicinal Chemist Needs to Know about Explainable Artificial Intelligence – Dr Alexander (Al) G. Dossetter

In 2012 Dr Al Dossetter co-founded MedChemica Limited, centred around the technology of Matched Molecular Pair Analysis (MMPA) as a method of accelerating medicinal chemistry. MedChemica now licenses a suite of Artificial Intelligence databases and online tools for organisations to extract and share knowledge from their own data. The software and methodologies have been used by chemists in many pharmaceutical companies, universities and biotechs to accelerate drug discovery programmes. Extending the methodology enabled MedChemica Limited to share medicinal chemistry knowledge between the research branches of AstraZeneca, Hoffmann-La Roche and Genentech. In addition, MedChemica offers consultancy services on drug discovery projects, and Al has helped multiple projects achieve their goals. Prior to MedChemica, Al gained his PhD from Nottingham University and, after post-doctoral research at Harvard University, joined AstraZeneca (AZ). He spent 13 years in medicinal chemistry spread across oncology (hormonal and kinase inhibitors), inflammation (OA and RA, enzyme inhibitors and GPCR targets) and diabetes (obesity, GPCR and enzyme inhibitors), delivering multiple projects and candidate drugs.

Q & A

Q1. We see non-additivity in many cases, which limits the applicability of this approach. How can you explain that? Can you tell in which cases this approach will work and where we should avoid it?

For our particular technology, we supply the stats data going forward and you have the ability to drill back to the original data, particularly for the matched pairs. This enables you to see the compounds that are most similar to what you’re working on. So, we go further by asking “which of the applicable pairs are nearer to the chemical matter that I’m working on?” That helps us, kind of, address that problem. But overall, we present the data: “this is what’s happened globally”. A lot of what we focus on during our training is being able to go into that data and understand it, so users are able to make that judgment. To give you a sense of scale, when I first started using matched pairs we spent about three days pulling all the data together to then be able to make a decision about how applicable the pairs were. So now the first thing we’re doing is organising all that data to be able to make the better decision. But overall you do touch upon quite an important challenge in this world.
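As a rough illustration of the "global statistics versus nearest pairs" idea described above, here is a minimal Python sketch. It is not MedChemica's software or data model: the records, the similarity scores, the 0.6 threshold and the function names `summarise` and `nearest_pairs` are all made-up assumptions for illustration.

```python
from statistics import mean, median

# Hypothetical matched-pair records for ONE transformation (say H -> F).
# Each record is (change in measured property, similarity of the pair's
# compounds to the chemist's current series on a 0-1 scale).
# All numbers are invented for illustration only.
pairs = [
    (-0.40, 0.82),
    (-0.35, 0.75),
    (0.10, 0.30),
    (0.25, 0.22),
    (-0.30, 0.68),
    (0.05, 0.41),
]


def summarise(records):
    """Summary statistics for the property change across a set of pairs."""
    deltas = [delta for delta, _ in records]
    return {
        "n": len(deltas),
        "mean": round(mean(deltas), 3),
        "median": round(median(deltas), 3),
    }


def nearest_pairs(records, min_similarity):
    """Keep only the pairs close to the chemistry being worked on."""
    return [r for r in records if r[1] >= min_similarity]


# "This is what's happened globally" ...
global_stats = summarise(pairs)
# ... versus the pairs nearest to the current chemical matter
# (0.6 is an arbitrary, assumed cut-off).
local_stats = summarise(nearest_pairs(pairs, min_similarity=0.6))

print("global:", global_stats)
print("local: ", local_stats)
```

In this toy data set the global average hides a consistent effect among the most similar pairs, which is the kind of judgment the drill-down is meant to support.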

Q2. Do you work only with pharmaceuticals? What chemical descriptors do you use for the compounds?

Our client base is large pharma all the way down to individuals in universities. Our tools are available online and you only need a web browser to use them. They are optimized for Google Chrome, and they also work in Safari, Firefox and Edge. So, descriptors? As on the slide I had earlier, we actually encode the fragments of molecules with all the hydrogens in place as absolute structures for matched pair analysis. That is the only way it works; you need that level of precision. As a result, our databases are 500 gigabytes in size; basically there are 40 million chemical transformations encoded in there, and that’s the way to make sure the computers aren’t biased. For the machine learning methods, and I briefly touched on it, the descriptors we first came up with produced a brilliant system, amazing. We could take fragments with the linker to another fragment and all the hydrogens described. Fantastic, brilliant models, and very accurate predictions, but very easy to fall out of domain with a different chemical series. So, we changed the descriptors and made them simpler, to the likes of hydrogen bond donor, hydrogen bond acceptor, aliphatic group, aromatic group and atom linkers. And by doing two descriptors with the linker chain between them, we encode much more accurately, but we have softened the accuracy of the model so that when a new molecule comes in that it has never seen before, it can provide some sort of prediction and, more importantly, the basis of how that prediction comes about. So we have explainable models. Just to build on this further, sometimes people go “do you not include molecular weight, and do you not include lipophilicity and descriptors like these?” No, we don’t, and that’s very deliberate, because what we found is that you’re double counting the chemical groups. This leads to predictions with higher errors. Because we have been medicinal chemists in large pharma and are still working on live projects, and have been through years of QSAR, we understand the domain and so are able to organize and think about descriptors very carefully before building models.
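To make the idea of "two descriptors with the linker chain between them" more concrete, here is a small Python sketch of a generic pharmacophore-pair descriptor: every pair of typed atoms is recorded together with the number of bonds separating them. This is only an assumed, simplified stand-in in the spirit of published atom-pair descriptors, not MedChemica's actual encoding. The toy molecule, the hand-assigned one-type-per-atom labels and the function names `bond_distances` and `pair_descriptor` are all hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


def bond_distances(n_atoms, bonds, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    neighbours = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def pair_descriptor(atom_types, bonds):
    """Count (type_a, type_b, linker_length_in_bonds) triples for a molecule."""
    typed = [i for i, t in enumerate(atom_types) if t is not None]
    counts = Counter()
    for i, j in combinations(typed, 2):
        d = bond_distances(len(atom_types), bonds, i).get(j)
        if d is None:  # atoms in disconnected fragments
            continue
        a, b = sorted((atom_types[i], atom_types[j]))
        counts[(a, b, d)] += 1
    return counts


# Toy example: heavy atoms of ethanolamine, O-C-C-N.
# Types are assigned by hand and deliberately simplified (one label per atom).
atom_types = ["acceptor", "aliphatic", "aliphatic", "donor"]
bonds = [(0, 1), (1, 2), (2, 3)]

descriptor = pair_descriptor(atom_types, bonds)
for key, count in sorted(descriptor.items()):
    print(key, count)
```

Because each feature is a readable triple such as ("acceptor", "donor", 3), a model built on such counts can report which feature pairs drove a prediction, and a molecule from a new series can still share features with the training set.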

Feel free to drop us an email anytime and we can have a further chat on Zoom, or find me on LinkedIn!