03/11/2021 – AI3SD Autumn Seminar IV: AI & ML 4 Drugs & Materials : AI 4 Scientific Discovery

This event was the fourth in the AI3SD Autumn Seminar Series, which ran from October 2021 to December 2021. The seminar was hosted online as a Zoom webinar on the theme of AI & ML 4 Drugs & Materials, and consisted of three talks on the subject. Below are the videos of the talks and speaker biographies. The full playlist of this seminar can be found here.

Combining robotics and Machine Learning for accelerated drug discovery – Dr Tom Fleming

Tom Fleming MChem is the COO of biotech platform company Arctoris, which he co-founded in Oxford in 2016. Tom’s background is in cancer research, having worked in academia as well as at leading CROs and pharmaceutical corporations. A chemical biologist by training, he has unique insights into preclinical drug discovery, including the critical steps from target identification and high-throughput screening up to lead optimization. Tom was a Fellow of the Royal Commission of 1851 at the University of Oxford, and is an SME Leader of the Royal Academy of Engineering.

Machine Learning and AI for Drug Design – Professor Ola Engkvist

Dr Ola Engkvist is head of Molecular AI in Discovery Sciences, AstraZeneca R&D. He did his PhD in computational chemistry at Lund University, followed by a postdoc at Cambridge University. After working for two biotech companies he joined AstraZeneca in 2004. He currently leads the Molecular AI department, where the focus is to develop novel methods for ML/AI in drug design, productionalize the methods and apply them to AstraZeneca’s small-molecule drug discovery portfolio. His main research interests are deep learning based molecular de novo design, synthetic route prediction and large-scale molecular property prediction. He has published over 100 peer-reviewed scientific publications. He is adjunct professor in machine learning and AI for drug design at Chalmers University of Technology and a trustee of the Cambridge Crystallographic Data Centre.

Q & A

Q1: Any ideas on how techniques used in small drug molecules can be scaled to the protein level?

I think there’s quite a lot going on in antibody design. I think it’s a much more difficult problem: you have less data, and the algorithm probably needs to be very sophisticated. To move into peptide space, you can start to approach it in the same way. There’s a lot of research going on, and I think it’s fair to say they are a few years behind small molecules, but they are combining machine learning and AI with physics-based modelling, and it will ultimately have an impact.

Q2: I thought transformers need a huge amount of training data. What do you use for training data when you are using a transformer for molecular optimisation?

In the article (https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00497-0), we generated matched molecular pairs for the whole ChEMBL database and used them as the training set. It’s a fair point that a transformer needs a lot of data, but you can also apply pre-training tricks to train it on unlabelled data, and then you only need a small set of labelled data. In our case it worked very well when we generated matched molecular pairs from the whole ChEMBL database, and we published the results earlier this year.

Q3: Great talk! Has this generative process described for creating structures from SMILES been tested for metal catalyst design? Would you expect this to perform with similar success?

We have not tested it, but I don’t see any reason why it shouldn’t work. If you have a good dataset to start with and you know what you would like to achieve, you can score the proposed catalysts, and it might very well work. I have not done it myself, but I don’t see any fundamental reason why you cannot train on a completely different chemical space, whether that is drug-like chemical space, catalysts or inorganic compounds.

Q4: Great talk. A question about the MELLODDY project. Did I get it right that you used just a standard feedforward neural network, only very wide?

It’s a feed forward type of network, yeah.

What was the motivation for that instead of, for example, graph convolution or something, you know, a little bit more modern?

It’s multi-task learning, so I would defend it: it’s quite modern. We have done a lot of tests of the different algorithms, and I hope some will be published.
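As an illustration of what multi-task learning on sparse assay data typically involves, here is a minimal sketch of a masked loss, where each compound only contributes to the tasks it was actually measured in. The function name `masked_bce` and the data layout (one row per compound, one column per task, `None` for a missing measurement) are assumptions made for this example; the talk does not describe the loss used in MELLODDY.

```python
import math

def masked_bce(predictions, labels):
    """Mean binary cross-entropy over measured (compound, task) pairs only.

    predictions: rows of predicted probabilities, one column per task.
    labels: rows of 0/1 labels, with None where a task was not measured.
    (Hypothetical layout, chosen for illustration.)
    """
    total, count = 0.0, 0
    for pred_row, label_row in zip(predictions, labels):
        for p, y in zip(pred_row, label_row):
            if y is None:
                continue  # unmeasured task: contributes nothing to the loss
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    if count == 0:
        raise ValueError("no measured labels")
    return total / count

# Two compounds, three tasks; most labels are missing.
loss = masked_bce(
    [[0.5, 0.9, 0.1], [0.2, 0.5, 0.7]],
    [[1, None, None], [None, 0, None]],
)
```

Because missing entries are skipped rather than treated as inactive, a single wide network can share its hidden layers across many assays even when each compound has been tested in only a few of them.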

Maybe one last question, maybe a little bit controversial. Don’t you think that the pay-off, you know, the delta AUC that you get out of the massive effort, is rather minor? Yes, it’s statistically significant, but it’s small.

I agree. For year two we just showed the principle that it works, that there is a statistical difference. Year three is about optimizing the difference between the multi-pharma and single-pharma models, and I think there will need to be meaningful differences between them. So, say, if you use it internally, we’d use the models together with REINVENT. I would like to see different molecules designed with a multi-pharma model versus a single-pharma model to call it a success. So, I think it’s fair to say we reached the year two milestone, but for year three to be a success we will need to see a significantly larger difference.
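For readers unfamiliar with the "delta AUC" discussed in this exchange, the sketch below computes ROC AUC for two models on the same test compounds and takes the difference. The labels and scores are invented toy numbers, and the names `single_pharma_scores` and `multi_pharma_scores` are illustrative only; they are not MELLODDY results.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = active, 0 = inactive.
labels = [1, 1, 1, 0, 0, 0]
single_pharma_scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
multi_pharma_scores = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]

delta_auc = roc_auc(labels, multi_pharma_scores) - roc_auc(labels, single_pharma_scores)
```

A delta AUC can be statistically significant across many tasks and still be small in absolute terms, which is the distinction the questioner and speaker are drawing.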

Q5: I know that your company is investing significantly in supporting universities in training students for the new world. What would be, in your view, the ideal kind of training for the new version of a chemist that’s coming forward?

I think that they need to be more fluent in automation and data analysis. I don’t think everybody needs to be a fluent Python programmer, but I think we really need to emphasise automation and statistical analysis more. There are also other aspects: it’s important to understand that we now work with new drug modalities in the pharma industry, including a lot of different variants of nucleotide therapy. So, I think it’s also important to get a broad education and broad courses, so that students understand that it’s not only machine learning, AI and automation that are changing pharma at the moment; there is also a much broader view of drug modalities now than we had maybe 10 years ago.

I think it’s very interesting, because we see this push towards the more computer-friendly side, and I think the students appreciate that and see what it leads to. But that’s not at the expense of learning the biochemistry side, which maybe not all chemistry courses can cover. I think your point about automation is very interesting because yes, we’re upscaling. Many university labs and teaching laboratories are beginning to use much more modern equipment, but actually it’s about teaching the students how to automate things, which is a slightly different aspect as well. But of course the degree is limited in time, so we’ll have to try and work out how to give them the principles.

There are a lot of things happening at the same time, and to cover them well while also keeping the basic understanding is definitely a challenge, I agree.

But it is one we need to face. Clearly people work in teams; they don’t have to be experts at everything, but they have to know enough to be able to work in the team. I think that’s what we should be aiming for.

Absolutely.

Q6: Reproducibility is sometimes a problem with existing data, so training a neural network on such defective data may lead to bad predictions. How do you think this can be addressed?

I think uncertainty quantification is very important. We need to apply it much more consistently, so it’s not only about the accuracy, it’s also about quantifying the answer alternatives. Also, of course, interpretability can help to check that the model makes its prediction for the right reason. But, of course, in the end it is data from experiments, with their experimental settings, and we have to explore methods that take the experimental uncertainty into account, like a version of probabilistic random forest. We actually tried to model the probability distribution from the experiment, which you can do with that method. I think in the end we are still in a lucky situation. If we make a wrong prediction, we might synthesize the wrong molecule even though we wanted to synthesize the right one, but if that happens the damage is limited. It is very different if you work in a clinical setting, where you really need to be 100% sure that you make the right decision. We can make a wrong prediction once in a while for the next molecule to make, but we should of course make many more right predictions than wrong ones.
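One way to take experimental uncertainty into account, in the spirit of the answer above, is to replace a hard active/inactive label with the probability that the true value lies above the activity threshold. The sketch below assumes Gaussian experimental error with a known standard deviation; the function name `prob_active` and the example numbers are illustrative, and the talk does not specify how the speaker’s probabilistic random forest was configured.

```python
import math

def prob_active(measured, threshold, sigma):
    """P(true value >= threshold), assuming Gaussian experimental error.

    measured: measured activity (for example a pIC50 value).
    threshold: activity cut-off on the same scale.
    sigma: assumed standard deviation of the experimental error (must be > 0).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (measured - threshold) / sigma
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A measurement exactly at the threshold is a coin flip;
# one far above it is almost certainly active.
borderline = prob_active(6.0, threshold=6.0, sigma=0.3)
clear_active = prob_active(7.0, threshold=6.0, sigma=0.3)
```

Soft labels of this kind let a model treat borderline measurements as genuinely uncertain instead of forcing them into one class.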

Accelerating design of organic materials with machine learning and AI – Professor Olexandr Isayev

Olexandr Isayev is an Assistant Professor at the Department of Chemistry at Carnegie Mellon University. In 2008, Olexandr received his Ph.D. in computational chemistry. He was a Postdoctoral Research Fellow at Case Western Reserve University and a scientist at a government research lab. From 2016 to 2019 he was a faculty member at the UNC Eshelman School of Pharmacy, the University of North Carolina at Chapel Hill. Olexandr received the “Emerging Technology Award” from the American Chemical Society (ACS) and the GPU computing award from NVIDIA. The research in his lab focuses on connecting artificial intelligence (AI) with chemical sciences.

Q & A

Q1: How do you treat long-range interactions within the descriptor? Or are those neglected entirely?

So there are a few approximations. In the original ANI neural network, long-range interactions are truncated at six Angstroms. In the new AIMNet architecture, it’s all data driven, i.e. there is no physical descriptor and we do not have an explicit equation. It’s all learned implicitly by the neural network, by design, because essentially one little atomic environment passes a message to another, and therefore you can feel the presence of another environment at a certain distance. So it’s implicit: we don’t have a specific operation, and the neural network essentially learns that by itself, end to end.
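To make the idea of a truncated local environment concrete, here is a minimal sketch of the smooth cosine cutoff commonly used in atom-centred descriptors, together with a simple upper bound on how far information can travel when environments exchange messages. The default cutoff of 6.0 Å follows the figure quoted in the answer; the function names and the number of message passes are assumptions for illustration and are not taken from the ANI or AIMNet code.

```python
import math

def cosine_cutoff(r, r_cut=6.0):
    """Smooth cutoff: 1 at r = 0, decaying to 0 at r = r_cut, and 0 beyond.

    r and r_cut are distances in Angstroms.
    """
    if r < 0:
        raise ValueError("distance must be non-negative")
    if r >= r_cut:
        return 0.0  # atoms beyond the cutoff do not contribute
    return 0.5 * math.cos(math.pi * r / r_cut) + 0.5

def effective_reach(r_cut, n_message_passes):
    """Upper bound on the distance over which one atom can influence another.

    Each message pass lets information hop across one more cutoff sphere,
    so the bound grows linearly with the number of passes (hypothetical count).
    """
    return r_cut * n_message_passes

weights = [cosine_cutoff(r) for r in (0.0, 3.0, 6.0, 8.0)]
reach = effective_reach(6.0, 3)
```

With a single local pass nothing beyond the cutoff is visible, whereas repeated message passing lets an atom sense environments several cutoff radii away without any explicit long-range term.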

Q2: Are the torsion energies exploring an “extrapolation” behaviour of the network? As I understand, the sampling is done close to eq. geometries.

All the examples I showed you here are extrapolation. We trained on a set of small organic molecules with up to 15 heavy atoms. These examples were specifically selected to be molecules not present in the training data, and therefore it’s all extrapolation. You can think about it this way: to build this map you can use DFT or you can use the neural network, and you obtain a very similar answer.

Q3: For the 19F MRI agent project, did you predict water solubility with free-energy perturbation or something else?

In the first project the solubility was predicted with a straightforward ML model. It was a binary classification, and the reason we treated it as binary was that it was difficult to measure solubility under pandemic conditions. What we did was essentially a qualitative experiment: if a specified quantity of polymer was diluted in the solvent and the solution was clear, we called it soluble. If it was cloudy or the polymer precipitated, we called it insoluble. Hence we predicted solubility qualitatively; there was no simulation. The model is trained on just a simple experiment.

Q4: How well do the models work on simulating enzyme catalysis? Have you explored that avenue?

Out of the box, it probably will not work for catalysis, because the neural network was trained only on ligands, i.e. small molecules, and at this point there’s no coupling to the protein. But what we are working on right now is extending our small-molecule force field to include proteins, and then you can do the simulation. The force field is reactive in principle: it can describe a reaction if it has been trained to see chemical reactions. But at this point it totally fails, because it has never seen catalysis or a reaction and how it happens.

Q5: Can it also be applied to bond breaking?

Yes, given the proper training. We’re working on a fully reactive force field. But out of the box the current model will not work, because the training data does not contain any reactive data. This is more of a data problem than a methodological problem. Given the proper data, the neural network can describe chemical reactivity.

Q6: Can this approach be applied to proteins? Remodel flexible chains or even fold proteins?

Yes, again, it is a work in progress, and we plan to release the counterpart of little network work projects. Stay tuned!