The 5th International Workshop on Mining and Learning with Graphs
Pascal

Learning and Charting Chemical Space with Strings and Graphs: Challenges and Opportunities for AI and Machine Learning

author: Pierre Baldi, University of California

Description

Informatics methods and computers have not yet become as pervasive in chemistry as they have in physics and biology. Drawing analogies from bioinformatics, key ingredients for progress in chemoinformatics are the availability of large, annotated databases of compounds and reactions, data structures and algorithms to efficiently search these databases, and computational methods to predict the physical, chemical, and biological properties of new compounds and reactions. We will describe how graph-based methods play a key role in the development of: (1) a large public database of compounds and reactions (ChemDB) and the underlying algorithms and representations; (2) machine learning kernel methods to predict molecular properties; and (3) the applications of these methods to drug screening/design problems and the identification of new drug leads against a major disease.
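As a rough illustration of the database search step mentioned above, the sketch below ranks compounds by the similarity of their binary fingerprints. It assumes the Tanimoto (Jaccard) coefficient, a standard measure for binary fingerprints; this page does not say which measure the talk or ChemDB actually uses. The fingerprints, molecule names, and the `tanimoto` helper are hypothetical and chosen only for the example.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints,
    each given as a set of the indices of its 'on' bits."""
    if not a and not b:
        return 1.0  # assumed convention: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


# Hypothetical fingerprints (sets of on-bit indices), not real molecules.
fp_query = {1, 4, 7, 9}
database = {
    "mol_a": {1, 4, 7, 8},
    "mol_b": {2, 3},
    "mol_c": {1, 4, 7, 9},
}

# Rank database entries from most to least similar to the query.
ranked = sorted(
    database,
    key=lambda name: tanimoto(fp_query, database[name]),
    reverse=True,
)

for name in ranked:
    print(name, round(tanimoto(fp_query, database[name]), 3))
```

Representing a fingerprint as a set of on-bit indices keeps the sketch short; a real system would more likely use packed bit vectors and an index to avoid scanning every compound.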

Slides
0:00 Charting Chemical Space with Computers: Challenges and Opportunities for AI and Machine Learning--Discovering New Drug Leads
0:23 Mother in Law Theorem:
0:39 Mother in Law Theorem:
1:02 Bioinformatics/Chemoinformatics Theorem
4:06 Chemoinformatics
6:26 “A mathematician is a machine that converts coffee into theorems” P. Erdos
7:25 Cholesterol
7:35 Aspirin
8:04 “A computer scientist …..…”
8:09 “A mathematician is a machine that converts coffee into theorems” P. Erdos
8:18 “A computer scientist …..…”
8:27 Chemical Space
12:01 Chemo/Bio Informatics
14:01 Data (examples)
17:26 ChemDB
19:47 ChemDB
20:35 ChemDB
21:03 ChemDB
21:32 ChemDB
21:46 Similarity: Data Representations
24:42 Fingerprint Representations
25:08 Fingerprint Compression
25:58 Power-Law Distributions
26:34 Power-Law Distribution Models
26:36 Lossless Compression Algorithms
27:58 Lossless Compression Algorithms - part 2
28:34 Finding a Good Similarity/Kernel - part 1
29:05 1D SMILES Kernel
29:54 2D-Labeled Graph
30:21 Similarity for Binary Fingerprints
31:12 Similarity Measures
31:38 3D Coordinate Kernel
32:34 Datasets
32:35 Example of Results
32:37 Results
32:38 Results
33:38 Results
36:12 Regression: Aqueous Solubility 30 folds cross-validation Delaney Dataset: 1440 Examples
36:47 XLogP 40 folds cross-validation Dataset size: 1991
36:53 HIV Competition
37:28 Additional Representations
38:06 2.5D Surface Kernel
39:27 Molecular Representations and Kernels
39:39 The Conformer Problem
39:58 2.5D + Conformers = 3.5D
40:52 Additional Variations
42:05 Summary
43:42 Summary
46:15 Tuberculosis (TB): An old foe
46:36 TB: still a real threat, because…..
47:15 The Cell Wall: Key to Pathogen Survival
49:55 Structure of AccD5
50:19 Structure-Based Drug Design
50:20 1st Docking datasets by ICM
50:39 1st Docking datasets by ICM
51:05 Structure-Based Drug Design Identified AccD5 Inhibitors
52:08 Acknowledgements
