Predicting anti-cancer molecule activity using machine learning algorithms
published: April 17, 2008, recorded: March 2008, views: 419274
In this paper we study the anti-cancer activity of approximately 4,000 unique compounds against a set of 60 cell lines (e.g. leukemia, prostate, breast). Small molecules play an important role in biology: they can serve as building blocks for more complex molecules, and they interact with proteins, inhibiting or promoting their action. The consequences of adding such a compound to a cell can be far-reaching, as the protein may be involved in a very complex chain of reactions. It is therefore possible to design small molecules that are useful drugs. Here we concentrate only on predicting one property of a given molecule: whether it will show anti-cancer activity (measured as causing at least 50% cell growth inhibition) against a given cancerous cell line. This computational prediction is important because the number of small molecules in databases worldwide is growing while the capacity for proper lab testing is limited. For instance, the In Vitro Cell Line Screening Project at the National Cancer Institute (NCI) can currently evaluate only up to 3000 compounds per year for potential anti-cancer activity.

From a machine learning perspective, biological problems are a good application because datasets are abundant, the data is real, the type of algorithm most suitable for a particular problem may vary substantially, and it is not unusual for a problem to highlight research needs in machine learning. Finally, helping to solve biological problems may have a big impact on the wider scientific community.

The molecule dataset we used is publicly available at the NCI site. We applied a range of data mining classification algorithms to this problem: Decision Trees, Inductive Logic Programming (ILP) and Support Vector Machines (SVMs). As molecular features for learning we used molecular weight, the octanol-water partition coefficient (logP) and fragment counts. A fragment is a set of connected atoms in which each atom is identified simply by its type (e.g. carbon). If we look at the molecule as a graph, the fragment list consists of all connected components with diameter two.

The experiments demonstrate that our results using support vector machines (with an RBF kernel) are identical to previously published state-of-the-art work, yielding an average predictive accuracy of 73% (against a baseline of 54%). We noticed, however, to our surprise, that if we use only atom counts instead of fragment counts, the results are nearly identical (about 1% lower accuracy, although the difference is statistically significant).

An important point is that, although numerical black-box algorithms like SVMs tend to be slightly more accurate than logic models (Decision Trees and ILP have an accuracy 3% to 4% below SVMs on this dataset), the relevance of this predictive accuracy for important practical applications like drug design is arguable. In a drug design setting, what is useful is a set of rules that describe what a "good" compound should look like. That goal is much more easily achieved with a human-readable logic model like the ones we also describe in the paper.
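To make the feature description in the abstract concrete, here is a minimal Python sketch of the two feature sets it compares (atom counts and fragment counts), together with the 50% growth-inhibition label and a majority-class baseline. It is an illustration under stated assumptions, not the authors' code: the toy molecule, the function names, and the reading of a "fragment" as an atom-type path spanning at most two bonds are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical toy molecule: the heavy atoms of ethanol (hydrogens omitted).
# `atoms` maps an atom index to its element type; `bonds` lists bonded pairs.
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1), (1, 2)]


def atom_counts(atoms):
    """Count atoms by type (the simpler feature set from the abstract)."""
    return Counter(atoms.values())


def fragment_counts(atoms, bonds):
    """Count atom-type paths made of one, two, or three connected atoms.

    Assumption: a fragment is read here as a path spanning at most two
    bonds, labelled only by atom types. The paper may define it differently.
    """
    neighbours = {i: set() for i in atoms}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    counts = Counter(atoms.values())  # single-atom fragments
    for a, b in bonds:  # two-atom fragments, direction-independent
        counts["-".join(sorted([atoms[a], atoms[b]]))] += 1
    for mid, nbrs in neighbours.items():  # three-atom paths centred on `mid`
        ordered = sorted(nbrs)
        for x in range(len(ordered)):
            for y in range(x + 1, len(ordered)):
                ends = sorted([atoms[ordered[x]], atoms[ordered[y]]])
                counts[f"{ends[0]}-{atoms[mid]}-{ends[1]}"] += 1
    return counts


def is_active(growth_inhibition_percent):
    """Label from the abstract: active if inhibition is at least 50%."""
    return growth_inhibition_percent >= 50.0


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most common label."""
    return max(Counter(labels).values()) / len(labels)


if __name__ == "__main__":
    print(atom_counts(atoms))
    print(fragment_counts(atoms, bonds))
    print(majority_baseline([True] * 54 + [False] * 46))
```

Either count dictionary could then be turned into a fixed-length vector and passed to any classifier. The baseline function shows one common way a figure like the 54% quoted in the abstract can arise, although the abstract does not say how its baseline was computed.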
Download slides: licsb08_santos_pam_01.pdf (97.7 KB)
Reviews and comments:
I am a student.
I need your full paper on predicting anti-cancer molecule activity using machine learning algorithms.