Machine Learning Summer School 2005 - Canberra
Pascal

Bioinformatics Challenge: Learning in Very High Dimensions with Very Few Samples

author: Adam Kowalczyk, National ICT Australia

Description

Dedicated machine learning procedures have already become an integral part of modern genomics and proteomics. However, these tasks combine very high dimensionality with very few training samples, which often stretches the procedures well beyond the natural boundaries of their applicability. A few such challenges are the subject of this series of lectures. We start with a brief overview of the classification of genomics (microarray) data and discuss, in some detail, examples of applications to cancer genomics and proteomics. We then concentrate on the phenomenon of anti-learning: a case of supervised classification in which standard supervised learning techniques systematically produce classifiers that are perfect on the learning sample but whose error rates on independent test data are higher than that of the default (random) classification rule. Examples of natural and synthetic anti-learning data will be given and analysed from the standpoint of their implications for practical supervised and unsupervised classification. A series of practical tutorials will be organized in parallel, in which participants will classify microarray data and gain first-hand experience with anti-learning.
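
To make the anti-learning definition concrete, here is a minimal sketch in plain Python. It is not taken from the lecture: the data construction (`make_data`, with the scaling constant `c`) and the use of a nearest-centroid classifier in place of an SVM are assumptions chosen only to keep the example small. The points are built so that two samples of the same class are less similar to each other than two samples of opposite classes, which is enough to make a classifier fit its training set perfectly while misclassifying every held-out sample.

```python
# Synthetic anti-learning sketch (illustrative construction, not from the lecture).

def make_data(m, c=None):
    """Return 2*m points in R^(2*m), m labelled +1 and m labelled -1.

    Hypothetical construction: x_i[j] = [i == j] - c * y_i * y_j.
    For 0 < c < 1/n the inner product of two distinct same-class points
    is negative and that of two opposite-class points is positive.
    """
    n = 2 * m
    if c is None:
        c = 1.0 / (2 * n)  # assumed default, keeps a clear training margin
    labels = [1] * m + [-1] * m
    points = []
    for i in range(n):
        x = [-c * labels[i] * labels[j] for j in range(n)]
        x[i] += 1.0
        points.append(x)
    return points, labels

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit(points, labels):
    """Nearest-centroid model: one centroid per class."""
    pos = centroid([x for x, y in zip(points, labels) if y == 1])
    neg = centroid([x for x, y in zip(points, labels) if y == -1])
    return pos, neg

def predict(model, x):
    pos, neg = model
    return 1 if sq_dist(x, pos) < sq_dist(x, neg) else -1

def accuracy(model, points, labels):
    hits = sum(predict(model, x) == y for x, y in zip(points, labels))
    return hits / len(labels)

def loo_accuracy(points, labels):
    """Leave-one-out accuracy: refit without sample i, then test on it."""
    hits = 0
    for i in range(len(points)):
        model = fit(points[:i] + points[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict(model, points[i]) == labels[i]
    return hits / len(points)

points, labels = make_data(4)  # 8 samples in 8 dimensions
print("training accuracy:", accuracy(fit(points, labels), points, labels))
print("leave-one-out accuracy:", loo_accuracy(points, labels))
```

On this toy data the classifier is perfect on the sample it was trained on, yet leave-one-out testing gets every held-out point wrong, which is worse than the 50% expected from random guessing on balanced classes. Real anti-learning data sets are less clean, so this should be read as an illustration of the definition rather than as a model of any particular genomics data set.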

Slides
0:00 Learning in Very High Dimensions with Very Few Samples
0:51 Part 1: Cancer genomics
0:58 Microarray background
6:40 Microarray analysis of gene expression
10:13 Principles of microarray
10:58 Reproducibility of array experiments
12:06 Data normalization
13:18 PCR amplification
14:17 PCR amplification
14:43 Low density Q-PCR Array (ABI)
15:29 Peter MacCallum Cancer Centre
16:12 Carcinomas of Unknown Primary
16:58 Cancer in brief
18:04 Data collection
19:15 Tumour samples of known origin
19:43 Developing a training set of expression profiles
21:08 Predictive modeling for gene expression data
23:42 Classifier
24:57 Supervised Learning: Support Vector Machine (SVM) – Binary Classifier
25:31 Decision margin
26:03 Summary of cross-validation test
28:12 LOO CV Confusion Table for combined classifier
29:32 Test on metastases
31:26 Cross platform translation: from microarrays to low-density Q-PCR
32:25 Q-PCR site of origin diagnostic - Pilot Study
32:57 Comparison of cDNA microarray and Q-PCR data
34:55 Data sets
35:02 Data transformations for cross platform model transfer
36:43 Comparison of 5-class and 6-class models
36:49 Summary of SVM tests
36:54 Comparison of 5- and 6-class models
38:51 References
39:06 Acknowledgements

Videos
Part 1 0:39:43
Part 2 0:40:39
Part 3 0:59:42
Part 4 1:02:32

Reviews and comments:

Comment1 Rojan, July 30, 2007 at 9:48 a.m.:

This is a very good tutorial.
