Bioinformatics Challenge: Learning in Very High Dimensions with Very Few Samples
Description
Dedicated machine learning procedures have already become an integral part of modern genomics and proteomics. However, these tasks combine very high dimensionality with very small learning samples, and they often stretch the procedures well beyond the natural boundaries of their applicability. A few such challenges are the subject of this series of lectures. We start with a brief overview of the classification of genomics (microarray) data and discuss, in some detail, examples of applications to cancer genomics and proteomics. We then concentrate on the phenomenon of anti-learning: a case of supervised classification in which standard supervised learning techniques systematically produce classifiers that are perfect on the learning sample but whose error rates on an independent test set are higher than that of the default (random) classification rule. Examples of natural and synthetic anti-learning data will be given and analysed from the standpoint of their implications for practical supervised and unsupervised classification. A series of practical tutorials will be organized in parallel, in which participants will classify microarray data and gain first-hand experience with anti-learning.
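To make the anti-learning phenomenon described above concrete, here is a minimal synthetic sketch in Python. It is an illustration under stated assumptions, not the data or the classifier used in the lecture: the data set is built from a Sylvester Hadamard matrix (one column serves as the label, the rest as features), and a simple sum-of-signed-examples linear rule stands in for an SVM. The function names `sylvester_hadamard` and `anti_learning_demo` are hypothetical. Because distinct Hadamard rows are orthogonal, two samples of the same class have feature inner product -1 and samples of opposite classes have +1, so a linear rule that fits the training sample perfectly misclassifies every held-out sample.

```python
import numpy as np


def sylvester_hadamard(k):
    """Return the 2**k x 2**k Sylvester Hadamard matrix (entries +1/-1)."""
    H = np.array([[1]])
    H2 = np.array([[1, 1], [1, -1]])
    for _ in range(k):
        H = np.kron(H2, H)
    return H


def anti_learning_demo(k=4, n_train=8, seed=0):
    """Train a simple linear rule on part of a Hadamard data set.

    Assumption: this synthetic construction is only an illustration of
    anti-learning; it is not taken from the lecture itself.
    Returns (training accuracy, test accuracy).
    """
    H = sylvester_hadamard(k)
    n = H.shape[0]
    y = H[:, -1]   # labels: last column, balanced +1/-1
    X = H[:, :-1]  # features: all remaining columns

    # Random split into a learning sample and an independent test sample.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    train, test = perm[:n_train], perm[n_train:]

    # Linear rule w = sum_i y_i x_i over the learning sample
    # (a stand-in for an SVM, chosen for simplicity).
    w = y[train] @ X[train]

    train_acc = np.mean(np.sign(X[train] @ w) == y[train])
    test_acc = np.mean(np.sign(X[test] @ w) == y[test])
    return train_acc, test_acc


if __name__ == "__main__":
    train_acc, test_acc = anti_learning_demo()
    print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

With 16 samples and 8 of them used for training, each training sample scores y·(16 − 8) and each test sample scores −8·y, so the rule is perfect on the learning sample and wrong on every test sample, which is worse than the 50% expected from random guessing on balanced classes.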
| Slides | |
| 0:00 | Learning in Very High Dimensions with Very Few Samples |
| 0:51 | Part 1: Cancer genomics |
| 0:58 | Microarray background |
| 6:40 | Microarray analysis of gene expression |
| 10:13 | Principles of microarray |
| 10:58 | Reproducibility of array experiments |
| 12:06 | Data normalization |
| 13:18 | PCR amplification |
| 14:17 | PCR amplification |
| 14:43 | Low density Q-PCR Array (ABI) |
| 15:29 | Peter MacCallum Cancer Centre |
| 16:12 | Carcinomas of Unknown Primary |
| 16:58 | Cancer in brief |
| 18:04 | Data collection |
| 19:15 | Tumour samples of known origin |
| 19:43 | Developing a training set of expression profiles |
| 21:08 | Predictive modeling for gene expression data |
| 23:42 | Classifier |
| 24:57 | Supervised Learning Support Vector Machine (SVM) – Binary Classifier |
| 25:31 | Decision margin |
| 26:03 | Summary of cross-validation test |
| 28:12 | LOO CV Confusion Table for combined classifier |
| 29:32 | Test on metastases |
| 31:26 | Cross platform translation: from microarrays to low-density Q-PCR |
| 32:25 | Q-PCR site of origin diagnostic - Pilot Study |
| 32:57 | Comparison of cDNA microarray and Q-PCR data |
| 34:55 | Data sets |
| 35:02 | Data transformations for cross platform model transfer |
| 36:43 | Comparison of 5-class and 6-class models |
| 36:49 | Summary of SVM tests |
| 36:54 | Comparison of 5- and 6-class models |
| 38:51 | References |
| 39:06 | Acknowledgements |
This is a very good tutorial.