Unlocking the potential of large prospective biobank cohorts for -omics data analysis: aspects of study design, prediction and causality

Published on Jul 18, 2016 (1357 views)

The recent decade has seen a tremendous increase in the availability of data from large population-based biobank cohorts. Such datasets include various types of -omics data (genomics, transcriptomics, metab

Chapter list

Unlocking the potential of large prospective biobank cohorts for -omics data analysis (00:00)
Machine learning vs Statistics (02:52)
The potential in large prospective biobank cohorts (03:38)
Biobank cohorts have brought a new era… (05:39)
Estonian Biobank (07:20)
EGCUT cohort vs Estonian population (08:34)
A prospective cohort of 50000+ participants ("Gene Donors") (09:14)
Follow-up studies: statistical aspects (10:04)
Example - 1 (11:21)
Example - 2 (12:22)
Example - 3 (12:50)
Genetic predictors for survival/mortality – why needed? (13:45)
Mortality studies in population-based biobank cohorts – sampling and timescales (15:34)
Standard survival analysis approach (16:35)
Age as time scale (17:28)
Genetic predictors affect from birth on, should we start the age scale at 0? - 1 (18:54)
Genetic predictors affect from birth on, should we start the age scale at 0? - 2 (19:03)
Biobank recruitment and follow-up (20:28)
Observed follow-up times on age scale (21:13)
Most common analysis method (21:16)
Partial likelihood for the Cox model (23:34)
Results of a simulation study (true HR=2) (24:47)
But... simulation when HR is small (HR=1.05) - 2 (28:56)
Genetic predictors for mortality – more challenges in biobank data (30:18)
What happens if you use parental data? (32:18)
Some simulations (33:53)
Genetic predictors for mortality – methodological approaches? (35:26)
A two-step Cox modeling approach (37:00)
Comparison of p-values (38:29)
How to handle power issues? (low no. of cases) (41:50)
Nested case-control design (42:33)
Example of the Estonian Biobank analysis (43:57)
Often cases are over-sampled, but this is not a nested case-control design (45:32)
Other aspects to consider (47:13)
Extreme cases and controls (47:57)
Some results… (53:54)
But... simulation when HR is small (HR=1.05) - 1 (55:55)
Part II: Genetic (polygenic) risk scores (01:00:45)
Why is genetic risk important? (01:01:28)
How to measure genetic risk? (01:02:54)
Type 2 Diabetes (01:03:22)
Comparison of cohort-specific and meta-analysis effect estimates (01:03:45)
Genetic (polygenic) risk scores (GRS) (01:04:26)
GRS: questions to address (01:04:57)
Problem with p-value based selections: "winner's curse" (01:05:37)
The "true GRS"… (01:06:59)
Doubly-weighted GRS (01:08:01)
GRS for Type 2 Diabetes: allele count vs weighted scores (01:09:09)
ROC curves (BMI=25..35) (01:10:45)
T2D prevalence in individuals aged 45-80 (01:11:05)
Genetic risk score (GRS) for CAD and cardiovascular mortality in men (01:11:40)
Extreme cases and controls (01:11:55)
Part 3: aspects of causality (01:12:10)
What is a causal effect? (01:16:37)
How to estimate causal effects? - 1 (01:17:59)
How to estimate causal effects? - 2 (01:19:33)
Causal graphs (DAGs) (01:20:21)
How to estimate causal effects? (01:20:52)
Can genetics help us? The idea of Mendelian Randomization (01:21:54)
Example from recent literature (01:22:52)
Mendelian randomization (MR) (01:23:53)
MR – how does it work? - 1 (01:24:22)
Mendelian randomization example (01:25:48)
A general association structure with one genotype and two phenotypes (01:26:09)
Can we test pleiotropy? - 1 (01:28:31)
Can we test pleiotropy? - 2 (01:28:49)
Conclusions (01:28:57)
Conclusions II (01:29:51)
Collaborators (01:30:24)
European Mathematical Genetics Meeting 2017 (01:30:51)
What is estimated in the presence of pleiotropy? (01:32:20)
MR – how does it work? - 2 (01:32:45)