Architectures for distributed mining of big data

Published on Jan 31, 2017 · 1264 Views

Chapter list

00:00 Architectures for Distributed Mining of Big Data
01:06 Big Data - 1
01:28 Big Data - 2
02:02 Big Data 6V’s
02:22 Controversy of Big Data
04:23 Batch and Streaming Engines
04:44 Motivation MapReduce
05:26 How Many Servers Does Google Have?
05:39 Typical Big Data Challenges
06:20 Google 2004
07:07 Jeff Dean
07:59 Jeff Dean Facts - 1
08:17 Jeff Dean Facts - 2
08:37 MapReduce
08:42 References
09:03 Numbers Everyone Should Know (Jeff Dean)
10:27 Typical Big Data Problem
11:10 Functional Programming
11:35 Map and Reduce functions
12:49 Simplified view of MapReduce
13:49 An Example Application: Word Count
14:08 WordCount Example
15:03 Simple MapReduce Variations
16:14 MapReduce Framework - 1
17:20 MapReduce Framework - 2
18:08 Fault Tolerance
19:05 Complete MapReduce Framework
19:49 Partitioners and Combiners
20:16 MapReduce Algorithms
20:19 Simple MapReduce Algorithms - 1
21:03 Simple MapReduce Algorithms - 2
21:56 WordCount Example Revisited - 1
22:28 WordCount Example Revisited - 2
23:00 WordCount Example Revisited - 3
23:31 Average Computing Example - 1
24:05 Average Computing Example - 2
25:11 Average Computing Example - 3
25:51 Monoidify!
26:17 Average Computing Example - 4
26:34 MapReduce Big Data Processing
27:29 Apache Flink Motivation - 1
27:33 Apache Flink Motivation - 2
28:00 Real time computation: streaming computation
28:49 Easy to Write Code - 1
29:35 Easy to Write Code - 2
30:07 What is Apache Flink?
31:35 Batch and Streaming Engines
32:04 Batch Comparison
32:32 Streaming Comparison
33:17 Spark Motivation
33:20 Apache Spark - 1
33:52 What is Apache Spark
34:22 Spark Ecosystem
35:03 Spark API
35:27 Apache Spark - 2
35:32 Apache Spark Project
36:14 Resilient Distributed Datasets (RDDs)
36:33 Spark API: Parallel Collections
36:46 Spark API: External Datasets
36:57 Spark API: RDD Operations
37:15 Apache Spark Streaming
37:41 Discretized Streams (DStreams) - 1
37:55 Discretized Streams (DStreams) - 2
38:17 Spark Streaming
38:40 Spark SQL and DataFrames
39:34 Spark Machine Learning Libraries - 1
40:30 Spark Machine Learning Libraries - 2
40:48 Spark GraphX - 1
41:09 Spark GraphX - 2
42:37 Apache Spark Summary
44:15 Apache Kafka
44:19 Apache Kafka from LinkedIn - 1
45:28 Apache Kafka from LinkedIn - 2
45:48 Apache Kafka from LinkedIn - 3
46:18 Apache Storm - 1
46:19 Apache S4 from Yahoo
47:28 Apache Storm - 2
47:35 Storm
47:58 Google Cloud DataFlow
48:04 Google 2004
48:24 Google June 2014
49:01 Google Cloud Data Flow - 1
49:22 Google Cloud Data Flow - 2
50:12 Google Cloud Data Flow Paper
50:29 Google Cloud Data Flow - 3
50:51 Apache Beam - 1
51:14 Apache Beam - 2
51:44 Architectures
51:47 Lambda Architecture
52:08 Kappa Architecture
52:27 Samoa
53:06 Thanks