Analyzing and Linking Big Data with Stratosphere
published: July 16, 2012, recorded: June 2012, views: 4634
Linking and Analyzing Big Data
Summary of the presentation: In this talk, I will provide an overview of two projects at TU Berlin and of the research and innovation challenges at their intersection. Stratosphere (www.stratosphere.eu, funded by the German Research Foundation) is an open platform for Big Data analytics. It features a cloud-enabled execution engine with flexible fault-tolerance schemes, a novel programming model that extends MapReduce and is centered around second-order functions, and a cost-based query optimizer. Stratosphere is validated by several use-case scenarios, including climate data analysis, text mining in bioinformatics, and data cleansing on Linked Open Data. DOPA (an FP7 STREP project) focuses on linking large data pools of both structured and unstructured data using data supply chains. The goal is to multiply the utility of each individual service while sharing the costs between them. In this way, DOPA lowers the barrier to entry for SMEs that need to perform advanced analytics across multiple data pools, since the SME does not have to provide the required input data or the processing environment itself.
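To make the phrase "second-order functions" concrete, here is a minimal Python sketch of the idea: a second-order function takes a user-defined first-order function and a data set, and decides how records are grouped before the user function is called. This is not Stratosphere's API. The names map_so, reduce_so, and match_so are hypothetical, and the join-like match_so is included only as one assumed example of an operator that goes beyond plain MapReduce.

```python
from collections import defaultdict


def map_so(udf, records):
    """Call the user function once per record, independently."""
    for record in records:
        yield from udf(record)


def reduce_so(udf, records, key):
    """Group records by key, then call the user function once per group."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    for k, group in groups.items():
        yield from udf(k, group)


def match_so(udf, left, right, left_key, right_key):
    """Hypothetical two-input operator: call the user function once for
    every pair of records from both inputs that share the same key."""
    index = defaultdict(list)
    for r in right:
        index[right_key(r)].append(r)
    for l in left:
        for r in index.get(left_key(l), []):
            yield from udf(l, r)


# First-order user functions for a word count.
def tokenize(line):
    for word in line.split():
        yield (word.lower(), 1)


def count(word, group):
    yield (word, sum(n for _, n in group))


def word_count(lines):
    pairs = map_so(tokenize, lines)
    return dict(reduce_so(count, pairs, key=lambda kv: kv[0]))


# First-order user function for joining two keyed inputs.
def pair(l, r):
    yield (l[0], l[1], r[1])
```

In this sketch, word_count(["big data", "Big pools"]) composes the map and reduce operators, while match_so(pair, left, right, ...) combines two separate inputs by key. Because each operator fixes only how records are grouped, a system could in principle choose among execution strategies for it, which is the kind of decision a cost-based optimizer makes.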