Mass Data Processing Technology on Large Scale Clusters
Summer 2007, Tsinghua University

All course material (slides, labs, etc.) is licensed under the Creative Commons Attribution License.
Many thanks to Aaron Kimball & Sierra Michels-Slettvet for their original version.
Some slides from: Jeff Dean, Sanjay Ghemawat, http://labs./papers/

Motivation
• 200+ processors
• 200+ terabyte database
• 10^10 total clock cycles
• 0.1 second response time
• 5¢ average advertising revenue
From: /~bryant/presentations/DISC-

Motivation: Large Scale Data Processing
• Want to process lots of data (> 1 TB)
• Want to parallelize across hundreds/thousands of CPUs
• … Want to make this easy
"Google Earth uses 70.5 TB: 70 TB for the raw imagery and 500 GB for the index data."
From: http://googlesystem./2006/09/how-much-data-does-google-

MapReduce
• Automatic parallelization & distribution
• Fault-tolerant
• Provides status and monitoring tools
• Clean abstraction for programmers

Programming Model
• Borrows from functional programming
• Users implement interface of two functions:
  • map (in_key, in_value) -> (out_key, intermediate_value) list
  • reduce (out_key, intermediate_value list) -> out_value list

map
• Records from the data source (lines out of files, rows of a database, etc.) are fed into the map function as key*value pairs: e.g., (filename, line).
• map() produces one or more intermediate values along with an output key from the input.

reduce
• After the map phase is over, all the intermediate values for a given output key are combined together into a list.
• reduce() combines those intermediate values into one or more final values for that same output key.
• (In practice, usually only one final value per key.)

Architecture
[architecture diagram]

Parallelism
• map() functions run in parallel, creating different intermediate values from different input data sets.
• reduce() functions also run in parallel, each working on a different output key.
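The two-function interface described above can be sketched with the classic word-count example. This is an illustrative sketch, not Google's implementation; the names `map_fn`, `reduce_fn`, and `run_mapreduce` are invented for this example, and the "shuffle" step that groups intermediate values by key is shown sequentially for clarity.

```python
# Word count expressed with the map/reduce interface from the slides.
# Illustrative sketch only — names and the sequential driver are assumptions.
from collections import defaultdict

def map_fn(in_key, in_value):
    # in_key: filename (unused here); in_value: one line of text.
    # Emits a list of (out_key, intermediate_value) pairs.
    return [(word, 1) for word in in_value.split()]

def reduce_fn(out_key, intermediate_values):
    # Combines all intermediate values for one output key into final value(s).
    return [sum(intermediate_values)]

def run_mapreduce(records, map_fn, reduce_fn):
    # Sequential stand-in for the map phase, the grouping ("shuffle"),
    # and the reduce phase.
    groups = defaultdict(list)
    for in_key, in_value in records:
        for out_key, v in map_fn(in_key, in_value):
            groups[out_key].append(v)   # group intermediate values by key
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}

result = run_mapreduce(
    [("doc1", "the quick the"), ("doc2", "quick fox")],
    map_fn, reduce_fn)
# e.g. result == {"the": [2], "quick": [2], "fox": [1]}
```

Note that reduce_fn returns a list of final values, matching the `-> out_value list` signature on the Programming Model slide, even though (as the reduce slide notes) there is usually only one final value per key.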
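The parallelism claim — independent map tasks per input record, independent reduce tasks per output key — can be sketched with a thread pool. This is a toy single-machine analogue under assumed names (`map_fn`, `reduce_fn`), not the distributed architecture the slides describe:

```python
# Sketch of map/reduce parallelism on one machine using a thread pool.
# Each map task handles one input record; each reduce task handles one key.
from concurrent.futures import ThreadPoolExecutor
from collections import defaultdict

def map_fn(filename, line):
    return [(w, 1) for w in line.split()]

def reduce_fn(word, counts):
    return sum(counts)

records = [("a", "hello world"), ("b", "hello mapreduce")]

with ThreadPoolExecutor() as pool:
    # Map phase: records processed by independent tasks in parallel.
    mapped = pool.map(lambda r: map_fn(*r), records)
    groups = defaultdict(list)
    for pairs in mapped:
        for k, v in pairs:
            groups[k].append(v)        # shuffle: group by output key
    # Reduce phase: one independent task per output key.
    reduced = dict(pool.map(lambda kv: (kv[0], reduce_fn(*kv)),
                            groups.items()))
# reduced == {"hello": 2, "world": 1, "mapreduce": 1}
```

Because map tasks share nothing and each reduce task owns exactly one key, neither phase needs locks — which is exactly what makes the model easy to distribute across thousands of CPUs.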