========================================================================
Reading Notes on MapReduce: Simplified Data Processing on Large Clusters
========================================================================

------
Origin
------
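The MapReduce abstraction is inspired by the ``map`` and ``reduce`` primitives
found in Lisp and many other functional languages.

``map`` example in Common Lisp, applying unary minus to every element::

    (map 'list #'- '(1 2 3 4)) => (-1 -2 -3 -4)

``reduce`` example, combining all elements with multiplication::

    (reduce #'* '(1 2 3 4 5)) => 120
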
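Haskell has the same functions.

``map`` example::

    Prelude> map negate [1, 2, 3]
    [-1,-2,-3]

In Haskell, reduce is called a fold. There are two basic kinds: ``foldl``,
which associates to the left, and ``foldr``, which associates to the right.
For an associative operator such as ``+`` both give the same result::

    Prelude> foldl (+) 0 [1, 2, 3]
    6
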
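
For comparison, the same ideas can be sketched in Python. This is only an
illustration: the helper names ``foldl`` and ``foldr`` are defined here to
mirror the Haskell functions and are not part of the Python standard library,
which provides only the left-associating ``functools.reduce``. Subtraction is
used to show where the two folds differ.

```python
from functools import reduce
import operator


def foldl(f, acc, xs):
    """Left fold: f(f(f(acc, x1), x2), x3)."""
    return reduce(f, xs, acc)


def foldr(f, acc, xs):
    """Right fold: f(x1, f(x2, f(x3, acc)))."""
    # Walk the list from the right, passing the element first
    # and the accumulator second.
    return reduce(lambda a, x: f(x, a), reversed(xs), acc)


negated = list(map(operator.neg, [1, 2, 3]))   # like map negate [1, 2, 3]
total = foldl(operator.add, 0, [1, 2, 3])      # like foldl (+) 0 [1, 2, 3]
left = foldl(operator.sub, 0, [1, 2, 3])       # ((0 - 1) - 2) - 3
right = foldr(operator.sub, 0, [1, 2, 3])      # 1 - (2 - (3 - 0))
```

With a non-associative operator the two folds disagree: ``left`` is ``-6``
while ``right`` is ``2``.
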
-------
Example
-------
The paper's running example counts the number of occurrences of each word in a
large collection of documents (pseudocode)::

    map(String key, String value):
        // key: document name
        // value: document contents
        for each word w in value:
            EmitIntermediate(w, "1");

    reduce(String key, Iterator values):
        // key: a word
        // values: a list of counts
        int result = 0;
        for each v in values:
            result += ParseInt(v);
        Emit(AsString(result));

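
The pseudocode above can be tried out with a small single-process simulation
in Python. This is a minimal sketch, not the paper's distributed
implementation: the driver ``run_mapreduce``, the function names ``map_fn``
and ``reduce_fn``, and the sample documents are all assumptions made for
illustration. The driver only imitates what the MapReduce library does between
the two phases, which is to group all intermediate values by key.

```python
from collections import defaultdict


def map_fn(key, value):
    # key: document name (unused here), value: document contents
    for word in value.split():
        yield word, "1"


def reduce_fn(key, values):
    # key: a word, values: an iterable of counts encoded as strings
    return str(sum(int(v) for v in values))


def run_mapreduce(documents, map_fn, reduce_fn):
    # Hypothetical in-memory driver: run every map call, group the
    # intermediate pairs by key, then run one reduce call per key.
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for k, v in map_fn(name, contents):
            intermediate[k].append(v)
    return {k: reduce_fn(k, vs) for k, vs in sorted(intermediate.items())}


# Made-up sample input.
docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

As in the pseudocode, counts travel as strings, so ``counts["the"]`` is
``"3"`` rather than the integer ``3``.
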