I have just started learning Hadoop. A book I am reading gives an example that I don't fully understand.

Example:
Consider processing 200 GB of data with 50 nodes, in which each node processes 4 GB
of data located on a local disk. Each node takes 80 seconds to read the data (at the
rate of 50 MB per second). No matter how fast we compute, we cannot finish in under 80
seconds. Assume that the result of the process is a total dataset of size 200 MB, and
each node generates 4 MB of this result, which is transferred over a 1 Gbps (1 MB per
packet) network to a single node for display. It will take about 3 milliseconds to
transfer the data to the destination node (each 1 MB requires 250 microseconds to
transfer over the network, and the network latency per packet is assumed to be 500
microseconds, based on the previously referenced talk by Dr. Jeff Dean). Ignoring
computational costs, the total processing time cannot be under 40.003 seconds.
In the example above, I can't work out how the 3-millisecond transfer time and the 40.003-second total processing time are derived.
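As far as I can tell, the 3 ms figure comes from treating each node's 4 MB result as four 1 MB packets, each costing 250 microseconds of transfer time plus 500 microseconds of latency. But when I add that to the 80-second read time, I get 80.003 seconds, not 40.003. Here is the arithmetic I tried, using only the numbers quoted above (the variable names are my own):

```python
# Reproducing the book's figures from the numbers it quotes.

# Network phase: each node sends its 4 MB result as four 1 MB packets.
packets_per_node = 4
transfer_us_per_packet = 250   # "each 1 MB requires 250 microseconds to transfer"
latency_us_per_packet = 500    # "network latency per packet is ... 500 microseconds"

transfer_ms = packets_per_node * (transfer_us_per_packet + latency_us_per_packet) / 1000
print(transfer_ms)             # 3.0 -- matches the book's "about 3 milliseconds"

# Read phase: each node reads 4 GB (~4000 MB) from local disk at 50 MB/s.
read_s = 4000 / 50
print(read_s)                  # 80.0 -- matches the book's "80 seconds"

# Total, ignoring computational costs, as the book does.
total_s = read_s + transfer_ms / 1000
print(total_s)                 # 80.003 -- the book says 40.003
```

So the 3 ms part seems to check out, but I end up with 80.003 seconds rather than 40.003. Is the book's 40.003 a typo, or is there a step I am missing?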