Question

Can I use the Lua programming language with Hadoop?

If so, how?

Answer 1

Absolutely :) You can use Hadoop Streaming like this:

Create mapper and/or reducer scripts in Lua that read from stdin:

#!/usr/bin/env lua
-- Read stdin line by line; io.read() returns nil at end of input.
while true do
  local line = io.read()
  if line == nil then break end

  -- Do something with the incoming row

end
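
As a more concrete instance of the skeleton above, here is a minimal word-count mapper sketch. Word count is only an assumed example task (the question does not name one), and the helper name `words` is made up for illustration. It relies on Hadoop Streaming's default convention that the text before the first tab on an output line is the key and the rest is the value.

```lua
#!/usr/bin/env lua
-- Word-count mapper sketch for Hadoop Streaming (illustrative, not a reference implementation).

-- Hypothetical helper: split a line into whitespace-separated words.
local function words(line)
  local out = {}
  for w in line:gmatch("%S+") do
    out[#out + 1] = w
  end
  return out
end

-- Emit one "word<TAB>1" line per word read from stdin.
for line in io.lines() do
  for _, w in ipairs(words(line)) do
    io.write(w, "\t1\n")
  end
end
```

The script needs to be executable (for example via chmod +x) if it is passed to -mapper by name, since the shebang line is what selects the Lua interpreter.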

Then launch the job like this:

$HADOOP_HOME/bin/hadoop  jar $HADOOP_HOME/hadoop-streaming.jar \
    -input myInputDirs \
    -output myOutputDir \
    -mapper myMapper.lua \
    -reducer myReducer.lua \
    -file /local/path/to/myMapper.lua \
    -file /local/path/to/myReducer.lua

Here, -mapper and -reducer name the mapper and reducer scripts, and -file ships each script to the distributed cache so that every task tracker can access it.
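
For the reducer side, here is a sketch that assumes a mapper emitting `word<TAB>1` lines (word count is an assumed example task, not something the question specifies). Streaming hands the reducer its input as lines sorted by key, so the script has to notice for itself when the key changes. The function name `reduce` and its `emit` callback are illustrative choices, not part of any Hadoop API.

```lua
#!/usr/bin/env lua
-- Word-count reducer sketch for Hadoop Streaming (illustrative).
-- Expects "key<TAB>count" lines, already sorted by key.

-- lines: iterator returning one input line per call (nil at end).
-- emit:  callback invoked once per key with the summed count.
local function reduce(lines, emit)
  local current, total = nil, 0
  for line in lines do
    local key, value = line:match("^([^\t]*)\t(.*)$")
    local n = tonumber(value)
    if key and n then            -- skip lines that are not "key<TAB>number"
      if key ~= current then
        if current ~= nil then emit(current, total) end
        current, total = key, 0
      end
      total = total + n
    end
  end
  if current ~= nil then emit(current, total) end  -- flush the last key
end

reduce(io.lines(), function(key, total)
  io.write(key, "\t", total, "\n")
end)
```

Keeping only the current key and a running total in memory means the script does not need to hold the whole input, which is the usual reason for relying on the sorted order.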

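When you use streaming, you need to make sure lua is installed on every machine that runs a task tracker.

A while back, we tried using luajit (which is very fast) for streaming from Pig. If you use Pig, you can do the following: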

 OP = stream IP through `/local/path/to/script`;

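This is not the same as using lua as the mapper or reducer. Instead, depending on where the operation sits, the output of the mapper or the reducer is streamed through the script.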
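
A script used this way can be an ordinary stdin-to-stdout filter. The sketch below assumes Pig's default streaming format, in which each tuple arrives as one tab-delimited line. The upper-casing of the first field is a made-up transformation, and `split_tabs` is a hypothetical helper.

```lua
#!/usr/bin/env lua
-- Filter sketch for use with Pig's stream operator (illustrative).

-- Hypothetical helper: split a line on tabs, keeping empty fields.
local function split_tabs(line)
  local fields = {}
  for f in (line .. "\t"):gmatch("([^\t]*)\t") do
    fields[#fields + 1] = f
  end
  return fields
end

for line in io.lines() do
  local fields = split_tabs(line)
  -- Made-up transformation: upper-case the first field, pass the rest through.
  fields[1] = fields[1]:upper()
  io.write(table.concat(fields, "\t"), "\n")
end
```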

Answer 2

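I have never used Lua, nor the streaming side of Hadoop, so this is only a suggestion and I am not sure it will work:

Take a look at http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/ and use Lua in place of Python?

If I were going to try what you are asking, that would be my starting point.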
