I am running the tutorial from the docs. The word count works on a local file, but when I try
python mr.py -r hadoop 1.txt
it just hangs.
When I interrupt it from the keyboard, the log is:
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory /var/folders/zv/1hqhxh0n6m374cwzysmdn6zc0000gn/T/mr.yd006t.20150508.194506.047719
writing wrapper script to /var/folders/zv/1hqhxh0n6m374cwzysmdn6zc0000gn/T/mr.yd006t.20150508.194506.047719/setup-wrapper.sh
Using Hadoop version 2.7.0
Copying local files into hdfs:///user/yd006t/tmp/mrjob/mr.yd006t.20150508.194506.047719/files/
^CTraceback (most recent call last):
  File "mr.py", line 16, in <module>
    MRWordFrequencyCount.run()
  File "/Library/Python/2.7/site-packages/mrjob/job.py", line 461, in run
    mr_job.execute()
  File "/Library/Python/2.7/site-packages/mrjob/job.py", line 479, in execute
    super(MRJob, self).execute()
  File "/Library/Python/2.7/site-packages/mrjob/launch.py", line 151, in execute
    self.run_job()
  File "/Library/Python/2.7/site-packages/mrjob/launch.py", line 214, in run_job
    runner.run()
  File "/Library/Python/2.7/site-packages/mrjob/runner.py", line 464, in run
    self._run()
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 237, in _run
    self._run_job_in_hadoop()
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 339, in _run_job_in_hadoop
    self._process_stderr_from_streaming(master)
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 388, in _process_stderr_from_streaming
    for line in treat_eio_as_eof(stderr):
  File "/Library/Python/2.7/site-packages/mrjob/hadoop.py", line 381, in treat_eio_as_eof
    yield iter.next() # okay for StopIteration to bubble up
KeyboardInterrupt
This is what is in mr.py:
from mrjob.job import MRJob


class MRWordFrequencyCount(MRJob):

    def mapper(self, _, line):
        yield "chars", len(line)
        yield "words", len(line.split())
        yield "lines", 1

    def reducer(self, key, values):
        yield key, sum(values)


if __name__ == '__main__':
    MRWordFrequencyCount.run()
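
As a sanity check that the mapper/reducer logic above is not what hangs, here is a minimal sketch of the same computation in plain Python, with no mrjob or Hadoop involved. The function names `mapper`, `reducer` and `run_locally` are hypothetical helpers for this illustration, not part of mrjob, and the grouping step is only a rough stand-in for what the framework does between map and reduce.

```python
from collections import defaultdict


def mapper(line):
    # Same three key/value pairs the MRJob mapper yields for each input line.
    yield "chars", len(line)
    yield "words", len(line.split())
    yield "lines", 1


def reducer(key, values):
    # Same aggregation as the MRJob reducer: sum all values seen for a key.
    return key, sum(values)


def run_locally(lines):
    # Hypothetical stand-in for the shuffle step: group mapper output by key,
    # then hand each group to the reducer.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Two sample lines of 11 characters each, 5 words in total.
    print(run_locally(["hello world", "foo bar baz"]))
```

If this gives sensible counts, as the local mrjob run does, the problem is more likely in the Hadoop step that the log stops at ("Copying local files into hdfs:///...") than in the job code itself.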