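I have a script that reads text line by line, modifies each line slightly, and then writes the line out to a file. Reading the text works fine; the problem is that I cannot write the text out. Here is my code.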
cat = subprocess.Popen(["hadoop", "fs", "-cat", "/user/test/myfile.txt"], stdout=subprocess.PIPE)
for line in cat.stdout:
    line = line+"Blah";
    subprocess.Popen(["hadoop", "fs", "-put", "/user/test/moddedfile.txt"], stdin=line)
This is the error I get.
AttributeError: 'str' object has no attribute 'fileno'
cat: Unable to write to output stream.
Answer 0 (score: 5)
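The stdin argument does not accept a string. It should be PIPE, None, or an existing file (something with a valid .fileno() method, or an integer file descriptor).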
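To make the rule above concrete, here is a minimal sketch that does not need Hadoop. The child command is a stand-in invented for illustration (a small Python one-liner that upper-cases its input), and the helper names run_with_pipe and string_as_stdin_fails are hypothetical. It shows that data goes through stdin=PIPE, while passing a string as stdin raises the AttributeError from the question.

```python
import subprocess
import sys

# Stand-in child process (an assumption for illustration, not a hadoop command):
# it upper-cases whatever arrives on its standard input.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def run_with_pipe(data):
    """Send bytes to the child through a pipe and return what it prints."""
    proc = subprocess.Popen(CHILD, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(data)  # writes data, closes stdin, waits for exit
    return out


def string_as_stdin_fails():
    """Return True if passing a plain string as stdin raises AttributeError."""
    try:
        subprocess.Popen(CHILD, stdin="some text")
    except AttributeError:  # 'str' object has no attribute 'fileno'
        return True
    return False
```

For a single small payload, communicate() is usually the simplest option; writing to the pipe line by line, as in the code for this answer, suits streaming input better.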
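from subprocess import Popen, PIPE

# Reader: streams the HDFS file to a pipe.
cat = Popen(["hadoop", "fs", "-cat", "/user/test/myfile.txt"],
            stdout=PIPE, bufsize=-1)
# Writer: "-" tells "hadoop fs -put" to read the data from stdin.
put = Popen(["hadoop", "fs", "-put", "-", "/user/test/moddedfile.txt"],
            stdin=PIPE, bufsize=-1)
for line in cat.stdout:
    # The pipe yields bytes on Python 3, so append a bytes literal.
    # Note: line still ends with its newline, so the suffix lands after it.
    line += b"Blah"
    put.stdin.write(line)

cat.stdout.close()
cat.wait()
put.stdin.close()  # signals end of input to the writer
put.wait()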
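The same two-process pattern can be sketched as a reusable function, which makes it easy to try out with stand-in commands before pointing it at a cluster. This is only a sketch under assumptions: pipe_with_suffix is a hypothetical helper, not part of subprocess or Hadoop, and unlike the code above it strips the line ending before appending the suffix so the suffix stays on the same line.

```python
import subprocess
import sys


def pipe_with_suffix(source_cmd, sink_cmd, suffix=b"Blah"):
    """Stream lines from source_cmd into sink_cmd, appending suffix to each.

    Returns the exit codes of both processes as (source_rc, sink_rc).
    """
    source = subprocess.Popen(source_cmd, stdout=subprocess.PIPE)
    sink = subprocess.Popen(sink_cmd, stdin=subprocess.PIPE)
    for line in source.stdout:
        # Drop the line ending first so the suffix is not pushed to the next line.
        sink.stdin.write(line.rstrip(b"\r\n") + suffix + b"\n")
    source.stdout.close()
    source_rc = source.wait()
    sink.stdin.close()  # end of input for the sink
    sink_rc = sink.wait()
    return source_rc, sink_rc
```

With the commands from this answer, the call would be pipe_with_suffix(["hadoop", "fs", "-cat", "/user/test/myfile.txt"], ["hadoop", "fs", "-put", "-", "/user/test/moddedfile.txt"]). Checking both return codes is a cheap way to notice when either hadoop command failed.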
Answer 1 (score: 2)
A quick and dirty way to get your code working:
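import subprocess
from tempfile import NamedTemporaryFile

cat = subprocess.Popen(["hadoop", "fs", "-cat", "/user/test/myfile.txt"],
                       stdout=subprocess.PIPE)

with NamedTemporaryFile() as f:
    # Collect the modified lines in a local temporary file first.
    for line in cat.stdout:
        f.write(line + b'Blah')  # bytes literal: the pipe yields bytes on Python 3
    f.flush()  # make sure everything is on disk before hadoop reads it
    f.seek(0)
    cat.wait()

    # Upload the temporary file by name while it still exists.
    put = subprocess.Popen(["hadoop", "fs", "-put", f.name, "/user/test/moddedfile.txt"],
                           stdin=f)
    put.wait()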
But I would suggest looking at the hdfs / webhdfs Python libraries.
For example, pywebhdfs.