Question

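I am trying to run a grep command from my Python module using the subprocess library. Because I am doing this on a doc file, I use the third-party Catdoc tool to extract its contents as plain text, and I want to store that text in a file. I don't know where I am going wrong, but the program fails to generate the plain text file, so I never get the grep result. I have gone through the error log, but it is empty. Thanks for your help.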

def search_file(name, keyword):
    #Extract and save the text from doc file
    catdoc_cmd = ['catdoc', '-w' , name, '>', 'testing.txt']
    catdoc_process = subprocess.Popen(catdoc_cmd, stdout=subprocess.PIPE,stderr=subprocess.PIPE, shell=True)
    output = catdoc_process.communicate()[0]
    grep_cmd = []
    #Search the keyword through the text file
    grep_cmd.extend(['grep', '%s' %keyword , 'testing.txt'])
    print grep_cmd
    p = subprocess.Popen(grep_cmd,stdout=subprocess.PIPE,stderr=subprocess.PIPE, shell=True)
    stdoutdata = p.communicate()[0]
    print stdoutdata

Answer 1

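On UNIX, specifying shell=True causes the first argument to be treated as the command to execute, with all subsequent arguments treated as arguments to the shell itself. So the > has no effect: with /bin/sh -c, everything after the command string is handed to the shell as positional parameters, which this command never uses, so it is effectively ignored.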

So you should actually use

catdoc_cmd = ['catdoc -w "%s" > testing.txt' % name]
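To see the difference on a POSIX system, here is a minimal sketch in Python 3 syntax. It uses `echo` as a stand-in for `catdoc` (an assumption, so that it runs without catdoc installed), and the helper names `run_with_list`, `run_with_string`, and `demo` are made up for illustration.

```python
import os
import shlex
import subprocess
import tempfile


def run_with_list(out_path):
    # With shell=True and a list, only 'echo' is the command string;
    # 'hello', '>' and out_path become the shell's positional parameters,
    # so nothing is redirected and echo prints just a newline.
    proc = subprocess.Popen(['echo', 'hello', '>', out_path],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            shell=True)
    stdout, _ = proc.communicate()
    return stdout


def run_with_string(out_path):
    # A single string reaches /bin/sh -c intact, so '>' is honoured.
    # shlex.quote protects paths that contain spaces or shell metacharacters.
    cmd = 'echo hello > %s' % shlex.quote(out_path)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, shell=True)
    stdout, _ = proc.communicate()
    return stdout


def demo():
    # Run both variants against a fresh temporary path and report what happened.
    out_path = os.path.join(tempfile.mkdtemp(), 'out.txt')
    list_stdout = run_with_list(out_path)
    created_by_list = os.path.exists(out_path)
    string_stdout = run_with_string(out_path)
    with open(out_path) as handle:
        content = handle.read()
    return list_stdout, created_by_list, string_stdout, content
```

In the list form the file is never created, which matches the symptom in the question: no text file and no error message.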

A better solution might be simply to read the text from the subprocess's stdout and process it with re or Python string operations:

catdoc_cmd = ['catdoc', '-w' , name]
catdoc_process = subprocess.Popen(catdoc_cmd, stdout=subprocess.PIPE,stderr=subprocess.PIPE)
for line in catdoc_process.stdout:
    if keyword in line:
        print line.strip()
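The snippet above uses a plain substring test. If a regular expression is needed instead, a variant using `re` might look like the following sketch. It is written in Python 3 syntax, the function name `grep_command_output` is hypothetical, and the command is passed in as a parameter so the same helper works for `catdoc` or any other program that writes text to stdout.

```python
import re
import subprocess


def grep_command_output(cmd, pattern):
    """Run cmd (an argument list, no shell) and return the lines matching pattern."""
    regex = re.compile(pattern)
    # universal_newlines=True makes communicate() return text instead of bytes.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    stdout, _ = proc.communicate()
    return [line for line in stdout.splitlines() if regex.search(line)]


# Hypothetical usage (file name and pattern are placeholders):
# matches = grep_command_output(['catdoc', '-w', 'report.doc'], r'budget')
```

Using `communicate()` here reads stdout and stderr together, which avoids blocking if the child process writes a lot to stderr.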

Answer 2

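I think you are trying to pass the > to the shell, but that is not going to work the way you have done it. If you want to spawn a process, you should arrange for its standard output to be redirected. Fortunately, that is easy to do: open the file you want the output written to and pass it to Popen as the stdout keyword argument. Passing PIPE instead attaches stdout to a pipe that you read with communicate().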
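A minimal sketch of that approach, in Python 3 syntax, could look like this. The function names `run_to_file` and `grep_file` are made up for illustration, and the file name in the usage comment is a placeholder.

```python
import subprocess


def run_to_file(cmd, out_path):
    """Run cmd without a shell, sending its standard output to out_path."""
    with open(out_path, 'wb') as out_file:
        # The open file object replaces PIPE, so the child writes straight to disk.
        proc = subprocess.Popen(cmd, stdout=out_file, stderr=subprocess.PIPE)
        _, stderr = proc.communicate()
    return proc.returncode, stderr


def grep_file(keyword, path):
    """Return grep's output for keyword in path, as text."""
    # No shell is involved; '-e' marks keyword as the pattern even if it starts with '-'.
    proc = subprocess.Popen(['grep', '-e', keyword, path],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    stdout, _ = proc.communicate()
    return stdout


# Hypothetical usage mirroring the question:
# run_to_file(['catdoc', '-w', 'report.doc'], 'testing.txt')
# print(grep_file('keyword', 'testing.txt'))
```

Returning the captured stderr and the exit code makes it easier to see why a command produced no output, which the empty error log in the question did not reveal.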

Python subprocess library: running the grep command from Python
