Question

I have a folder with a large number of files, e.g. file_1.gz to file_250.gz, and the number keeps growing.

The zgrep command that searches them looks like this:

zgrep -Pi "\"name\": \"bob\"" ../../LM/DATA/file_*.gz

I want to execute this command in a Python subprocess, like this:

out_file = os.path.join(out_file_path, file_name)
search_command = ['zgrep', '-Pi', '"name": "bob"', '../../LM/DATA/file_*.gz']
process = subprocess.Popen(search_command, stdout=out_file)

The problem is that out_file is created but stays empty, and this error is raised:

<type 'exceptions.AttributeError'>
'str' object has no attribute 'fileno'

What is the solution?

Answer 1

You need to pass a file object:

process = subprocess.Popen(search_command, stdout=open(out_file, 'w'))

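Quoting the manual:

stdin, stdout and stderr specify the executed program's standard input, standard output and standard error file handles, respectively. Valid values are PIPE, an existing file descriptor (a positive integer), an existing file object, and None. PIPE indicates that a new pipe to the child should be created. With the default settings of None, no redirection will occur; the child's file handles will be inherited from the parent.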

Combining this with LFJ's answer: using the convenience functions is recommended, and you need shell=True to make the wildcard (*) work:

subprocess.call(' '.join(search_command), stdout=open(out_file, 'w'), shell=True)

Or, since you are using the shell anyway, you can use shell redirection:

subprocess.call("%s > %s" % (' '.join(search_command), out_file), shell=True)
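A caveat with the two shell=True variants above: ' '.join(search_command) drops the list's argument boundaries, so the shell sees "name": and "bob" as two separate words instead of one pattern. A minimal sketch of one way around that, assuming Python 3 (where shlex.quote is available) and a glob pattern without spaces; build_shell_command is a hypothetical helper name, not something from the answer. It quotes every fixed argument and leaves only the glob pattern unquoted so the shell can still expand it:

```python
import shlex


def build_shell_command(fixed_args, glob_pattern):
    """Return a shell command string (hypothetical helper, not from the answer).

    Each fixed argument is quoted so the shell passes it through unchanged;
    the glob pattern is appended as-is so the shell still expands the *.
    """
    quoted = [shlex.quote(arg) for arg in fixed_args]
    return " ".join(quoted + [glob_pattern])


command = build_shell_command(
    ["zgrep", "-Pi", '"name": "bob"'],
    "../../LM/DATA/file_*.gz",
)
print(command)  # zgrep -Pi '"name": "bob"' ../../LM/DATA/file_*.gz
```

The resulting string can then be passed to subprocess.call(command, stdout=..., shell=True) in place of the joined list.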

Answer 2

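There are two issues:

You should pass something with a valid .fileno() method (an open file object) instead of the filename.
The shell expands *, but subprocess does not invoke the shell unless you ask for it. You can use glob.glob() to expand the file pattern manually.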

Example:

#!/usr/bin/env python
import os
from glob import glob
from subprocess import check_call

search_command = ['zgrep', '-Pi', '"name": "bob"'] 
out_path = os.path.join(out_file_path, file_name)
with open(out_path, 'wb', 0) as out_file:
    check_call(search_command + glob('../../LM/DATA/file_*.gz'), 
               stdout=out_file)
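Two edge cases the example above leaves open: if the pattern matches no files, zgrep is started without file arguments and reads standard input instead, and grep-style tools exit with status 1 when nothing matches, which check_call reports as CalledProcessError. A minimal sketch handling both, assuming Python 3; the helper names expand_files and search_to_file are illustrative and not part of the answer:

```python
import subprocess
from glob import glob


def expand_files(pattern):
    """Expand a glob pattern, failing loudly when nothing matches."""
    files = sorted(glob(pattern))  # lexicographic order, for reproducible output
    if not files:
        raise FileNotFoundError("no files match %r" % pattern)
    return files


def search_to_file(search_command, pattern, out_path):
    """Run search_command over the matching files, writing stdout to out_path.

    Returns True if the tool exited with status 0 and False if it exited
    with status 1 (the grep convention for "no lines selected").
    Any other exit status is raised as CalledProcessError.
    """
    files = expand_files(pattern)
    with open(out_path, "wb") as out_file:
        result = subprocess.run(search_command + files, stdout=out_file)
    if result.returncode not in (0, 1):
        raise subprocess.CalledProcessError(result.returncode, result.args)
    return result.returncode == 0
```

With the names from the question, a call might look like search_to_file(['zgrep', '-Pi', '"name": "bob"'], '../../LM/DATA/file_*.gz', out_path).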

Answer 3

If you want to execute a shell command and get its output, try subprocess.check_output(). It is very simple, and you can easily save the output to a file.

command_output = subprocess.check_output(your_search_command, shell=True)
with open(out_file, 'a') as f:
    f.write(command_output)
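On Python 3, check_output() returns bytes, so the file has to be opened in binary mode (or the output decoded) for the snippet above to work. A minimal sketch under that assumption, also assuming a POSIX shell; capture_to_file is a hypothetical helper name:

```python
import subprocess


def capture_to_file(shell_command, out_path):
    """Run shell_command through the shell and append its stdout to out_path.

    Returns the number of bytes written.
    """
    # check_output raises CalledProcessError on a non-zero exit status,
    # which for grep-style tools includes "no match" (status 1).
    output = subprocess.check_output(shell_command, shell=True)
    # output is bytes on Python 3, so append in binary mode.
    with open(out_path, "ab") as f:
        f.write(output)
    return len(output)
```

Note that this holds the entire output in memory before writing it; for large result sets, redirecting stdout straight to the file avoids that.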

Answer 4

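My problem had two parts:

The first part was also answered by @liborm.
The second part was related to the files that zgrep tries to search. When we type a command like zgrep "pattern" path/to/files/*.gz, the shell automatically replaces *.gz with all files ending in .gz. When I ran the command in a subprocess, nothing replaced *.gz with the real files, and as a result the error gzip: ../../LM/DATA/file_*.gz: No such file or directory was raised. So I solved it with: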
```
for file in os.listdir(archive_files_path):
    if file.endswith(".gz"):
        search_command.append(os.path.join(archive_files_path, file))
```
