How can I read the contents of individual files from a command's stdout without touching the disk?
I came up with something like this:
    def get_files_from(sha, files):
        from subprocess import Popen, PIPE
        import tarfile
        p = Popen(["git", "archive", sha], bufsize=10240, stdin=PIPE, stdout=PIPE, stderr=PIPE)
        tar = tarfile.open(fileobj=p.stdout, mode='r|')
        p.communicate()
        members = tar.getmembers()
        names = tar.getnames()
        contents = {}
        for fname in files:
            if fname not in names:
                contents[fname] = None
                continue
            else:
                idx = names.index(fname)
                contents[fname] = members[idx].tobuf()
                contents[fname] = tar.extractfile(members[idx]) #<--- HERE
        tar.close()
        return contents
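The problem is that adding .read() to the call on the line marked HERE:
    contents[fname] = tar.extractfile(members[idx]) #<--- HERE
gives this error:
    tarfile.StreamError: seeking backwards is not allowed
So how do I get the contents of the files?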
Answer 0: (score: 3)
You misspelled the mode= parameter and wrote more= instead. The call should read:
    tar = tarfile.open(fileobj=p.stdout, mode='r|')
With the mode specified correctly, .tell() is never called. :-)
You then have to loop over the tarfile object to extract members; in stream mode you cannot read arbitrary files from the archive:
    for entry in tar:
        # test if this is a file you want.
        if entry.name in files:
            f = tar.extractfile(entry)
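You cannot use any of the .getnames(), .getmember() or .getmembers() methods, because they require a full scan of the archive. That leaves the file pointer at the end of the stream, and since a stream cannot seek backwards, you can no longer read the entry data itself.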
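To make the single-pass constraint concrete without needing a git repository, here is a minimal sketch. It builds a small tar archive in memory and feeds it to tarfile through an object that only supports read(), roughly how a pipe behaves. The names PipeLike, make_archive, read_members and read_after_scan are hypothetical helpers invented for this illustration; they are not part of tarfile or of the code in the question.

```python
import io
import tarfile


class PipeLike:
    """Hypothetical stand-in for p.stdout: it can be read, but not seeked."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, n=-1):
        return self._buf.read(n)


def make_archive(files):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_members(stream, wanted):
    """Single forward pass: read each wanted entry as soon as it appears."""
    found = {}
    with tarfile.open(fileobj=stream, mode='r|') as tar:
        for entry in tar:
            if entry.isfile() and entry.name in wanted:
                found[entry.name] = tar.extractfile(entry).read()
    return found


def read_after_scan(stream):
    """Scan the whole stream first, then try to go back to the first entry.

    Returns True if tarfile refuses with StreamError.
    """
    with tarfile.open(fileobj=stream, mode='r|') as tar:
        entries = [entry for entry in tar]  # consumes the stream to its end
        try:
            tar.extractfile(entries[0]).read()  # data now lies behind us
        except tarfile.StreamError:
            return True
    return False


archive = make_archive({'README.MD': b'hello\n', 'foo': b'bar\n'})
print(read_members(PipeLike(archive), {'foo'}))
print(read_after_scan(PipeLike(archive)))
```

The second function mirrors what the question's code does: once every header has been read, the data of an earlier entry can only be reached by seeking backwards, which stream mode does not allow.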
Answer 1: (score: 0)
For anyone interested:
    from subprocess import Popen, PIPE
    import tarfile

    def get_files_from(sha, files):
        # Stream the output of `git archive` straight into tarfile; nothing touches the disk.
        p = Popen(["git", "archive", sha], bufsize=10240, stdout=PIPE)
        tar = tarfile.open(fileobj=p.stdout, mode='r|')
        contents = {}
        doall = files == '*'
        if not doall:
            files = set(files)
        # In stream mode the entries must be read in order, in a single pass.
        for entry in tar:
            if not entry.isfile():
                continue  # extractfile() returns None for directories
            if doall or entry.name in files:
                contents[entry.name] = tar.extractfile(entry).read()
                if not doall:
                    files.discard(entry.name)
        if not doall:
            # Whatever is left in the set was not found in the archive.
            for fname in files:
                contents[fname] = None
        tar.close()
        p.stdout.close()
        p.wait()  # wait for git only after the stream has been consumed
        return contents

    print(get_files_from("a8c11fcee68881dfb86095aa36290fb304047cf1", ['README.MD', 'foo']))
    print(get_files_from("a8c11fcee68881dfb86095aa36290fb304047cf1", '*'))
Patches welcome!