I am trying to add the dependencies in a list to a requirements.txt file, depending on the platform the software will run on. So I wrote the following code:
if platform.system() == 'Windows':
    # Add windows only requirements
    platform_specific_req = [req1, req2]
elif platform.system() == 'Linux':
    # Add linux only requirements
    platform_specific_req = [req3]
with open('requirements.txt', 'a+') as file_handler:
    for requirement in platform_specific_req:
        already_in_file = False
        # Make sure the requirement is not already in the file
        for line in file_handler.readlines():
            line = line.rstrip() # remove '\n' at end of line
            if line == requirement:
                already_in_file = True
                break
        if not already_in_file:
            file_handler.write('{0}\n'.format(requirement))
    file_handler.close()
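What happens with this code is that, when the second requirement is looked up in the list of requirements already in the file, the loop for line in file_handler.readlines(): seems to be pointing at the last element of the file. So the new requirement is only compared with the last element in the list and, if it is not the same, it gets added. Clearly this leaves several elements duplicated in the list, since only the first requirement is compared against every element. How can I tell Python to start comparing from the top of the file again?
Solution: I received a lot of great responses and learned a lot, thanks guys. I ended up combining two solutions, one from Antti Haapala and one from Matthew Franglen. I am showing the final code here for reference: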
# Append the extra requirements to the requirements.txt file
with open('requirements.txt', 'r') as file_in:
    reqs_in_file = set([line.rstrip() for line in file_in])
    missing_reqs = set(platform_specific_reqs).difference(reqs_in_file)
with open('requirements.txt', 'a') as file_out:
    for req in missing_reqs:
        file_out.write('{0}\n'.format(req))
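Answer 0 (score: 1)
You open the file handle before iterating over the list of requirements, and then you read the entire file handle once per requirement.
The file handle is exhausted after the first requirement because you never reopen it. Reopening the file on every iteration would be very wasteful; read the file into a list once and use that list inside the loop. Or do a set comparison!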
file_content = set([line.rstrip() for line in file_handler])
only_in_platform = set(platform_specific_req).difference(file_content)
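To put those two lines in context, here is a minimal self-contained sketch. The helper name `missing_requirements` and the sample package names are made up for illustration and are not from the question. It uses a list comprehension rather than `set.difference`, on the assumption that you want to keep the requirements in their original order, which a plain set difference does not guarantee.

```python
import os
import tempfile


def missing_requirements(path, wanted):
    """Return the entries of `wanted` that are not yet lines of the file at `path`."""
    with open(path, 'r') as file_handler:
        # Read the file once; membership tests against a set are cheap.
        file_content = set(line.rstrip() for line in file_handler)
    # A list comprehension (instead of set.difference) keeps the order of `wanted`.
    return [req for req in wanted if req not in file_content]


# Hypothetical demo data written to a temporary file.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('requests\nsix\n')

demo_missing = missing_requirements(tmp.name, ['six', 'pywin32', 'colorama'])
print(demo_missing)
os.remove(tmp.name)
```

Only the entries that are not already in the file come back, so they can be appended afterwards with the file opened in 'a' mode.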
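Answer 1 (score: 1)
Do not try to read the file again for every requirement. While appending does work for this use case, for modifications in general it is easier to read the whole file into a list, change the list, and write it back out.
So, for example: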
with open('requirements.txt', 'r') as fin:
    requirements = [ i for i in (line.strip() for line in fin) if i ]
for req in platform_specific_req:
    if req not in requirements:
        requirements.append(req)
with open('requirements.txt', 'w') as fout:
    for req in requirements:
        fout.write('{0}\n'.format(req))
        # or print(req, file=fout)
Answer 2 (score: 1)
The answer to your explicit question: file_handler.seek(0) will seek back to the beginning of the file.
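As a rough illustration of why the second read finds nothing and what seek(0) changes, here is a small sketch. The temporary file and the variable names are only for demonstration, and it opens the file in plain read mode to keep things simple:

```python
import os
import tempfile

# Hypothetical demo file with three lines.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('req1\nreq2\nreq3\n')

with open(tmp.name, 'r') as file_handler:
    first_pass = file_handler.readlines()   # reads everything, leaves the position at EOF
    second_pass = file_handler.readlines()  # nothing left to read, so this is empty
    file_handler.seek(0)                    # move back to the start of the file
    third_pass = file_handler.readlines()   # the full content again

print(len(first_pass), len(second_pass), len(third_pass))
os.remove(tmp.name)
```

The second call returns an empty list because the position is already at the end of the file. Only after seek(0) does reading start from the top again.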
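Some neat improvements:
You can use the file handler itself as an iterator instead of calling the readlines() method.
If your file is too large to read into memory completely, iterating over the lines of the file directly is fine, but you should change the way you go about it. As it stands, you iterate over the whole file for every requirement, and IO is expensive. You should iterate over the lines once and, for each line, check whether it is one of the requirements. Like so: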
with open('requirements.txt', 'a+') as file_handler:
    file_handler.seek(0)  # in Python 3, 'a+' starts positioned at the end of the file
    for line in file_handler:
        line = line.rstrip()
        if line in platform_specific_req:
            platform_specific_req.remove(line)
    for req in platform_specific_req:
        file_handler.write('{0}\n'.format(req))
Answer 3 (score: 0)
I know I am answering a bit late, but I would suggest doing it this way: open the file once, read it, and append. Note that this should work on every platform, whatever your system is:
def ensure_in_file(lines, file_path):
    '''
    idempotent function to append lines to a file if they're not already there
    '''
    with open(file_path, 'r+') as f: # r+ allows reading and then writing at the end
        # set of all lines in the file, less newlines, and trailing spaces too.
        file_lines = set(l.rstrip() for l in f)
        # write lines not in the file; text mode translates '\n' to the OS line separator
        f.writelines(l + '\n' for l in set(lines).difference(file_lines))
You can test it with:
a_file = '/temp/temp/foo/bar' # insert your own file path here.
# with open(a_file, 'w') as f: # ensure a blank file
#     pass
ensure_in_file(['this', 'that'], a_file)
with open(a_file, 'r') as f:
    print(f.read())
ensure_in_file(['this', 'that'], a_file)
with open(a_file, 'r') as f:
    print(f.read())
Each print call should show that the file contains each line exactly once.
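If you would rather check this with assertions than by eye, here is a Python 3 sketch of the same idea run against a temporary file. The name `append_missing_lines` and the use of `tempfile` are my own choices for the demonstration, not part of the answer above, and it assumes the file already exists and ends with a newline:

```python
import os
import tempfile


def append_missing_lines(lines, file_path):
    """Append each entry of `lines` to the file unless it is already present."""
    with open(file_path, 'r+') as f:
        existing = set(line.rstrip() for line in f)
        # Reading to the end leaves the position at EOF, so writes are appended.
        # Text mode translates '\n' to the platform's line separator.
        f.writelines(line + '\n' for line in lines if line not in existing)


fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)  # start from an empty file

append_missing_lines(['this', 'that'], demo_path)
append_missing_lines(['this', 'that'], demo_path)  # second call should change nothing

with open(demo_path) as f:
    result = f.read().splitlines()
print(result)
os.remove(demo_path)
```

Calling the function twice with the same input leaves each line in the file only once, which is what idempotent means here.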