I have existing data in a file that looks like this:
d893ecee58ee4d6f1ca56a358d2e6287
69
ae0d10efd7663c734b9ea66cec5aaa44
100
c9136ba49f4b1a8e89d6ed35cac95f7c
100
67c1431d8a06d7b2e31g86874b757eeb
0
8478b9587875f65d5afe54f541bada61
11
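What I want to do is search the file for any line whose number is greater than 30 and print the line directly above that number.
This is what I have so far: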
with open ('somefile.txt','r') as f, open('newfile.txt','w') as fnew:
    for i, line in enumerate(f):
        if line.startswith('1' or '2' or '3' or '4' or '5' or '6' or '7' or '8' or '9' or '10' or '11' or '12' or '13' or '14' or '15' or '16' or '17' or '18' or '19' or '20' or '21' or '22' or '23' or '24' or '25' or '26' or '27' or '28' or '29' or '30'):
            fnew.write(line -1)
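I know this isn't the cleanest script, but I just want something that works.
Answer 0: (score: 1)
This is the approach I would take. It assumes your input data is regular, that is, each ID line is followed by exactly one number line: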
with open('data.txt') as f:
    while True:
        try:
            data = next(f).strip()
            number = next(f).strip()
        except StopIteration:
            # EOF
            break
        number = int(number)
        if number > 30:
            # TODO: Write data to other file
            print(data)
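One way to fill in the TODO is to pair consecutive lines with zip and write the matches to a second file. This is only a sketch: the names `ids_above` and `copy_ids_above`, the `threshold` parameter, and the file paths are assumptions made for illustration, and it still assumes the file strictly alternates ID and number lines.

```python
def ids_above(lines, threshold=30):
    """Yield each ID whose following number line is greater than threshold."""
    it = iter(lines)
    # zip(it, it) consumes two consecutive lines per step: (ID, number).
    # A trailing unpaired line is silently dropped.
    for data, number in zip(it, it):
        if int(number.strip()) > threshold:
            yield data.strip()


def copy_ids_above(src_path, dst_path, threshold=30):
    """Write every matching ID from src_path to dst_path, one per line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for data in ids_above(src, threshold):
            dst.write(data + "\n")
```

Calling `copy_ids_above('data.txt', 'newfile.txt')` would then replace the `print(data)` call with a write to the second file.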
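Answer 1: (score: 1)
If you want a different approach, collect the data in a dict keyed by the number, then iterate over the dict and print the values for every key greater than 30: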
values={}
with open('test.txt','r') as f:
    sub = []
    for line in f:
        sub.append(line.strip())
        if len(sub) == 2:
            if int(sub[1]) not in values:
                values[int(sub[1])]=[sub[0]]
            else:
                values[int(sub[1])].append(sub[0])
            sub=[]
for key,value in values.items():
    if key>30:
        print(key,value)
Output:
100 ['ae0d10efd7663c734b9ea66cec5aaa44', 'c9136ba49f4b1a8e89d6ed35cac95f7c']
69 ['d893ecee58ee4d6f1ca56a358d2e6287']
Step by step:
First, collect all the IDs and numbers in a dict:
values={}
with open('test.txt','r') as f:
    sub = []
    for line in f:
        sub.append(line.strip())
        if len(sub) == 2:
            if int(sub[1]) not in values:
                values[int(sub[1])]=[sub[0]]
            else:
                values[int(sub[1])].append(sub[0])
            sub=[]
This gives:
{0: ['67c1431d8a06d7b2e31g86874b757eeb'], 11: ['8478b9587875f65d5afe54f541bada61'], 100: ['ae0d10efd7663c734b9ea66cec5aaa44', 'c9136ba49f4b1a8e89d6ed35cac95f7c'], 69: ['d893ecee58ee4d6f1ca56a358d2e6287']}
Now iterate over this dict and take the values of each key that is greater than 30.
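That second step can be sketched on its own as a dict comprehension. The `values` dict below is copied from the result shown above; the name `selected` is an assumption made for this illustration.

```python
# Dict built in the first step (copied from the result shown above).
values = {
    0: ['67c1431d8a06d7b2e31g86874b757eeb'],
    11: ['8478b9587875f65d5afe54f541bada61'],
    100: ['ae0d10efd7663c734b9ea66cec5aaa44', 'c9136ba49f4b1a8e89d6ed35cac95f7c'],
    69: ['d893ecee58ee4d6f1ca56a358d2e6287'],
}

# Keep only the entries whose key (the number) is greater than 30.
selected = {key: ids for key, ids in values.items() if key > 30}

for key, ids in selected.items():
    print(key, ids)
```

Grouping by number loses the original line order, so this fits best when the IDs only need to be reported per value.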
Answer 2: (score: 0)
You can solve this with a combination of a list comprehension and itertools.compress.
Your file format must strictly follow what you posted here.
import itertools
# line 0 is the ID, line 1 is the number. NO empty lines in between.
text = '''d893ecee58ee4d6f1ca56a358d2e6287
69
ae0d10efd7663c734b9ea66cec5aaa44
100
c9136ba49f4b1a8e89d6ed35cac95f7c
100
67c1431d8a06d7b2e31g86874b757eeb
0
8478b9587875f65d5afe54f541bada61
11
'''
lines = text.split("\n")  # list of all lines - you can get that from
                          # a file with readlines()
data = lines[0::2]        # your data is in every 2nd line starting at 0
# your numbers are in every 2nd line starting at 1
nums = [1 if (int(x) > 30) else 0 for x in lines[1::2]]
# the list comprehension creates a list of 0s and 1s - 1 if number > 30
# itertools.compress does the lifting for you
result = itertools.compress(data, nums)
print(list(result))
https://docs.python.org/3/library/itertools.html#itertools.compress
compress takes two lists and returns an iterator over the first one that contains only the elements whose counterpart in the second list is True.
Without itertools:
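The itertools-free variant is not shown here. A minimal sketch of one way to do it, assuming the same strictly alternating layout, pairs each ID with its number using zip and filters in a list comprehension. The variable names are illustrative, not the answer author's.

```python
lines = [
    'd893ecee58ee4d6f1ca56a358d2e6287', '69',
    'ae0d10efd7663c734b9ea66cec5aaa44', '100',
    'c9136ba49f4b1a8e89d6ed35cac95f7c', '100',
    '67c1431d8a06d7b2e31g86874b757eeb', '0',
    '8478b9587875f65d5afe54f541bada61', '11',
]

data = lines[0::2]     # IDs: every 2nd line starting at 0
numbers = lines[1::2]  # numbers: every 2nd line starting at 1

# Keep an ID only when the number paired with it is greater than 30.
result = [item for item, number in zip(data, numbers) if int(number) > 30]
print(result)
```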
Answer 3: (score: 0)
There are a few problems with your approach. Here is my attempt:
def line_gt_30(line):
    try:
        return int(line.strip()) > 30
    except ValueError:
        return False

prev_line = None
with open ('somefile.txt','r') as f, open('newfile.txt','w') as fnew:
    for line in f:
        if line_gt_30(line) and prev_line is not None:
            fnew.write(prev_line)
        prev_line = line
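I haven't actually tested it, but it should work. It is similar to yours, but it fixes several of your problems.
First, you don't need enumerate here. I'm not sure what you were going for with line -1, but you need to keep the previous line in a temporary variable: the file is an iterable, which means you can't use an index to grab a line the way you would with a list.
Also, startswith('1' or '2' or '3' or '4' or ...) is equivalent to startswith('1'), because or returns its first truthy operand. You should read up on what or does.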
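The behaviour of or can be checked directly. The snippet below is a small illustration, not part of the original answer, and its variable names are hypothetical. It also shows that startswith accepts a tuple of prefixes, although prefix matching still cannot express a numeric comparison.

```python
# `or` returns its first truthy operand, so the whole chain collapses to '1'.
prefix = '1' or '2' or '3'

line = '30\n'
# Equivalent to line.startswith('1'), so '30' does not match.
chained = line.startswith('1' or '2' or '3')
# A tuple of prefixes is how startswith checks several alternatives.
any_prefix = line.startswith(('1', '2', '3'))

# Prefix matching cannot express "greater than 30" ('100' also starts
# with '1'), so compare the parsed integer instead.
is_large = int('100') > 30

print(prefix, chained, any_prefix, is_large)
```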
Answer 4: (score: 0)
You can try this:
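with open('filename.txt') as f:  # close the file when done
    data = [line.strip('\n') for line in f]
# start at index 1 so data[i-1] never wraps around to the last line
final_data = [data[i-1] for i in range(1, len(data)) if data[i].isdigit() and int(data[i]) > 30]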
Output:
['d893ecee58ee4d6f1ca56a358d2e6287', 'ae0d10efd7663c734b9ea66cec5aaa44', 'c9136ba49f4b1a8e89d6ed35cac95f7c']