I want to add an identifier to consecutive identical lines in a text file. For example, I have the following input file:
Apple
Apple
Apple
Banana
Banana
Pineapple
Pineapple
Pineapple
Pineapple
I want the output to be:
Apple_number_1
Apple_number_2
Apple_number_3
Banana_number_1
Banana_number_2
Pineapple_number_1
Pineapple_number_2
Pineapple_number_3
Pineapple_number_4
I have code that prints a line if the current line and the previous line are the same:
my_file = open('/Users/Jo/Desktop/for_building.txt')
lines = my_file.readlines()

def lines_equal(curr_line, prev_line, compare_char):
    curr_line_parts = curr_line.split(' ')
    prev_line_parts = prev_line.split(' ')
    for item in zip(curr_line_parts, prev_line_parts):
        if item[0].startswith(compare_char):
            return item[0] == item[1]

results = []
prev_line = lines[0]
for line in lines[1:]:
    results.append(lines_equal(line, prev_line, 'Z'))
    prev_line = line
    print(prev_line)
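How can I add the identifier at the end of each line? I think I would use a while loop, but nesting a while loop inside the for loop gets tricky. Is there a clever way to solve this?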
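Answer 0: (score: 4)
I would use a defaultdict, which holds a count for each line, starting at zero (the default), and increment it each time the same line is encountered: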
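from collections import defaultdict

# `lines` is the list read from the file in the question
lineCounts = defaultdict(int)
for line in lines:
    line = line.strip()  # drop the trailing newline kept by readlines()
    lineCounts[line] += 1
    print('{}_number_{}'.format(line, lineCounts[line]))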
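One caveat with the defaultdict approach: it counts occurrences across the whole input, so a value that comes back later continues its old count instead of restarting at 1. That is fine for the sample file, where each fruit appears in a single run. Here is a minimal sketch of the behaviour; `number_by_total_count` and the sample list are names I made up for illustration, not anything from the question:

```python
from collections import defaultdict

def number_by_total_count(lines):
    """Label each line with how many times it has appeared so far in the whole input."""
    counts = defaultdict(int)  # missing keys start at 0
    labeled = []
    for line in lines:
        counts[line] += 1
        labeled.append('{}_number_{}'.format(line, counts[line]))
    return labeled

# Hypothetical sample: 'Apple' reappears after 'Banana'
sample = ['Apple', 'Apple', 'Banana', 'Apple']
print(number_by_total_count(sample))
# ['Apple_number_1', 'Apple_number_2', 'Banana_number_1', 'Apple_number_3']
```

If the count should restart whenever the line changes, you need an approach that only looks at consecutive runs.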
Answer 1: (score: 2)
from itertools import groupby

with open("data.txt", "r") as file:
    lines = file.read().splitlines()

# groupby starts a new group every time the line value changes
groups = [list(group) for _, group in groupby(lines)]
for group in groups:
    for index, fruit in enumerate(group, start=1):
        print(f"{fruit}_number_{index}")
Output:
Apple_number_1
Apple_number_2
Apple_number_3
Banana_number_1
Banana_number_2
Pineapple_number_1
Pineapple_number_2
Pineapple_number_3
Pineapple_number_4
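Since `groupby` starts a new group whenever the value changes, the numbering restarts for every consecutive run, even if the same value shows up again later in the file. A rough sketch of the same idea wrapped in a generator, so the intermediate list of groups is not needed; `number_consecutive` and the sample list are hypothetical names for illustration:

```python
from itertools import groupby

def number_consecutive(lines):
    """Yield each line with its 1-based position inside its consecutive run."""
    for _, run in groupby(lines):
        for index, line in enumerate(run, start=1):
            yield f"{line}_number_{index}"

# Hypothetical sample: the second run of 'Apple' starts again at 1
sample = ['Apple', 'Apple', 'Banana', 'Apple']
print(list(number_consecutive(sample)))
# ['Apple_number_1', 'Apple_number_2', 'Banana_number_1', 'Apple_number_1']
```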
Answer 2: (score: 1)
A simple iterative approach:
with open('file.txt') as f:
    cnt = 1  # initial counter value
    prev_line = None
    for line in f:
        line = line.strip()  # so a final line without a newline still compares equal
        if prev_line is not None and line != prev_line:
            cnt = 1  # reset the counter when the line changes
        print('{}_number_{}'.format(line, cnt))
        prev_line = line
        cnt += 1
Output:
Apple_number_1
Apple_number_2
Apple_number_3
Banana_number_1
Banana_number_2
Pineapple_number_1
Pineapple_number_2
Pineapple_number_3
Pineapple_number_4
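If you want the labelled lines written out rather than printed, the same previous-line comparison works on any pair of file-like objects. This is only a sketch under my own assumptions: `label_stream` is a made-up helper name, and `io.StringIO` stands in for real files so the example runs without touching the disk.

```python
import io

def label_stream(src, dst):
    """Copy lines from src to dst, appending a counter that resets when the line changes."""
    cnt = 0
    prev_line = None
    for raw in src:
        line = raw.rstrip('\n')
        # Continue the count inside a run, restart it when the line differs
        cnt = cnt + 1 if line == prev_line else 1
        dst.write('{}_number_{}\n'.format(line, cnt))
        prev_line = line

# In-memory stand-ins for open('in.txt') and open('out.txt', 'w')
src = io.StringIO("Apple\nApple\nBanana\nApple")
dst = io.StringIO()
label_stream(src, dst)
print(dst.getvalue(), end='')
```

With real files you would pass the objects returned by `open(...)` instead, ideally inside a `with` block so both get closed.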