Python: manage redundant data in a text file by incrementing the number

Time: 2014-08-31 06:25:42

Tags: python

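I'm new to Python. I have a text file with duplicate lines. I don't want to delete the duplicates; instead, whenever a line has already appeared, I need to increment the number at the end of it.

Please help! Any answer would be much appreciated! For example, given this text file: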

hello ram1
hello ram1
hello gate1
hello gate1

Expected output:

hello ram1
hello ram2
hello gate1
hello gate2

2 answers:

Answer 0 (score: 2)

Use a regular expression and collections.defaultdict:

from collections import defaultdict
import re

numbers = defaultdict(int)
with open('/path/to/textfile.txt') as f:
    for line in f:
        line = re.sub(r'\d+', '', line.rstrip())  # Remove numbers.
        numbers[line] += 1  # Increment number for the same line
        print('{}{}'.format(line, numbers[line]))
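Note that `re.sub(r'\d+', '', ...)` removes every run of digits in the line, not only the one at the end. If the lines can contain digits elsewhere, a variant that strips only the trailing number may be closer to what the question asks for. The following is a minimal sketch under that assumption; the function name `renumber` and the in-memory `sample` list are made up for illustration and are not part of the answer.

```python
import re
from collections import defaultdict


def renumber(lines):
    """Return the lines with the trailing number replaced by a running count."""
    counts = defaultdict(int)
    result = []
    for line in lines:
        # Strip only the number at the end of the line, not digits elsewhere.
        base = re.sub(r'\d+$', '', line.rstrip())
        counts[base] += 1  # How many times this text has appeared so far.
        result.append('{}{}'.format(base, counts[base]))
    return result


sample = ['hello ram1', 'hello ram1', 'hello gate1', 'hello gate1']
print(renumber(sample))
# ['hello ram1', 'hello ram2', 'hello gate1', 'hello gate2']
```

Because the function takes any iterable of strings, an open file object can be passed in place of `sample`.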

UPDATE: using slice notation and a plain dictionary.

import re

numbers = {}
with open('1.txt') as f:
    for line in f:
        row = re.split(r'(\d+)', line.strip())
        words = tuple(row[::2])  # Extract non-number parts to use it as key
        if words not in numbers:
            numbers[words] = [int(n) for n in row[1::2]]  # First occurrence: keep the numbers as they are.
        else:
            numbers[words] = [n+1 for n in numbers[words]]  # Repeated line: increase the numbers.
        row[1::2] = map(str, numbers[words])  # Assign back numbers
        print(''.join(row))
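The slicing in this version relies on how `re.split` behaves when the pattern contains a capturing group: the matched numbers are kept in the result, so text and numbers alternate in the list. A small illustration on one sample line from the question:

```python
import re

row = re.split(r'(\d+)', 'hello ram1')
print(row)        # ['hello ram', '1', '']
print(row[::2])   # ['hello ram', ''] -> non-number parts, used as the key
print(row[1::2])  # ['1']             -> number parts

# Slice assignment writes new numbers back into the odd positions.
row[1::2] = ['2']
print(''.join(row))  # hello ram2
```

Since every number in the line ends up at an odd index, this approach also handles lines that contain more than one number.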

Answer 1 (score: 0)

import re

seen = {}
# Open the file; the with block closes it once the loop is done.
with open('1.txt') as f:
    # Read through the file line by line.
    for line in f:
        # Regex matching, for example, "(hello [space])(ram or gate)(digit)".
        matched = re.match(r'(.*\s)(.*)(\d)', line)
        # Skip blank lines and lines that do not fit the pattern.
        if matched is None:
            continue
        words = matched.group(1)     # matches "hello" plus the space
        key = matched.group(2)       # matches the text before the digit
        num = int(matched.group(3))  # matches only the final digit

        if key in seen:
            # { ram or gate } already exists in seen: add 1.
            seen[key] += 1
        else:
            # { ram or gate } does not exist yet: store the initial number.
            seen[key] = num
        print('{}{}{}'.format(words, key, seen[key]))
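To see what the three groups in this answer capture, it can help to run the pattern on a couple of sample lines. The sketch below also shows that `(\d)` takes only the last digit, so a multi-digit number gets split between the key and the number. The `VARIANT` pattern is a hypothetical alternative, not something from the answer.

```python
import re

PATTERN = r'(.*\s)(.*)(\d)'  # the pattern used in the answer

print(re.match(PATTERN, 'hello ram1\n').groups())   # ('hello ', 'ram', '1')
print(re.match(PATTERN, 'hello ram12\n').groups())  # ('hello ', 'ram1', '2')

# Hypothetical variant (an assumption, not from the answer): a lazy middle
# group plus \d+ keeps a multi-digit number together.
VARIANT = r'(.*\s)(.*?)(\d+)$'
print(re.match(VARIANT, 'hello ram12').groups())    # ('hello ', 'ram', '12')
```

Note also that the key is only the last word (`ram` or `gate`), so two lines that differ only in the leading words would share one counter.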