Processing text files

Time: 2014-05-07 04:12:06

Tags: python text data-manipulation

I have two text files:

file1.txt:

a,1
b,3
c,5
d,-4

and file2.txt:

sample1
a,12 
b,10
c,4
d,6

sample2
a,5 
b,8
c,6
d,12

sample3
a,3 
b,6
c,9
d,10

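What I want to do is subtract the value given for each letter in file1.txt from the corresponding letter in every sample in file2.txt, and create multiple files, so that the output looks like this:

The first file, for sample1, is sample1.txt: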

sample1.txt
a,11 # 12-1 as 1 from file1.txt was subtracted from 12 in file2.txt
b,7 # 10-3
c,-1 # 4-5
d,10 # 6-(-4)

Then a separate file for sample2, sample2.txt:

sample2.txt
a,4 # 5-1 as 1 from file1.txt was subtracted from 5 in file2.txt
b,5 # 8-3
c,1 # 6-5
d,16 # 12-(-4)

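And the same for sample3.

I tried looping over file2.txt, but since my real file2.txt has more than 1000 samples, it takes a long time. Is there a faster, more Pythonic way to do this?

Cheers, Kate

1 Answer:

Answer 0: (score: 1)

Interesting! Let's take a look.

The design is very simple. Read the files into dictionaries, perform the operation on the dictionaries, then write out the files.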
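
Before touching any files, the core operation can be sketched on plain dictionaries. This is only an illustration: `offsets`, `sample1`, and `subtract_offsets` are hypothetical names, and the data is copied by hand from the question rather than read from disk.

```python
# Hypothetical in-memory data mirroring file1.txt and the sample1 block of file2.txt.
offsets = {"a": 1, "b": 3, "c": 5, "d": -4}
sample1 = {"a": 12, "b": 10, "c": 4, "d": 6}


def subtract_offsets(sample, offsets):
    """Return a new dict with offsets[key] subtracted from each sample value."""
    return {key: value - offsets[key] for key, value in sample.items()}


adjusted = subtract_offsets(sample1, offsets)
print(adjusted)  # {'a': 11, 'b': 7, 'c': -1, 'd': 10}
```

Building a new dictionary instead of updating in place leaves the original sample untouched, which makes the step easy to check against the expected output in the question.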

with open('file1.txt') as in_:
    mapping = {}
    for line in in_:
        key,value = line.strip().split(',')
        mapping[key] = int(value)

mapping is now {"a":1, "b":3, "c":5, "d":-4}. Now let's read our second file.

values = {}
with open('file2.txt') as in_:
    for _ in range(3):
        # This is ugly, but it's a quick hack. I'd improve it later.
        # It assumes exactly 3 samples of 4 rows each, with no blank lines between them.
        cur_dict = next(in_).strip()
        values[cur_dict] = {}
        for __ in range(4):
            key, value = next(in_).strip().split(',')
            values[cur_dict][key] = int(value)

Sheesh, that may be the ugliest code I've ever written, but values is now {"sample1": {"a":12, "b":10, "c":4, "d":6}, "sample2": ...}
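
If the number of samples or rows per sample is not known in advance (the question mentions more than 1000 samples), a single pass that detects header lines may be more practical than fixed counts. The following is a rough sketch, not the answer's original approach. It assumes that any non-blank line without a comma is a sample header and that the file starts with one; `parse_samples` is a hypothetical helper name.

```python
def parse_samples(lines):
    """Group "key,value" rows under the most recent header line.

    Assumption: a non-blank line without a comma is a sample header,
    and the first non-blank line is such a header.
    """
    samples = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if "," not in line:
            current = line  # start a new sample, e.g. "sample1"
            samples[current] = {}
        else:
            key, value = line.split(",")
            samples[current][key.strip()] = int(value)
    return samples


# Small hand-written example with a blank line and trailing whitespace.
example = ["sample1\n", "a,12 \n", "b,10\n", "\n", "sample2\n", "a,5\n", "b,8\n"]
parsed = parse_samples(example)
print(parsed)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `with open('file2.txt') as in_: values = parse_samples(in_)`.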

Now for the manipulation. This is actually easy. Let's write the files in the same loop, since that step is pretty basic.

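for dataset in values:
    for key, value in mapping.items():
        # Subtract the file1.txt value, as the question asks
        values[dataset][key] -= value
    # Open in write mode so the output file is created
    with open(dataset + ".txt", "w") as out:
        out.write(dataset + "\n")
        for key, value in values[dataset].items():
            out.write("{},{}\n".format(key, value))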
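
For reuse, the subtract-and-write step could also be wrapped in a function. This is a sketch under a few assumptions: the output files should contain only the `key,value` rows (no header line), every key in a sample also appears in `mapping`, and the output directory is chosen by the caller. `write_adjusted` and `demo_dir` are hypothetical names, and the demo data is typed in from the question.

```python
import os
import tempfile


def write_adjusted(values, mapping, out_dir):
    """Write one "<sample>.txt" per sample with mapping[key] subtracted from each value."""
    paths = []
    for dataset, rows in values.items():
        path = os.path.join(out_dir, dataset + ".txt")
        with open(path, "w") as out:
            for key, value in rows.items():
                out.write("{},{}\n".format(key, value - mapping[key]))
        paths.append(path)
    return paths


# Demo data copied from the question; written to a temporary directory.
mapping = {"a": 1, "b": 3, "c": 5, "d": -4}
values = {
    "sample1": {"a": 12, "b": 10, "c": 4, "d": 6},
    "sample2": {"a": 5, "b": 8, "c": 6, "d": 12},
}
demo_dir = tempfile.mkdtemp()
written = write_adjusted(values, mapping, demo_dir)
with open(written[0]) as f:
    first = f.read()
print(first)
```

Each sample is visited once and each file is opened once, so the work grows linearly with the number of rows in file2.txt.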