Question

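I have a file 'purchases.txt' in which the city Fort Worth appears more than once. I want to write every line for each city to an output file named after that city (e.g. Fort Worth.txt).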

purchases.txt

2012-01-01  09:00   San Jose    Men's Clothing  214.05  Amex
2012-01-01  09:00   Fort Worth  Women's Clothing    153.57  Visa
2012-01-01  09:00   San Diego   Music   66.08   Cash
2012-01-01  09:00   Pittsburgh  Pet Supplies    493.51  Discover
2012-01-01  09:00   Omaha   Children's Clothing 235.63  MasterCard
2012-01-01  09:00   Stockton    Men's Clothing  247.18  MasterCard
2012-01-01  09:00   Austin  Cameras 379.6   Visa
2012-01-01  09:00   New York    Consumer Electronics    296.8   Cash
2012-01-01  09:00   Corpus Christi  Toys    25.38   Discover
2012-01-01  09:00   Fort Worth  Toys    213.88  Visa

Main code

with open('purchases.txt') as input:
    for line in input:
        city = line.split('\t')[2]
        with open('%s.txt' % city, 'w') as output:
            output.write(line)

It almost works, but why does Fort Worth.txt contain only one line of text instead of two? How can I fix this?

My Fort Worth.txt

2012-01-01  09:00   Fort Worth  Toys    213.88  Visa

Desired Fort Worth.txt

2012-01-01  09:00   Fort Worth  Women's Clothing    153.57  Visa
2012-01-01  09:00   Fort Worth  Toys    213.88  Visa

Answer 1

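This is because you open the file with 'w' when writing. Mode 'w' truncates the file every time it is opened, so only the last line written for each city survives. Use 'a' to append to the file instead.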

Change

with open('%s.txt' %city, 'w') as output:

to

with open('%s.txt' %city, 'a') as output:
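
One caveat with append mode: if the script is run a second time, the lines are appended again and every output file ends up with duplicates. A minimal sketch of one way around this, assuming tab-separated input with the city in the third field as in the question; the function name `split_by_city` and the `out_dir` parameter are made up for illustration:

```python
import os

def split_by_city(src_path, out_dir="."):
    """Write each line of src_path to '<city>.txt' inside out_dir."""
    seen = set()  # cities whose output file was already created in this run
    with open(src_path) as src:
        for line in src:
            city = line.split("\t")[2]
            # First occurrence of a city: 'w' discards any stale file from
            # an earlier run. Later occurrences: 'a' keeps what this run wrote.
            mode = "a" if city in seen else "w"
            seen.add(city)
            with open(os.path.join(out_dir, "%s.txt" % city), mode) as out:
                out.write(line)
    return seen
```

Calling `split_by_city('purchases.txt')` repeatedly should then leave each city file with the same contents every time.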

Answer 2

You can always collect all the information in a defaultdict first:

from collections import defaultdict

# Group the lines by city in memory
outputs = defaultdict(list)
with open('purchases.txt') as input:
    for line in input:
        city = line.split('\t')[2]
        outputs[city].append(line)

# Write each city's lines to its own file, opening each file only once
for city, lines in outputs.items():
    with open('%s.txt' % city, 'w') as output:
        for line in lines:
            output.write(line)

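Now each file is opened only once, and all the lines collected for it are written in one go.

This does assume that purchases.txt is not so large that its contents would overflow main memory.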
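
If memory is a concern, a possible middle ground is to stream the input and keep one open handle per city, so nothing is buffered and each file is still opened only once. This is only a sketch: the function name `split_by_city_streaming` and the `out_dir` parameter are hypothetical, and it assumes the number of distinct cities stays below the operating system's limit on open files.

```python
import os
from contextlib import ExitStack

def split_by_city_streaming(src_path, out_dir="."):
    """Stream src_path line by line into one '<city>.txt' file per city."""
    handles = {}  # city name -> open output file
    # ExitStack closes every file registered with it when the block exits.
    with ExitStack() as stack, open(src_path) as src:
        for line in src:
            city = line.split("\t")[2]
            if city not in handles:
                path = os.path.join(out_dir, "%s.txt" % city)
                handles[city] = stack.enter_context(open(path, "w"))
            handles[city].write(line)
    return sorted(handles)
```

Because each file is opened with 'w' exactly once per run, rerunning the script overwrites the old output rather than appending to it.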
