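This is my first post here, so apologies if I get anything wrong; I'll do my best to explain. I have two files. One is a csv/txt file named text1.txt, in the following format: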
"13:02",10
"13:03",30
"13:04",15
"13:05",12
"13:06",3
...and the other is a (plain-text) file named console1.txt, which contains the following:
Rate limit: 5 at Thu Jun 12 13:02:00 PDT 2014 (Total missed: 5)
Rate limit: 10 at Thu Jun 12 13:02:01 PDT 2014 (Total missed: 15)
Rate limit: 17 at Thu Jun 12 13:02:06 PDT 2014 (Total missed: 32)
Rate limit: 10 at Thu Jun 12 13:05:50 PDT 2014 (Total missed: 42)
Rate limit: 14 at Thu Jun 12 13:05:53 PDT 2014 (Total missed: 56)
Rate limit: 84 at Thu Jun 12 13:05:21 PDT 2014 (Total missed: 140)
Rate limit: 2 at Thu Jun 12 13:06:30 PDT 2014 (Total missed: 142)
Rate limit: 5 at Thu Jun 12 13:06:34 PDT 2014 (Total missed: 147)
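I want to sum these numbers to get the total "Rate limit" per minute, and then add those totals to the corresponding rows of the first csv/txt file. The expected result would look like this: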
"13:02",42
"13:03",30
"13:04",15
"13:05",120
"13:06",10
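The numbers on the lines whose timestamp starts with 13:02 (so 5 + 10 + 17 = 32 in total) are summed and added to the "13:02" row (32 + the original 10 = 42), the ones whose timestamp starts with 13:05 are added to the "13:05" row, and so on.
I'm not sure how to approach the data processing, i.e. summing the numbers for each minute. Figuring out how to get data out of console1.txt in a form such as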
"13:02",32
"13:05",108
"13:06",7
would help; from there I can work out how to add the values to the corresponding csv rows.
Thanks!
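One possible way to get those per-minute totals is sketched below. It assumes every relevant line has the `Rate limit: N at ... HH:MM:SS ...` shape shown in the sample; the names `LINE_RE` and `sum_per_minute` are made up for illustration.

```python
import re

# Assumed line shape: "Rate limit: <n> at <weekday> <month> <day> HH:MM:SS ..."
LINE_RE = re.compile(r'Rate limit: (\d+) at .*? (\d{2}:\d{2}):\d{2} ')

def sum_per_minute(lines):
    """Return a dict mapping "HH:MM" to the summed rate-limit values."""
    totals = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like rate-limit entries
        value, minute = match.groups()
        totals[minute] = totals.get(minute, 0) + int(value)
    return totals

sample = """Rate limit: 5 at Thu Jun 12 13:02:00 PDT 2014 (Total missed: 5)
Rate limit: 10 at Thu Jun 12 13:02:01 PDT 2014 (Total missed: 15)
Rate limit: 17 at Thu Jun 12 13:02:06 PDT 2014 (Total missed: 32)
Rate limit: 10 at Thu Jun 12 13:05:50 PDT 2014 (Total missed: 42)
Rate limit: 14 at Thu Jun 12 13:05:53 PDT 2014 (Total missed: 56)
Rate limit: 84 at Thu Jun 12 13:05:21 PDT 2014 (Total missed: 140)
Rate limit: 2 at Thu Jun 12 13:06:30 PDT 2014 (Total missed: 142)
Rate limit: 5 at Thu Jun 12 13:06:34 PDT 2014 (Total missed: 147)"""

# Print one '"HH:MM",total' line per minute, sorted by time.
for minute, total in sorted(sum_per_minute(sample.splitlines()).items()):
    print('"%s",%d' % (minute, total))
```

Grouping on the captured `HH:MM` text means the seconds never need to be stripped separately, and the order of the log lines does not matter.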
Thinking the process through, these are my steps (with pseudocode in double curly braces):
Say this is console.txt:
Rate limit: 5 at Thu Jun 12 13:02:00 PDT 2014 (Total missed: 5)
Rate limit: 10 at Thu Jun 12 13:02:01 PDT 2014 (Total missed: 15)
Rate limit: 5 at Thu Jun 12 13:06:34 PDT 2014 (Total missed: 20)
1) Read the file and strip all unnecessary data
temp = open("console.txt").read()
temp = temp.replace("Rate limit: ", "")
temp = temp.replace(" at Thu Jun 12 ", ",")
{{ Remove the text between "PDT 2014 (" and ")" including both of those strings, i.e. cut off everything after the seconds marker starting at "PDT" – this I can do myself }}
{{ Cut off the seconds of each minute – *stuck here* }}
2) Format
{{ Add quotes around the times and reverse the two columns – can figure this out }}
This would give me:
"13:02",5
"13:02",10
"13:06",5
3) Save to a new file
file = open("file.txt", 'w')
file.write(temp)
file.close()
From that point on I can work out how to add the numbers to the similarly formatted csv file.
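For the step marked as stuck (cutting off the seconds), here is a minimal sketch that follows the three steps with plain string methods. It assumes the timestamp is the first whitespace-separated token containing a colon; `console_to_csv_lines` and `save_lines` are hypothetical helper names.

```python
def console_to_csv_lines(text):
    """Turn rate-limit log text into '"HH:MM",value' lines, one per log line."""
    out = []
    for line in text.splitlines():
        if not line.startswith("Rate limit: "):
            continue  # ignore anything that is not a rate-limit entry
        # "Rate limit: 5 at Thu Jun 12 13:02:00 PDT 2014 (...)" -> "5", "Thu Jun 12 ..."
        value, rest = line[len("Rate limit: "):].split(" at ", 1)
        # Assumption: the timestamp is the first token containing a colon.
        stamp = next((tok for tok in rest.split() if ":" in tok), None)
        if stamp is None:
            continue
        minute = stamp.rsplit(":", 1)[0]  # "13:02:00" -> "13:02" (drops the seconds)
        out.append('"%s",%s' % (minute, value))
    return out

def save_lines(path, lines):
    """Step 3: write the formatted lines to a new file."""
    with open(path, "w") as out_file:
        out_file.write("\n".join(lines) + "\n")

sample = """Rate limit: 5 at Thu Jun 12 13:02:00 PDT 2014 (Total missed: 5)
Rate limit: 10 at Thu Jun 12 13:02:01 PDT 2014 (Total missed: 15)
Rate limit: 5 at Thu Jun 12 13:06:34 PDT 2014 (Total missed: 20)"""

for csv_line in console_to_csv_lines(sample):
    print(csv_line)
```

Splitting on the last colon avoids hard-coding the date, which the `replace(" at Thu Jun 12 ", ",")` approach depends on. Calling `save_lines("file.txt", console_to_csv_lines(sample))` would cover step 3.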
Answer 0 (score: 1)
A simple example (without reading or writing files):
csv = '''"13:02",10
"13:03",30
"13:04",15
"13:05",12
"13:06",3'''
rates = '''Rate limit: 5 at Thu Jun 12 13:02:00 PDT 2014 (Total missed: 5)
Rate limit: 10 at Thu Jun 12 13:02:01 PDT 2014 (Total missed: 15)
Rate limit: 17 at Thu Jun 12 13:02:06 PDT 2014 (Total missed: 32)
Rate limit: 10 at Thu Jun 12 13:05:50 PDT 2014 (Total missed: 42)
Rate limit: 14 at Thu Jun 12 13:05:53 PDT 2014 (Total missed: 56)
Rate limit: 84 at Thu Jun 12 13:05:21 PDT 2014 (Total missed: 140)
Rate limit: 2 at Thu Jun 12 13:06:30 PDT 2014 (Total missed: 142)
Rate limit: 5 at Thu Jun 12 13:06:34 PDT 2014 (Total missed: 147)'''
# --- example code (Python 2) ---
import re
all_times = {}
# change csv into dict
for x in csv.splitlines():
    time, value = x.split(',')
    all_times[time] = int(value)
# print dict
print '--- old ---'
for k,v in all_times.items():
    print k, v
# add rates to dict
for x in rates.splitlines():
    # capture the rate value and the HH:MM part of the timestamp
    value, time = re.findall(r'Rate limit: (\d+) .* (\d+:\d+):', x)[0]
    all_times['"%s"' % time] += int(value)
# print dict
print '--- new ---'
for k,v in all_times.items():
    print k, v
Result (Python 2 dicts are unordered, so the rows print in arbitrary order):
--- old ---
"13:04" 15
"13:05" 12
"13:02" 10
"13:03" 30
"13:06" 3
--- new ---
"13:04" 15
"13:05" 120
"13:02" 42
"13:03" 30
"13:06" 10