I have the following two CSV files:
CSV file 1:
Range1,2018-05-17 01:50:17+0000,2018-05-17 02:00:17+0000
Range2,2018-05-17 01:50:17+0000,2018-05-17 04:00:17+0000
Range3,2018-05-17 01:50:17+0000,2018-05-17 08:00:17+0000
CSV file 2:
TimeStamp1,2018-05-17 01:59:17+0000
TimeStamp2,2018-05-17 03:59:17+0000
TimeStamp3,2018-05-17 07:59:17+0000
I want to iterate over each Range in File1 and determine which TimeStamps from File2 fall within that Range. For example, the output of my Python script would be:
Output:
TimeStamp1 falls within Range1
TimeStamp1, TimeStamp2 falls within Range2
TimeStamp1, TimeStamp2, TimeStamp3 falls within Range3
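I started writing something like this, but I'm having trouble getting the output, and with getting the if statement to run against every row of File2 for one row of File1 and then run against every row of File2 again for the next row of File1. Thanks in advance.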
import csv
with open('File1', 'rb') as range, open('File2', 'rb') as timeStamp:
    range_reader = csv.reader(range, quotechar='"')
    timeStamp_reader = csv.reader(timeStamp, quotechar='"')
    for range_row in range_reader:
        print range_row[2]
        print range_row[3]
        for timeStamp_row in timeStamp_reader:
            print timeStamp_row[2]
            if range_row[2] <= timeStamp_row[2] and range_row[3] >= timeStamp_row[2]
                print " %s falls within %s "% (timeStamp_row[1], range_row[1])
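Answer 0 (score: 1)
There are a few mistakes in your code. First, you've mixed up the indexes. Indexing starts at 0, so just subtract 1 from every index.
You also can't iterate over the file repeatedly: once the reader reaches the end of the file it stays there, so any later loop over it reads nothing. For the inner loop to run again, you need to rewind the file behind its reader. That is easily done by calling seek(0) on the file object.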
import csv
with open('File1', 'r') as ranges, open('File2', 'r') as timeStamp:
    range_reader = csv.reader(ranges, quotechar='"')
    timeStamp_reader = csv.reader(timeStamp, quotechar='"')
    rangeArray = {}
    for range_row in range_reader:
        print("%s / %s" % ( range_row[1], range_row[2])) # This looks better, and gives more info than just printing both timestamps on each line
        timeStamp.seek(0) # This will set position of cursor in timeStamp back to start, so it can iterate repeatedly
        rangeArray[range_row[0]] = []
        for timeStamp_row in timeStamp_reader:
            if range_row[1] <= timeStamp_row[1] and range_row[2] >= timeStamp_row[1]:
                rangeArray[range_row[0]].append(timeStamp_row[0])
                print (" %s falls within %s " % (timeStamp_row[0], range_row[0]))
print("\n\n")
# Desired Output:
for key in rangeArray:
    print("%s falls within %s" % (', '.join([str(x) for x in rangeArray[key]]), key))
This produces the following output:
2018-05-17 01:50:17+0000 / 2018-05-17 02:00:17+0000
TimeStamp1 falls within Range1
2018-05-17 01:50:17+0000 / 2018-05-17 04:00:17+0000
TimeStamp1 falls within Range2
TimeStamp2 falls within Range2
2018-05-17 01:50:17+0000 / 2018-05-17 08:00:17+0000
TimeStamp1 falls within Range3
TimeStamp2 falls within Range3
TimeStamp3 falls within Range3
TimeStamp1 falls within Range1
TimeStamp1, TimeStamp2 falls within Range2
TimeStamp1, TimeStamp2, TimeStamp3 falls within Range3
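The exhaustion behaviour described in this answer can be seen in isolation. The following is a minimal Python 3 sketch, not part of the original answer: it assumes an in-memory io.StringIO object standing in for File2, and the names first_pass, second_pass and third_pass are purely illustrative.

```python
import csv
import io

# Stand-in for File2 (assumption for illustration): io.StringIO behaves
# like an open text file, so no file on disk is needed.
timestamps = io.StringIO(
    "TimeStamp1,2018-05-17 01:59:17+0000\n"
    "TimeStamp2,2018-05-17 03:59:17+0000\n"
)
reader = csv.reader(timestamps)

first_pass = [row[0] for row in reader]   # consumes every row
second_pass = [row[0] for row in reader]  # the file is at its end, so nothing is read

timestamps.seek(0)                        # rewind the underlying file object
third_pass = [row[0] for row in reader]   # the same reader yields the rows again

print(first_pass)
print(second_pass)
print(third_pass)
```

The second pass is empty, which is why the inner loop in the question only produced results for the first range.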
Answer 1 (score: 1)
import csv
with open('File1.csv', 'rb') as ranger, open('File2.csv', 'rb') as timeStamp:
    range_reader = [x for x in csv.reader(ranger, quotechar='"')]
    timeStamp_reader = [x for x in csv.reader(timeStamp, quotechar='"')]
    for range_row in range_reader:
        temp = []
        for timeStamp_row in timeStamp_reader:
            if range_row[1] <= timeStamp_row[1] and range_row[2] >= timeStamp_row[1]:
                temp.append(timeStamp_row[0])
        if temp:
            print " %s falls within %s "% (','.join(temp), range_row[0])
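Lukasas's answer is good, but if your dataset is large, seeking back to the start of the file on every pass of the for loop may not be a good idea. Just read both files into lists once at the beginning. Also, to get the output in the form you want, you need a list that is reset at the start of each outer-loop iteration and collects the matches for that range.
Output: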
TimeStamp1 falls within Range1
TimeStamp1,TimeStamp2 falls within Range2
TimeStamp1,TimeStamp2,TimeStamp3 falls within Range3
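For readers on Python 3, the same idea (read both files into lists once, then collect the matches per range) might look roughly like the sketch below. It is not the answerer's code: the function name timestamps_in_ranges is hypothetical, and io.StringIO objects stand in for the two CSV files so the example is self-contained.

```python
import csv
import io


def timestamps_in_ranges(range_file, timestamp_file):
    """Return a list of (range_name, [timestamp_names]) pairs."""
    # Read each file exactly once; the lists can be looped over repeatedly.
    ranges = list(csv.reader(range_file))
    stamps = list(csv.reader(timestamp_file))
    result = []
    for name, start, end in ranges:
        # Plain string comparison is only safe here because every value
        # uses the same fixed-width format and the same +0000 offset.
        matches = [ts_name for ts_name, ts in stamps if start <= ts <= end]
        result.append((name, matches))
    return result


# In-memory stand-ins for File1 and File2 (assumption for illustration).
file1 = io.StringIO(
    "Range1,2018-05-17 01:50:17+0000,2018-05-17 02:00:17+0000\n"
    "Range2,2018-05-17 01:50:17+0000,2018-05-17 04:00:17+0000\n"
    "Range3,2018-05-17 01:50:17+0000,2018-05-17 08:00:17+0000\n"
)
file2 = io.StringIO(
    "TimeStamp1,2018-05-17 01:59:17+0000\n"
    "TimeStamp2,2018-05-17 03:59:17+0000\n"
    "TimeStamp3,2018-05-17 07:59:17+0000\n"
)

grouped = timestamps_in_ranges(file1, file2)
for name, matches in grouped:
    if matches:
        print("%s falls within %s" % (", ".join(matches), name))
```

Returning the grouped data instead of printing inside the loop keeps the matching logic separate from the output format.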
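Answer 2 (score: 1)
As you'll see, I made a lot of changes, starting with the fact that I wrote the code in Python 3. Are you using Python 2?
Anyway, I'm happy to answer any questions. I think this is mostly what you wanted: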
import csv
import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

with open('File1', 'r') as ranges, open('File2', 'r') as timeStamp:
    range_rows = list(csv.reader(ranges, quotechar='"'))
    timeStamp_rows = list(csv.reader(timeStamp, quotechar='"'))

# Parse the start and end of every range; [:-5] strips the trailing "+0000" offset
range_list = []
for row in range_rows:
    time = [row[0], datetime.datetime.strptime(row[1][:-5], TIME_FORMAT), datetime.datetime.strptime(row[2][:-5], TIME_FORMAT)]
    range_list.append(time)

# Parse every timestamp the same way
timeStamp_list = []
for row in timeStamp_rows:
    time = [row[0], datetime.datetime.strptime(row[1][:-5], TIME_FORMAT)]
    timeStamp_list.append(time)

# Compare each timestamp against each range
for i in range_list:
    for e in timeStamp_list:
        if i[1] <= e[1] and i[2] >= e[1]:
            print(" %s falls within %s "% (e[0], i[0]))
Output:
TimeStamp1 falls within Range1
TimeStamp1 falls within Range2
TimeStamp2 falls within Range2
TimeStamp1 falls within Range3
TimeStamp2 falls within Range3
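Slicing off the last five characters discards the UTC offset, which is harmless while every value ends in +0000. As a hedged alternative, the sketch below keeps the offset by parsing it with the %z directive of strptime, so values written with different offsets still compare by actual instant. The +0200 timestamp is invented for illustration and does not appear in the question's data; the parse helper is likewise hypothetical.

```python
from datetime import datetime, timedelta

# %z parses a UTC offset such as +0000 or +0200 into an aware datetime.
FORMAT = "%Y-%m-%d %H:%M:%S%z"


def parse(text):
    """Parse a timestamp string, keeping its UTC offset."""
    return datetime.strptime(text, FORMAT)


start_text = "2018-05-17 01:50:17+0000"
end_text = "2018-05-17 02:00:17+0000"
# Invented value: the same instant as 01:59:17+0000, written with a +0200 offset.
stamp_text = "2018-05-17 03:59:17+0200"

start, end, stamp = parse(start_text), parse(end_text), parse(stamp_text)

# Aware datetimes compare by instant, so the offset is taken into account.
in_range = start <= stamp <= end

# Comparing the raw strings ignores the offset and gives a different answer.
string_in_range = start_text <= stamp_text <= end_text

print(in_range, string_in_range)
```

For the data in the question all three approaches (string comparison, naive datetimes, aware datetimes) agree; the difference only matters if the offsets vary.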