Question

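I have a problem that none of the CSV posts I have looked at solves. I have a CSV file with thousands of rows whose first column holds a date and time stamp. A new timestamp appears roughly every 2 seconds.

Note 1: A very important point (and the cause of my problem) is that each date and time appears several times.

Note 2: The dates are already sorted.

The first rows of my file: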

30/07/2018 22:52:52,4,50,26

30/07/2018 22:52:52,7,49,26

30/07/2018 22:52:52,6,50,26

30/07/2018 22:52:52,5,51,26

30/07/2018 22:52:52,2,50,26

30/07/2018 22:52:52,3,49,26

30/07/2018 22:52:55,4,50,26

30/07/2018 22:52:55,7,49,26

30/07/2018 22:52:55,6,50,26

30/07/2018 22:52:55,5,51,26

30/07/2018 22:52:55,2,50,26

30/07/2018 22:52:55,3,49,26

30/07/2018 22:52:57,4,50,26

30/07/2018 22:52:57,7,49,26

30/07/2018 22:52:57,6,50,26

30/07/2018 22:52:57,5,51,26

30/07/2018 22:52:57,2,50,26

30/07/2018 22:52:57,3,49,26

30/07/2018 22:52:59,4,50,26

30/07/2018 22:52:59,7,49,26

30/07/2018 22:52:59,6,50,26

30/07/2018 22:52:59,5,51,26

30/07/2018 22:52:59,2,50,26

30/07/2018 22:52:59,3,49,26

30/07/2018 22:53:02,4,50,26

30/07/2018 22:53:02,7,49,26

30/07/2018 22:53:02,6,50,26

30/07/2018 22:53:02,5,51,26

30/07/2018 22:53:02,2,50,26

30/07/2018 22:53:02,3,49,26

30/07/2018 22:53:04,4,50,26

30/07/2018 22:53:04,7,49,26

30/07/2018 22:53:04,6,50,26

30/07/2018 22:53:04,5,51,26

30/07/2018 22:53:04,2,50,26

30/07/2018 22:53:04,3,49,26

30/07/2018 22:53:07,4,50,26

30/07/2018 22:53:07,7,49,26

30/07/2018 22:53:07,6,50,26

30/07/2018 22:53:07,5,51,26

30/07/2018 22:53:07,2,50,26

30/07/2018 22:53:07,3,49,26

30/07/2018 22:53:09,4,50,26

30/07/2018 22:53:09,7,49,26

30/07/2018 22:53:09,6,50,26

30/07/2018 22:53:09,5,50,26

30/07/2018 22:53:09,2,50,26

30/07/2018 22:53:09,3,49,26

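I need to take an input from the user, for example 5, then pick the last timestamp in every 5-second interval and build a dictionary from columns 2 and 3 of its rows. So for an input of 5, the following rows must be kept: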

30/07/2018 22:52:59,4,50,26

30/07/2018 22:52:59,7,49,26

30/07/2018 22:52:59,6,50,26

30/07/2018 22:52:59,5,51,26

30/07/2018 22:52:59,2,50,26

30/07/2018 22:52:59,3,49,26

30/07/2018 22:53:09,7,49,26

30/07/2018 22:53:09,6,50,26

30/07/2018 22:53:09,5,50,26

30/07/2018 22:53:09,2,50,26

30/07/2018 22:53:09,3,49,26

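The dictionary should look like this:

{timestamp: {second column: third column}}

{30/07/2018 22:52:59: {4: 50, 7: 49, 6: 50, 5: 51, 2: 50, 3: 49}}

What I have so far only uses each timestamp once, which means I get this dictionary:

{30/07/2018 22:52:59: {4: 50}, 30/07/2018 22:53:09: {4: 50}}

Here is my code: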

import csv
import os
from datetime import datetime as dt

with open(os.path.join(inputPath, filename), "r") as f:
    dictTemp = {}
    r = csv.reader(f)
    # Get the first date from the node file
    minTime = dt.strptime(next(r)[0], "%d/%m/%Y %H:%M:%S")
    # Loop through the remaining rows
    for line in r:
        currentTime = dt.strptime(line[0], "%d/%m/%Y %H:%M:%S")
        if (currentTime - minTime).total_seconds() > 5:
            minTime = currentTime
            scenariotimeStamps.append(currentTime.strftime("%Y%m%d%H%M%S"))
            dictTemp[line[1]] = line[2]
            dicComplete[str(currentTime.strftime("%Y%m%d%H%M%S"))] = dictTemp

Answer 1

With:

dictTemp[line[1]] = line[2]
dicComplete[str(currentTime.strftime("%Y%m%d%H%M%S"))] = dictTemp

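you overwrite the dictionary stored at dicComplete[str(currentTime.strftime("%Y%m%d%H%M%S"))] on every iteration, and every key ends up pointing at the same dictTemp object. Change the two lines to: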

dicComplete.setdefault(str(currentTime.strftime("%Y%m%d%H%M%S")), {})[line[1]] = line[2]
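As a minimal illustration of what setdefault does in that line, here is a sketch with made-up sample rows (the tuples below are hypothetical values, not read from your file):

```python
# Hypothetical sample rows: (timestamp key, column 2, column 3)
rows = [
    ("20180730225259", "4", "50"),
    ("20180730225259", "7", "49"),
    ("20180730225309", "4", "50"),
]

dicComplete = {}
for key, col2, col3 in rows:
    # setdefault returns the inner dict already stored under key, or
    # stores and returns a new empty one, so each timestamp gets its own dict
    dicComplete.setdefault(key, {})[col2] = col3

print(dicComplete)
# {'20180730225259': {'4': '50', '7': '49'}, '20180730225309': {'4': '50'}}
```

Because a fresh {} is created per key, rows with the same timestamp accumulate in one inner dictionary while different timestamps stay separate.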

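Also, since you want to collect all rows that share a timestamp once you have confirmed that it is more than 5 seconds after the previous one, instead of: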

if((currentTime-minTime).total_seconds() > 5):

you should also let the row through when currentTime equals minTime:

if currentTime == minTime or (currentTime-minTime).total_seconds() > 5:
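Putting both changes together, a sketch of the whole loop could look like the following. The function name collect_by_interval, the variables reference and kept, and the sample rows are my own assumptions for illustration. One deliberate difference from the condition above: because minTime starts out as the first row's time, a plain currentTime == minTime check would also let the remaining rows of the very first timestamp through, so this sketch tracks the timestamp currently being collected separately. That is an assumption about the behavior you want.

```python
import csv
from datetime import datetime as dt

FMT = "%d/%m/%Y %H:%M:%S"

def collect_by_interval(lines, interval):
    """Map each kept timestamp to {column 2: column 3} for all of its rows.

    A timestamp is kept when it is more than `interval` seconds after the
    previously kept one (or after the first row for the first selection).
    """
    result = {}
    reference = None  # time the interval is measured from
    kept = None       # timestamp whose rows are currently being collected
    for line in csv.reader(lines):
        if not line:
            continue  # skip blank lines
        current = dt.strptime(line[0], FMT)
        if reference is None:
            reference = current  # the first row only sets the starting point
        if current != kept:
            if (current - reference).total_seconds() <= interval:
                continue  # too close to the last kept timestamp
            kept = reference = current
        result.setdefault(current.strftime("%Y%m%d%H%M%S"), {})[line[1]] = line[2]
    return result

# Shortened, hypothetical sample in the same layout as the question's file
sample = """
30/07/2018 22:52:52,4,50,26
30/07/2018 22:52:52,7,49,26
30/07/2018 22:52:55,4,50,26
30/07/2018 22:52:57,4,50,26
30/07/2018 22:52:59,4,50,26
30/07/2018 22:52:59,7,49,26
30/07/2018 22:53:02,4,50,26
30/07/2018 22:53:07,4,50,26
"""

result = collect_by_interval(sample.strip().splitlines(), 5)
print(result)
# {'20180730225259': {'4': '50', '7': '49'}, '20180730225307': {'4': '50'}}
```

csv.reader accepts any iterable of lines, so you can pass the open file object instead of the sample list. The keys and values stay strings, as in your original code; convert them with int() if you need numbers.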

Loop through a CSV file and collect data based on a condition
