I have a directory of files (dir1) that I want to loop over. The file names use the following format:
20170605.000000
20170605.001000
20170605.002000
...
20170610.235000
I also have another directory (dir2) whose files are timestamped at very irregular intervals. Its file names have the format:
20170604.235710
20170605.000427
20170605.093241
20170605.172221
...
20170611.000426
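I want to loop over the files in dir1 and, for each one, build a list of the files in dir2 whose timestamps fall within the hour leading up to the time in the dir1 file name. For example:
20170605.000000: get the list of all files in dir2 from 20170604.230000 to 20170605.000000
20170605.001000: get the list of all files in dir2 from 20170604.231000 to 20170605.001000
20170605.002000: get the list of all files in dir2 from 20170604.232000 to 20170605.002000
....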
20170610.235000: get the list of all files in dir2 from 20170610.225000 to 20170610.235000
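I have split the start and end of each range into yyyy, mm, dd, hh, mm, and ss, but the code gets ugly fast. I know datetime might help, but timedelta seems to work only in days rather than seconds. Is there a simple way that I don't know about or haven't thought of?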
Answer 0 (score: 1)
You could try something like this:
from datetime import datetime
from datetime import timedelta
dir1_file_list = ['20170605.000000', '20170605.001000', '20170605.002000']
dir2_file_list = ['20170604.235710', '20170605.000427', '20170605.093241', '20170605.172221']
# Parse the file names into datetime objects so they can be subtracted.
dir1_file_list = [datetime.strptime(f, '%Y%m%d.%H%M%S') for f in dir1_file_list]
dir2_file_list = [datetime.strptime(f, '%Y%m%d.%H%M%S') for f in dir2_file_list]
associations = dict()
for dir1_file in dir1_file_list:
    associations[str(dir1_file)] = []
    for dir2_file in dir2_file_list:
        # Keep dir2 files stamped at most one hour (3600 s) before the dir1 file.
        if 0 <= (dir1_file - dir2_file).total_seconds() <= 3600:
            associations[str(dir1_file)].append(str(dir2_file))
Then print the dictionary associations to see the result.
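If dir2 holds many files, the nested loop above compares every pair. A possible variation, sketched here under the assumption that all names follow the fixed-width %Y%m%d.%H%M%S layout, sorts dir2 once and uses the standard bisect module to find each window. It also keeps the original file names as keys and values instead of str(datetime). The function name files_in_previous_window and the constant FMT are made up for this illustration.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

FMT = '%Y%m%d.%H%M%S'  # assumed file name layout in both directories


def files_in_previous_window(dir1_names, dir2_names, window=timedelta(hours=1)):
    """Map each dir1 name to the dir2 names stamped within `window` before it.

    Both ends of the window are inclusive, like the 0 <= diff <= 3600 test.
    """
    # Fixed-width, zero-padded names sort chronologically as plain strings.
    dir2_sorted = sorted(dir2_names)
    dir2_times = [datetime.strptime(name, FMT) for name in dir2_sorted]

    result = {}
    for name in dir1_names:
        end = datetime.strptime(name, FMT)
        start = end - window  # timedelta accepts hours, minutes, seconds, ...
        lo = bisect_left(dir2_times, start)  # first index with time >= start
        hi = bisect_right(dir2_times, end)   # first index with time > end
        result[name] = dir2_sorted[lo:hi]
    return result


dir1 = ['20170605.000000', '20170605.001000', '20170605.002000']
dir2 = ['20170604.235710', '20170605.000427', '20170605.093241', '20170605.172221']

for name, matches in files_in_previous_window(dir1, dir2).items():
    print(name, matches)
# 20170605.000000 ['20170604.235710']
# 20170605.001000 ['20170604.235710', '20170605.000427']
# 20170605.002000 ['20170604.235710', '20170605.000427']
```

With real directories, the two lists could come from something like os.listdir('dir1') and os.listdir('dir2'), assuming those directories contain only files named this way.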
Answer 1 (score: 0)
IIUC, you can use datetime and pd.to_datetime() to convert your lists to pd.Series() objects, then simply use a dictionary comprehension to return the desired output:
import pandas as pd
from datetime import datetime, timedelta
dir1 = [
'20170605.000000',
'20170605.001000',
'20170605.002000',
]
dir2 = [
'20170604.235710',
'20170605.000427',
'20170605.093241',
'20170605.172221',
]
dir1 = pd.to_datetime(pd.Series(dir1), format='%Y%m%d.%H%M%S')
dir2 = pd.to_datetime(pd.Series(dir2), format='%Y%m%d.%H%M%S')
retrieved = {i: [j for j in dir2 if i-timedelta(hours=1) < j < i] for i in dir1}
This returns:
{
Timestamp('2017-06-05 00:00:00'): [Timestamp('2017-06-04 23:57:10')],
Timestamp('2017-06-05 00:10:00'): [Timestamp('2017-06-04 23:57:10'), Timestamp('2017-06-05 00:04:27')],
Timestamp('2017-06-05 00:20:00'): [Timestamp('2017-06-04 23:57:10'), Timestamp('2017-06-05 00:04:27')]
}
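The keys and values here are timestamps rather than file names. If the names themselves are needed (for example, to open the files), one option is to format each timestamp back with strftime and the same format string. The sketch below uses plain datetime objects as a stand-in for the result above; since pd.Timestamp subclasses datetime.datetime, the same call should work on the pandas result too. The helper name to_file_names and the constant FMT are made up for this illustration.

```python
from datetime import datetime

FMT = '%Y%m%d.%H%M%S'  # same layout used when parsing the names


def to_file_names(matches):
    """Convert a {timestamp: [timestamps]} mapping back to file-name strings."""
    return {
        key.strftime(FMT): [ts.strftime(FMT) for ts in values]
        for key, values in matches.items()
    }


# Stand-in for the first entry of the result, using plain datetime objects.
example = {datetime(2017, 6, 5, 0, 0, 0): [datetime(2017, 6, 4, 23, 57, 10)]}
print(to_file_names(example))
# {'20170605.000000': ['20170604.235710']}
```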