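Filtering dates in datetime objects by month

Asked: 2017-10-09 20:07:32

Tags: python dictionary while-loop

I have a dictionary whose keys are datetime.datetime objects and whose values are lists of tweets. It looks like this: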

{datetime.datetime(2017, 9, 30, 19, 55, 20) : ['this is some tweet text'],
 datetime.datetime(2017, 9, 30, 19, 55, 20) : ['this is another tweet']...

I'm trying to get the number of tweets sent in each month of the year. So far I have...

startDate = 10
endDate= 11
start = True
while start:

    for k,v in tweetDict.items():
        endDate-=1
        startDate-=1

        datetimeStart = datetime(2017, startDate, 1)
        datetimeEnd = datetime(2017,endDate, 1)

        print(datetimeStart, datetimeEnd)

        if datetimeStart < k < datetimeEnd:
            print(v)
        if endDate == 2:
            start = False
            break

This only prints (I'm aware of the print statement)...

2017-08-01 00:00:00 2017-09-01 00:00:00
2017-07-01 00:00:00 2017-08-01 00:00:00
2017-06-01 00:00:00 2017-07-01 00:00:00
2017-05-01 00:00:00 2017-06-01 00:00:00
2017-04-01 00:00:00 2017-05-01 00:00:00
2017-03-01 00:00:00 2017-04-01 00:00:00
2017-02-01 00:00:00 2017-03-01 00:00:00
2017-01-01 00:00:00 2017-02-01 00:00:00

and not the actual tweets themselves. I was expecting something like...

2017-08-01 00:00:00 2017-09-01 00:00:00
['heres a tweet']
['theres a tweet']
2017-07-01 00:00:00 2017-08-01 00:00:00
['there only 1 tweet for this month']....

I'm a bit stuck. How can I achieve this?

2 Answers:

Answer 0 (score: 1)

You can group by month instead of trying to subtract and compare different months:

>>> import datetime
>>> from itertools import groupby
>>> d = {datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
...      datetime.datetime(2017, 9, 30, 20, 55, 20): ['this is another tweet'],
...      datetime.datetime(2017, 10, 30, 19, 55, 20): ['this is an october tweet'],}
>>> # groupby only merges adjacent items, so sort by datetime first
>>> for month, group in groupby(sorted(d.items()), lambda item: item[0].month):
...     print(month)
...     for dt, tweet in group:
...         print(dt, tweet)
...
9
2017-09-30 19:55:20 ['this is some tweet text']
2017-09-30 20:55:20 ['this is another tweet']
10
2017-10-30 19:55:20 ['this is an october tweet']
>>>

Of course, you can print it in a nicer format and so on (the inner join is needed because each value appears to be a list):

>>> # sorted() keeps items of the same month adjacent, which groupby requires
>>> for month, group in groupby(sorted(d.items()), lambda item: item[0].month):
...     tweets = list(group)
...     print("%d tweet(s) in month %d" % (len(tweets), month))
...     print('\n'.join(','.join(tweet) for (dt, tweet) in tweets))
...
2 tweet(s) in month 9
this is some tweet text
this is another tweet
1 tweet(s) in month 10
this is an october tweet
>>>
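
If only the per-month totals are needed, a counter keyed on (year, month) may be simpler than grouping. The following is a minimal sketch, not part of the original answer: the names tweet_dict and count_tweets_per_month are made up for illustration, and the sample data is assumed to have the same shape as the dictionary in the question.

```python
import datetime
from collections import Counter

# Hypothetical sample data shaped like the dictionary in the question.
tweet_dict = {
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime.datetime(2017, 9, 30, 20, 55, 20): ['this is another tweet'],
    datetime.datetime(2017, 10, 30, 19, 55, 20): ['this is an october tweet'],
}


def count_tweets_per_month(tweets_by_time):
    """Return a Counter mapping (year, month) to the number of tweets."""
    counts = Counter()
    for when, tweets in tweets_by_time.items():
        # Each value is a list, so add its length rather than 1.
        counts[(when.year, when.month)] += len(tweets)
    return counts


monthly_counts = count_tweets_per_month(tweet_dict)
for (year, month), total in sorted(monthly_counts.items()):
    print("%d-%02d: %d tweet(s)" % (year, month, total))
```

Keying on (year, month) rather than month alone keeps, say, September 2016 separate from September 2017, and no sorting is needed before counting.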

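Answer 1 (score: 0)

First things first: you are putting two items into your dict with exactly the same key. The second will overwrite the first. For the rest of this answer, I'll assume the second item in your example is slightly different (seconds=21).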
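
The overwrite is easy to confirm. The sketch below is illustrative only: the names same_key, tweets_by_time and add_tweet are hypothetical, and appending to a list per timestamp is just one possible way to keep tweets that share a timestamp.

```python
import datetime
from collections import defaultdict

# Two entries with an identical key: only the last value survives.
same_key = {
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is another tweet'],
}
print(len(same_key))             # 1
print(list(same_key.values()))   # [['this is another tweet']]

# One possible way to keep both: append to a list per timestamp.
tweets_by_time = defaultdict(list)


def add_tweet(when, text):
    """Hypothetical helper: store text under its timestamp without overwriting."""
    tweets_by_time[when].append(text)


stamp = datetime.datetime(2017, 9, 30, 19, 55, 20)
add_tweet(stamp, 'this is some tweet text')
add_tweet(stamp, 'this is another tweet')
print(tweets_by_time[stamp])
```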

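Your code doesn't work as expected because you decrement endDate and startDate inside the for loop. As a result, each date range is checked against only a single item in the dictionary; if that item happens to fall in that month, it gets printed, and if not, nothing is printed. To illustrate, here is what you get if you change the print to print(datetimeStart, datetimeEnd, k, v):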

2017-09-01 00:00:00 2017-10-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
['this is some tweet text']
2017-08-01 00:00:00 2017-09-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-07-01 00:00:00 2017-08-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
2017-06-01 00:00:00 2017-07-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-05-01 00:00:00 2017-06-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
2017-04-01 00:00:00 2017-05-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-03-01 00:00:00 2017-04-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
2017-02-01 00:00:00 2017-03-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-01-01 00:00:00 2017-02-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']

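The fix that requires the fewest changes to your existing code is simply to move the decrements in front of the for loop, and the if endDate... block out of it, up to the level of the while loop: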

while start:
    endDate-=1
    startDate-=1
    for k,v in tweetDict.items():
        datetimeStart = datetime(2017, startDate, 1)
        datetimeEnd = datetime(2017,endDate, 1)
        print(datetimeStart, datetimeEnd, k, v)
        if datetimeStart < k < datetimeEnd:
            print(v)
    if endDate == 2:
        start = False
        break

Of course, at that point you might as well get rid of the if endDate... block and just use while endDate > 2:
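
That simplification might look like the sketch below. It rests on some assumptions: the sample tweetDict is made up (with seconds=21 on the second key, as assumed earlier), the printed list exists only so the result can be inspected afterwards, and the month boundaries are built once per pass of the while loop rather than once per item.

```python
from datetime import datetime

# Hypothetical sample data; the second key uses seconds=21 as assumed earlier.
tweetDict = {
    datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime(2017, 9, 30, 19, 55, 21): ['this is another tweet'],
}

startDate = 10
endDate = 11
printed = []  # collected only so the result can be checked afterwards

# The loop condition replaces the start flag and the if/break block.
while endDate > 2:
    endDate -= 1
    startDate -= 1
    # The boundaries change once per month, so build them outside the for loop.
    datetimeStart = datetime(2017, startDate, 1)
    datetimeEnd = datetime(2017, endDate, 1)
    print(datetimeStart, datetimeEnd)
    for k, v in tweetDict.items():
        if datetimeStart < k < datetimeEnd:
            print(v)
            printed.append(v)
```

As in the original code, the first window checked is September to October, because the decrement runs before the first comparison. The strict < comparisons also exclude a tweet stamped exactly at midnight on the first of a month.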