I have a dictionary whose keys are datetime.datetime objects and whose values are lists of tweets. It looks like this:
{datetime.datetime(2017, 9, 30, 19, 55, 20) : ['this is some tweet text'],
datetime.datetime(2017, 9, 30, 19, 55, 20) : ['this is another tweet']...
I am trying to get the number of tweets sent in each month of the year. So far I have...
startDate = 10
endDate= 11
start = True
while start:
    for k,v in tweetDict.items():
        endDate-=1
        startDate-=1
        datetimeStart = datetime(2017, startDate, 1)
        datetimeEnd = datetime(2017,endDate, 1)
        print(datetimeStart, datetimeEnd)
        if datetimeStart < k < datetimeEnd:
            print(v)
        if endDate == 2:
            start = False
            break
This only prints (I am aware of the print statement)...
2017-08-01 00:00:00 2017-09-01 00:00:00
2017-07-01 00:00:00 2017-08-01 00:00:00
2017-06-01 00:00:00 2017-07-01 00:00:00
2017-05-01 00:00:00 2017-06-01 00:00:00
2017-04-01 00:00:00 2017-05-01 00:00:00
2017-03-01 00:00:00 2017-04-01 00:00:00
2017-02-01 00:00:00 2017-03-01 00:00:00
2017-01-01 00:00:00 2017-02-01 00:00:00
rather than the actual tweets themselves. I was expecting something like...
2017-08-01 00:00:00 2017-09-01 00:00:00
['heres a tweet']
['theres a tweet']
2017-07-01 00:00:00 2017-08-01 00:00:00
['there only 1 tweet for this month']....
I'm a bit stuck. How can I achieve this?
Answer 0 (score: 1)
You can group by month instead of trying to subtract and compare different months:
>>> import datetime
>>> from itertools import groupby
>>> d = {datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
...      datetime.datetime(2017, 9, 30, 20, 55, 20): ['this is another tweet'],
...      datetime.datetime(2017, 10, 30, 19, 55, 20): ['this is an october tweet']}
>>> # groupby only merges adjacent items, so sort by key (the datetime) first
>>> for month, group in groupby(sorted(d.items()), key=lambda kv: kv[0].month):
...     print(month)
...     for dt, tweet in group:
...         print(dt, tweet)
...
9
2017-09-30 19:55:20 ['this is some tweet text']
2017-09-30 20:55:20 ['this is another tweet']
10
2017-10-30 19:55:20 ['this is an october tweet']
>>>
Of course, you can print it in a nicer format and so on (the inner join is needed because each value appears to be a list):
>>> # sorted() keeps each month's items adjacent, which groupby requires
>>> for month, group in groupby(sorted(d.items()), key=lambda kv: kv[0].month):
...     tweets = list(group)
...     print("%d tweet(s) in month %d" % (len(tweets), month))
...     print('\n'.join(','.join(tweet) for (dt, tweet) in tweets))
...
2 tweet(s) in month 9
this is some tweet text
this is another tweet
1 tweet(s) in month 10
this is an october tweet
>>>
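If only the per-month counts are needed, a dictionary-based tally avoids the sort that groupby requires. The following is a minimal sketch and not part of the original answer: the helper name count_tweets_by_month is made up for illustration, and it assumes every key is a datetime.datetime and every value is a list of tweet strings. It keys on (year, month) so that data spanning several years is not merged into one bucket.

```python
import datetime
from collections import Counter


def count_tweets_by_month(tweet_dict):
    """Return a Counter mapping (year, month) to the number of tweets.

    Hypothetical helper: assumes keys are datetime.datetime objects and
    values are lists of tweet strings, as in the question.
    """
    counts = Counter()
    for dt, tweets in tweet_dict.items():
        # Each value is a list, so add its length rather than 1.
        counts[(dt.year, dt.month)] += len(tweets)
    return counts


d = {
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime.datetime(2017, 9, 30, 20, 55, 20): ['this is another tweet'],
    datetime.datetime(2017, 10, 30, 19, 55, 20): ['this is an october tweet'],
}

for (year, month), n in sorted(count_tweets_by_month(d).items()):
    print("%d-%02d: %d tweet(s)" % (year, month, n))
```

Because nothing here depends on item order, this works the same way whether or not the dictionary happens to be sorted by date.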
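Answer 1 (score: 0)
First things first: you are putting two items into your dict with exactly the same key, so the second will overwrite the first. For the rest of this answer, I'll assume the second item in your example is slightly different (seconds=21).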
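The overwrite is easy to confirm. The snippet below is a small illustrative sketch using the two tweets from the question; the names tweets and merged are made up for the example. The second half shows one possible way to keep both tweets under a shared timestamp by appending to the list instead of replacing it.

```python
import datetime

key = datetime.datetime(2017, 9, 30, 19, 55, 20)

# Two entries written with the same key: the later one replaces the earlier.
tweets = {
    key: ['this is some tweet text'],
    key: ['this is another tweet'],
}
print(len(tweets))   # 1
print(tweets[key])   # ['this is another tweet']

# One way to keep both: append to the list stored under the key.
merged = {}
for dt, text in [(key, 'this is some tweet text'),
                 (key, 'this is another tweet')]:
    merged.setdefault(dt, []).append(text)
print(merged[key])   # ['this is some tweet text', 'this is another tweet']
```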
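The reason your code doesn't work as expected is that you are decrementing endDate and startDate inside the for loop. As a result, each month window is checked against only a single item in the dictionary: if that item happens to fall in that month, it gets printed; if not, nothing is printed. To illustrate, here is what you get if you change your print to print(datetimeStart, datetimeEnd, k, v):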
2017-09-01 00:00:00 2017-10-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
['this is some tweet text']
2017-08-01 00:00:00 2017-09-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-07-01 00:00:00 2017-08-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
2017-06-01 00:00:00 2017-07-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-05-01 00:00:00 2017-06-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
2017-04-01 00:00:00 2017-05-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-03-01 00:00:00 2017-04-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
2017-02-01 00:00:00 2017-03-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet']
2017-01-01 00:00:00 2017-02-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text']
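A fix that requires minimal changes to your existing code is to move the decrements to before the for loop and the if endDate... block to after it, so that both sit at the level of the while loop: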
while start:
    endDate-=1
    startDate-=1
    for k,v in tweetDict.items():
        datetimeStart = datetime(2017, startDate, 1)
        datetimeEnd = datetime(2017,endDate, 1)
        print(datetimeStart, datetimeEnd, k, v)
        if datetimeStart < k < datetimeEnd:
            print(v)
    if endDate == 2:
        start = False
        break
Of course, at that point you might as well get rid of the if endDate... block and simply use while endDate > 2:.
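Put together, the simplified loop might look like the sketch below. It is an illustration under assumptions rather than the asker's exact code: tweetDict is filled with made-up sample data, the found dictionary is added only so the result can be inspected, and the lower bound uses <= instead of the original strict < so that a tweet sent exactly at midnight on the first of a month is not skipped.

```python
from datetime import datetime

# Made-up sample data standing in for the asker's tweetDict.
tweetDict = {
    datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime(2017, 9, 30, 19, 55, 21): ['this is another tweet'],
    datetime(2017, 8, 15, 12, 0, 0): ['an august tweet'],
}

startDate = 10
endDate = 11
found = {}  # month number -> list of tweet lists (illustrative only)

while endDate > 2:
    endDate -= 1
    startDate -= 1
    # Fix the month window first, then scan the whole dictionary against it.
    datetimeStart = datetime(2017, startDate, 1)
    datetimeEnd = datetime(2017, endDate, 1)
    print(datetimeStart, datetimeEnd)
    for k, v in tweetDict.items():
        if datetimeStart <= k < datetimeEnd:
            print(v)
            found.setdefault(startDate, []).append(v)
```

As in the original, the windows run from September back to January; October through December are never visited because of the starting values of startDate and endDate.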