How do I build a nested dictionary data structure for time-series data in Python?

Time: 2014-08-20 23:56:47

Tags: python-2.7 dictionary

I'm trying to create a nested dictionary, or a similar structure, for the following output:

2014-08-19 23 positive
2014-08-19 23 neutral
2014-08-19 23 positive
2014-08-19 23 bot
2014-08-19 23 positive
2014-08-19 23 positive
2014-08-19 23 bot
2014-08-19 23 positive
2014-08-19 24 positive
2014-08-19 24 positive
2014-08-19 24 bot
2014-08-19 24 positive
2014-08-20 07 positive
2014-08-20 07 positive
2014-08-20 07 positive
2014-08-20 07 bot
2014-08-20 07 positive
2014-08-20 07 neutral
2014-08-20 08 neutral
2014-08-20 08 positive
2014-08-20 08 bot
2014-08-20 08 positive
2014-08-20 08 positive
2014-08-20 08 positive
2014-08-20 08 bot
2014-08-20 08 positive

Ideally, I would like the result to look something like this:

2014-08-19:{
            23:{
                positive:5,neutral:1,bot:1}
            24:{
                positive:3, neutral:0,bot:1}}
2014-08-20: {
            07:{
                positive:4,neutral:1,bot:1}
            08:{
                positive:5, neutral:1,bot:2}}

And so on. Here is what I have so far:

collect_tweet={}

for line in open('time_short.txt'):
    line=line.strip().split(' ')

    if line[0] not in collect_tweet:
        collect_tweet[line[0]]= {}
        if line[1] not in collect_tweet[line[0]]:
            collect_tweet[line[0]][line[1]]=[]

    collect_tweet[line[0]][line[1]].append(line[2])

Any ideas or suggestions on how to achieve this?

1 answer:

Answer 0: (score: 1)

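You're really close. In your version the check for the hour key is nested inside the check for the date key, so a new hour under an already-seen date never gets created and the append raises a KeyError. You also store a list of labels where you want a count per label. This should do what you want: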

collect_tweet = {}

with open('time_short.txt') as infile:
    for line in infile:
        # Each line looks like "<date> <hour> <label>"
        date, hour, label = line.split()
        # Create each level of nesting the first time its key is seen
        if date not in collect_tweet:
            collect_tweet[date] = {}
        if hour not in collect_tweet[date]:
            collect_tweet[date][hour] = {}
        # Start the count at 1, or increment an existing count
        if label not in collect_tweet[date][hour]:
            collect_tweet[date][hour][label] = 1
        else:
            collect_tweet[date][hour][label] += 1

# The parenthesised form prints the same thing on Python 2.7 and Python 3
print(collect_tweet)
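
One thing the loop above does not do is emit a zero for a label that never occurs in a given hour (the question's desired output shows `neutral:0` for hour 24). As a possible alternative, here is a minimal sketch using `collections.defaultdict` and `collections.Counter`, both available in Python 2.7. The function name `count_labels` and the fixed `LABELS` tuple are assumptions made for illustration; they are not part of the original question or answer.

```python
from collections import Counter, defaultdict

# Assumed fixed set of labels, taken from the sample data in the question
LABELS = ("positive", "neutral", "bot")


def count_labels(lines):
    """Return {date: {hour: {label: count}}} from lines of 'date hour label'."""
    # Missing dates and hours are created automatically on first access
    counts = defaultdict(lambda: defaultdict(Counter))
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        date, hour, label = parts
        counts[date][hour][label] += 1

    # Convert to plain dicts, filling in zero for labels that never occurred
    result = {}
    for date, hours in counts.items():
        result[date] = {}
        for hour, counter in hours.items():
            filled = dict.fromkeys(LABELS, 0)  # every known label starts at zero
            filled.update(counter)             # overwrite with the observed counts
            result[date][hour] = filled
    return result


sample = [
    "2014-08-19 24 positive",
    "2014-08-19 24 positive",
    "2014-08-19 24 bot",
    "2014-08-19 24 positive",
]
print(count_labels(sample))
```

For the four sample lines, hour `24` ends up with `positive` at 3, `neutral` at 0 and `bot` at 1. To process the real file, pass the open file object directly, for example `count_labels(open('time_short.txt'))` inside a `with` block. Note that the hour keys stay strings such as `'07'`, which preserves the leading zero.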