I have a data structure like this:
[ {'uid': 'test_subject145', 'class':'?', 'data':[ {'chunk':1, 'writing':[ ['this is exciting'],[ 'you are good' ]... ]} ] },
{'uid': 'test_subject166', 'class':'?', 'data':[ {'chunk':2, 'writing':[ ['he died'],[ 'go ahead' ]... ]} ] }, ...]
It is a list containing many dictionaries, each of which has 3 pairs: 'uid': 'test_subject145', 'class':'?', 'data':[].
In the last pair, 'data', the value is a list that in turn contains a dictionary with 2 pairs: 'chunk':1, 'writing':[]. In the 'writing' pair, the value is a list that again contains multiple lists.
What I want to extract is the content of all of these sentences, such as 'this is exciting' and 'you are good', and put them into one flat list. The final form should be list_final = ['this is exciting', 'you are good', 'he died',... ]
Answer 0 (score: 3)
Given that your original list is named input, simply use a list comprehension:
[elem for dic in input
for dat in dic.get('data',())
for writing in dat.get('writing',())
for elem in writing]
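Using .get(.., ()) means this still works when a key is missing: in that case .get returns the empty tuple (), so there is nothing to iterate over.
With your sample input, we get: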
>>> input = [ {'uid': 'test_subject145', 'class':'?', 'data':[ {'chunk':1, 'writing':[ ['this is exciting'],[ 'you are good' ]]} ] },
... {'uid': 'test_subject166', 'class':'?', 'data':[ {'chunk':2, 'writing':[ ['he died'],[ 'go ahead' ] ]} ] }]
>>>
>>> [elem for dic in input
... for dat in dic.get('data',())
... for writing in dat.get('writing',())
... for elem in writing]
['this is exciting', 'you are good', 'he died', 'go ahead']
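To illustrate the point about missing keys, here is a minimal sketch of the same comprehension run on records that lack 'data' or 'writing'. The records and the helper name flatten_writing are hypothetical, made up for this example; they are not part of the original question.

```python
# Hypothetical records: one is complete, one has no 'data' key,
# and one has a chunk without a 'writing' key.
records = [
    {'uid': 'a', 'class': '?',
     'data': [{'chunk': 1, 'writing': [['this is exciting'], ['you are good']]}]},
    {'uid': 'b', 'class': '?'},                          # no 'data' key
    {'uid': 'c', 'class': '?', 'data': [{'chunk': 3}]},  # no 'writing' key
]


def flatten_writing(records):
    """Collect every sentence, skipping dicts that lack 'data' or 'writing'."""
    return [elem
            for dic in records
            for dat in dic.get('data', ())          # () when 'data' is absent
            for writing in dat.get('writing', ())   # () when 'writing' is absent
            for elem in writing]


sentences = flatten_writing(records)
print(sentences)
```

With dic['data'] instead of dic.get('data', ()), the second record would raise a KeyError, so the choice depends on whether a missing key should be skipped silently or reported.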
Answer 1 (score: 2)
TL;DR
[str for dic in data
for data_dict in dic['data']
for writing_sub_list in data_dict['writing']
for str in writing_sub_list]
Take it slowly and work through one layer at a time. Then refactor the code to make it smaller.
data = [{'class': '?',
'data': [{'chunk': 1,
'writing': [['this is exciting'], ['you are good']]}],
'uid': 'test_subject145'},
{'class': '?',
'data': [{'chunk': 2,
'writing': [['he died'], ['go ahead']]}],
'uid': 'test_subject166'}]
for d in data:
    print(d)
# {'class': '?', 'uid': 'test_subject145', 'data': [{'writing': [['this is exciting'], ['you are good']], 'chunk': 1}]}
# {'class': '?', 'uid': 'test_subject166', 'data': [{'writing': [['he died'], ['go ahead']], 'chunk': 2}]}
for d in data:
    data_list = d['data']
    print(data_list)
# [{'writing': [['this is exciting'], ['you are good']], 'chunk': 1}]
# [{'writing': [['he died'], ['go ahead']], 'chunk': 2}]
for d in data:
    data_list = d['data']
    for d2 in data_list:
        print(d2)
# {'writing': [['this is exciting'], ['you are good']], 'chunk': 1}
# {'writing': [['he died'], ['go ahead']], 'chunk': 2}
for d in data:
    data_list = d['data']
    for d2 in data_list:
        writing_list = d2['writing']
        print(writing_list)
# [['this is exciting'], ['you are good']]
# [['he died'], ['go ahead']]
for d in data:
    data_list = d['data']
    for d2 in data_list:
        writing_list = d2['writing']
        for writing_sub_list in writing_list:
            print(writing_sub_list)
# ['this is exciting']
# ['you are good']
# ['he died']
# ['go ahead']
for d in data:
    data_list = d['data']
    for d2 in data_list:
        writing_list = d2['writing']
        for writing_sub_list in writing_list:
            for str in writing_sub_list:
                print(str)
# this is exciting
# you are good
# he died
# go ahead
Then convert the code above into a smaller (but harder to read) form, like this. It should be easy to see how to get from one to the other:
strings = [str for d in data for d2 in d['data'] for wsl in d2['writing'] for str in wsl]
# ['this is exciting', 'you are good', 'he died', 'go ahead']
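As a rough check of that correspondence, the sketch below replaces the final print with an append and compares the result with the one-line comprehension. The name list_final comes from the question; the loop variable sentence is chosen here only to avoid shadowing the built-in str.

```python
data = [{'class': '?',
         'data': [{'chunk': 1,
                   'writing': [['this is exciting'], ['you are good']]}],
         'uid': 'test_subject145'},
        {'class': '?',
         'data': [{'chunk': 2,
                   'writing': [['he died'], ['go ahead']]}],
         'uid': 'test_subject166'}]

# Explicit nested loops: same traversal as the step-by-step version,
# but collecting each sentence instead of printing it.
list_final = []
for d in data:
    for d2 in d['data']:
        for writing_sub_list in d2['writing']:
            for sentence in writing_sub_list:
                list_final.append(sentence)

# The comprehension lists the same "for" clauses in the same order,
# outermost loop first.
comprehension = [sentence
                 for d in data
                 for d2 in d['data']
                 for writing_sub_list in d2['writing']
                 for sentence in writing_sub_list]

print(list_final == comprehension)
```

The for clauses of a comprehension are read left to right in the same order as the nested loops are read top to bottom, which is why the rewrite is mechanical.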
Then write it with better names, as in Willem's answer:
[str for dic in data
for data_dict in dic['data']
for writing_sub_list in data_dict['writing']
for str in writing_sub_list]
Answer 2 (score: 1)
So I believe the following should work
;WITH CTE_DIFF AS (
SELECT [TimeStamp], [State],
DATEDIFF ( second ,
[TimeStamp] ,
LEAD([TimeStamp]) OVER (ORDER BY [TimeStamp])) AS time_diff
FROM mytable
), CTE_PERC AS (
SELECT [TimeStamp], [State], time_diff ,
SUM(time_diff) OVER (ORDER BY [TimeStamp]) * 1.0 /
SUM(time_diff) OVER () * 100 AS perc
FROM CTE_DIFF
)
SELECT [TimeStamp], [State],
COALESCE(LAG(perc) OVER (ORDER BY [TimeStamp]), 0) AS PercentageStart,
perc AS PercentageEnd
FROM CTE_PERC
As mentioned above, I think this post is helpful for understanding: python getting a list of value from list of dict (thanks to McGrady)