Iterating over a list of dictionaries with conditions

Time: 2017-04-14 22:58:32

Tags: python json list loops dictionary

Suppose test is a large list of dictionaries (this is only a sample):

test = [
    {'alignedWord': 'welcome',
     'case': 'success',
     'end': 0.9400000000000001,
     'start': 0.56,
     'word': 'Welcome'},

    {'alignedWord': 'to',
     'case': 'success',
     'end': 1.01,
     'start': 0.94,
     'word': 'to'},

    {'alignedWord': 'story',
     'case': 'not-found-in-audio',
     'word': 'Story'},

    {'alignedWord': 'in',
     'case': 'success',
     'end': 1.4100000000000001,
     'start': 1.34,
     'word': 'in'},

    {'alignedWord': 'a',
     'case': 'success',
     'end': 1.44,
     'start': 1.41,
     'word': 'a'},

    {'alignedWord': 'bottle',
     'case': 'success',
     'end': 1.78,
     'start': 1.44,
     'word': 'Bottle'},
]

I want to output JSON for each consecutive run of items where case == 'success' and duration_s < 10:

Output:

{"text": "Welcome to", "duration_s": 0.45}
{"text": "in a Bottle", "duration_s": 0.44}
  

Here duration_s is the 'end' minus the 'start' of the text.

2 Answers:

Answer 0 (score: 0)

I added a new dictionary without an 'end' or 'start' key in the middle of the test list; does this work for you now? I also changed the duration to be rolling, as you clarified.

{"text": "Welcome to", "duration": 0.45}

{"text": "in a Bottle", "duration": 0.44}
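A minimal sketch of one way to produce output like this, not necessarily the code this answer used. It assumes itertools.groupby to find consecutive runs of successful items, and it takes the "rolling" duration to be the last 'end' minus the first 'start' of a chunk, rounded to two decimals. The name success_chunks and the max_duration parameter are hypothetical.

```python
from itertools import groupby


def success_chunks(items, max_duration=10):
    """Yield {'text', 'duration'} dicts for consecutive runs of successful items."""
    for ok, run in groupby(items, key=lambda d: d.get('case') == 'success'):
        if not ok:
            continue  # non-success items (which may lack 'start'/'end') end a run
        words, first_start, last_end = [], None, None
        for item in run:
            # Close the current chunk if this word would push it to the limit.
            if words and item['end'] - first_start >= max_duration:
                yield {'text': ' '.join(words),
                       'duration': round(last_end - first_start, 2)}
                words = []
            if not words:
                first_start = item['start']
            words.append(item['word'])
            last_end = item['end']
        if words:  # emit whatever is left of the run
            yield {'text': ' '.join(words),
                   'duration': round(last_end - first_start, 2)}


# Shortened sample in the shape of the question's data.
sample = [
    {'case': 'success', 'start': 0.56, 'end': 0.94, 'word': 'Welcome'},
    {'case': 'success', 'start': 0.94, 'end': 1.01, 'word': 'to'},
    {'case': 'not-found-in-audio', 'word': 'Story'},
    {'case': 'success', 'start': 1.34, 'end': 1.41, 'word': 'in'},
    {'case': 'success', 'start': 1.41, 'end': 1.44, 'word': 'a'},
    {'case': 'success', 'start': 1.44, 'end': 1.78, 'word': 'Bottle'},
]

result = list(success_chunks(sample))
print(result)
```

Because the key function uses d.get('case'), an item with no 'case' key at all is treated like a non-success item instead of raising KeyError.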

Answer 1 (score: 0)

This can be solved with a generator that yields the final dictionaries:

def split(it):
    it = iter(it)
    acc, duration = [], 0  # defaults
    for item in it:
        if item['case'] != 'success':   # split when there's a non-success
            if acc:
                yield {'text': ' '.join(acc), 'duration': duration}
                acc, duration = [], 0  # reset defaults

        else:
            tmp_duration = item['end'] - item['start']

            if tmp_duration + duration >= 10:  # split when the duration is too long
                if acc:
                    yield {'text': ' '.join(acc), 'duration': duration}
                acc, duration = [item['word']], tmp_duration  # new defaults

            else:
                acc.append(item['word'])
                duration += tmp_duration

    if acc:  # give the remaining items
        yield {'text': ' '.join(acc), 'duration': duration}

A simple test gives:

>>> list(split(test))
[{'duration': 0.45000000000000007, 'text': 'Welcome to'},
 {'duration': 0.44000000000000017, 'text': 'in a Bottle'}]

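This can easily be dumped to JSON (json.dumps returns a string; use json.dump with an open file object to write it to a file):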

>>> import json
>>> json.dumps(list(split(test)))
'[{"text": "Welcome to", "duration": 0.45000000000000007}, {"text": "in a Bottle", "duration": 0.44000000000000017}]'
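The question's expected output has one JSON object per line, with a duration_s key and durations rounded to two decimals. A minimal sketch of that last step, assuming chunks shaped like the ones the generator above yields. The helper name write_json_lines, its ndigits parameter and the file name out.jsonl are hypothetical.

```python
import io
import json


def write_json_lines(chunks, fp, ndigits=2):
    """Write one JSON object per line, renaming 'duration' to 'duration_s'."""
    for chunk in chunks:
        record = {'text': chunk['text'],
                  'duration_s': round(chunk['duration'], ndigits)}
        fp.write(json.dumps(record) + '\n')


# Stand-in for list(split(test)); any iterable of such dicts works.
chunks = [{'text': 'Welcome to', 'duration': 0.45000000000000007},
          {'text': 'in a Bottle', 'duration': 0.44000000000000017}]

buffer = io.StringIO()  # swap in open('out.jsonl', 'w') to write a real file
write_json_lines(chunks, buffer)
print(buffer.getvalue(), end='')
# {"text": "Welcome to", "duration_s": 0.45}
# {"text": "in a Bottle", "duration_s": 0.44}
```

Rounding only at output time keeps the accumulated durations exact for the < 10 comparison and hides the floating-point noise in the written file.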