Reduce a list in Python based on common items

Time: 2018-02-06 09:40:17

Tags: python

I want to perform this kind of transformation:

['test.smth.test', 'test.smth'] -> ['test.smth']
['test.smth.test', 'test.smth.another'] -> ['test.smth.test', 'test.smth.another']
['test.one', 'test.smth'] -> ['test.one', 'test.smth']
['test.one', 'test', 'test.smth.name'] -> ['test']
['test_another.one.name', 'test', 'test.smth.name'] -> ['test', 'test_another.one.name']
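Read as a rule, the examples above suggest that a field is dropped whenever one of its shorter dotted prefixes also appears in the list. A minimal sketch of that check follows; `has_listed_prefix` is a hypothetical helper name, not something from the original post, and the rule itself is inferred from the examples rather than stated by the asker.

```python
def has_listed_prefix(field, fields):
    """Return True if a shorter dotted prefix of `field` is also in `fields`."""
    parts = field.split('.')
    listed = set(fields)
    # Check 'a', then 'a.b', and so on, but never the full field itself.
    return any('.'.join(parts[:i]) in listed for i in range(1, len(parts)))


fields = ['test_another.one.name', 'test', 'test.smth.name']
print([f for f in fields if not has_listed_prefix(f, fields)])
# prints ['test_another.one.name', 'test']
```

Comparing whole dotted components means 'test' is not treated as a prefix of 'test_another'.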

I ended up with this code:

from collections import defaultdict


def format_fields(fields):
    fields_data = defaultdict(list)
    for field in fields:
        split = field.split('.')
        base = split[0]
        already = False
        for i in reversed(range(len(split))):
            if split[:i] in fields_data[base]:
                already = True
                break
        if already:
            continue
        current = [i for i in fields_data[base] if len(i) < len(split)
                   or i[len(split) - 1] != split[-1]]
        fields_data[base] = current + [split]
    return ['.'.join(value) for group in fields_data.values() for value in group]

It seems to work, but is there a more readable or smarter solution, or a third-party library that can do this?

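1 answer:

Answer 0 (score: 1)

This should work. Basically, you need to keep every field that does not start with any other field in the list. Appending a dot to the end of every field before comparing avoids false substring matches, e.g. 'test_another' and 'test':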

cases = [
  ['test.smth.test', 'test.smth'],
  ['test.smth.test', 'test.smth.another'],
  ['test.one', 'test.smth'],
  ['test.one', 'test', 'test.smth.name'],
  ['test_another.one.name', 'test', 'test.smth.name']
]

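def filterFields(fields):
  # Append a dot so that 'test' cannot match as a prefix of 'test_another'.
  cFields = [field + '.' for field in fields]
  # Keep a field only if it does not start with any other field.
  return [
    field[:-1]
    for index, field in enumerate(cFields)
    if not any(field.startswith(f) for f in cFields[:index] + cFields[index+1:])
  ]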

for case in cases:
  print(case, '->', filterFields(case))
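As an alternative, the same reduction can be sketched with a sort followed by a single pass, which avoids comparing every field against every other one. This is only a sketch under two assumptions that the thread does not confirm: that the order of the result does not matter, and that duplicate entries should collapse to one. The name `reduce_fields_sorted` is hypothetical.

```python
def reduce_fields_sorted(fields):
    """Drop every field that has a shorter dotted prefix elsewhere in the list."""
    kept = []
    # Sorted tuples of components place a field ahead of all of its
    # descendants, with nothing unrelated in between.
    for parts in sorted(tuple(f.split('.')) for f in fields):
        # Skip duplicates and descendants of the most recently kept field.
        if kept and parts[:len(kept[-1])] == kept[-1]:
            continue
        kept.append(parts)
    return ['.'.join(parts) for parts in kept]


print(reduce_fields_sorted(['test_another.one.name', 'test', 'test.smth.name']))
# prints ['test', 'test_another.one.name']
```

Note that the result comes back in sorted order rather than input order, and that a repeated field such as `['a', 'a']` yields `['a']`, whereas `filterFields` above drops both copies because each one starts with the other.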
