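Denormalizing a hierarchy in a list of lists

Time: 2018-06-01 05:52:43

Tags: python python-2.7

I am parsing a file in which labels are defined as shown below, with the hierarchy expressed by successive rows: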

+--------------------+--------------------+--------------------+
| L1 - A             |                    |                    |
|                    |  L2 - B            |                    |
|                    |                    |  L3 - C            |
|                    |                    |                    |
| L1 - D             |                    |                    |
|                    |  L2 - E            |                    |
|                    |                    |  L3 - F            |
+--------------------+--------------------+--------------------+

I represent the above as:

labels = [
   ['A', None, None, None, 'D', None, None],
   [None, 'B', None, None, None, 'E', None],
   [None, None, 'C', None, None, None, 'F']
]

I tried

def joinfoo(items):
   if len(items) == 1:
      return items[0]

   result = []
   active = None
   for x, y in zip(items[0], joinfoo(items[1:])):
      active = x if x else active
      if type(y) is tuple:
         result.append((active, y[0], y[1]))
      else:
         result.append((active, y))

   return result

I want

[
   ('A', None, None), ('A', 'B', None), ('A', 'B', 'C'),
   (None, None, None),
   ('D', None, None), ('D', 'E', None), ('D', 'E', 'F')
]

but got this

[
   ('A', None, None), ('A', 'B', None), ('A', 'B', 'C'),
   ('A', 'B', None),
   ('D', 'B', None), ('D', 'E', None), ('D', 'E', 'F')
]

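Any suggestions on how to fix joinfoo() to get the expected result? The solution needs to support a variable number of columns.

Would something like for x, y in zip(joinfoo(items[:-1]), items[-1]): instead of for x, y in zip(items[0], joinfoo(items[1:])): be a step in the right direction...?

Edit: The original list of lists may wrongly suggest that the hierarchy follows a pattern. There is no defined pattern, and the number of columns is also variable. A better test case might be...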

+--------------+--------------+--------------+
|   L1 - A     |              |              |    = A
|              |    L2 - B    |              |    = A - B
|              |              |    L3 - C    |    = A - B - C
|              |              |    L3 - D    |    = A - B - D
|              |    L2 - E    |              |    = A - E
|              |              |              |    =   
|   L1 - F     |              |              |    = F
|              |    L2 - G    |              |    = F - G
|              |              |    L3 - H    |    = F - G - H
+--------------+--------------+--------------+

labels = [
   ['A', None, None, None, None, None, 'F', None, None],
   [None, 'B', None, None, 'E', None, None, 'G', None],
   [None, None, 'C', 'D', None, None, None, None, 'H']
]

3 answers:

Answer 0: (score: 2)

I had some time on my hands and wondered how I would solve this.

So here is my solution; maybe it sparks some ideas:

labels = """\
+--------------------+--------------------+--------------------+
| L1 - A             |                    |                    |
|                    |  L2 - B            |                    |
|                    |                    |  L3 - C            |
|                    |                    |                    |
| L1 - D             |                    |                    |
|                    |  L2 - E            |                    |
|                    |                    |  L3 - F            |
+--------------------+--------------------+--------------------+
"""

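# Keep the last character of each cell (the label letter), or None for an empty cell.
lines = [[(s.strip()[-1:] if s.strip() else None)
             for s in line[1:-1].split('|')]
                 for line in labels.splitlines()[1:-1]]

# Fill the leading empty cells of each row from the row above it.
for index, row in enumerate(lines):
    if not any(row):
        continue  # leave fully blank rows untouched
    for i, label in enumerate(row):
        if label:
            break  # stop at the first label on this row
        row[i] = lines[index-1][i]

print([tuple(row) for row in lines])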

# --> [('A', None, None), ('A', 'B', None), ('A', 'B', 'C'), (None, None, None), ('D', None, None), ('D', 'E', None), ('D', 'E', 'F')]
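
The same fill-from-the-row-above idea can also be applied directly to the column-major labels list from the question, without parsing the table text. The following is only a minimal sketch under two assumptions that the question does not spell out: a label at some level clears every deeper level, and a completely blank row resets the whole path. The name denormalize is made up for this illustration.

```python
def denormalize(columns):
    """Turn column-major label lists into one full path tuple per table row."""
    depth = len(columns)
    path = [None] * depth              # labels currently active at each level
    result = []
    for row in zip(*columns):          # transpose: one tuple per table row
        # index of the first level that carries a label on this row, if any
        level = next((i for i, label in enumerate(row) if label), None)
        if level is None:
            path = [None] * depth      # assumption: a blank row resets the path
        else:
            # keep the ancestors, take this row's cells from `level` downwards
            path = path[:level] + list(row[level:])
        result.append(tuple(path))
    return result


labels = [
    ['A', None, None, None, None, None, 'F', None, None],
    [None, 'B', None, None, 'E', None, None, 'G', None],
    [None, None, 'C', 'D', None, None, None, None, 'H'],
]
for path in denormalize(labels):
    print(path)
# ('A', None, None)
# ('A', 'B', None)
# ('A', 'B', 'C')
# ('A', 'B', 'D')
# ('A', 'E', None)
# (None, None, None)
# ('F', None, None)
# ('F', 'G', None)
# ('F', 'G', 'H')
```

Because it works on whole rows rather than recursing over columns, the sketch does not care how many columns there are.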

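Answer 1: (score: 1)

If x is None, this line

active = x if x else active keeps the previous value of active. Looking at the desired output, however, you need a way to reset active to None once the tuple count is reached.

Here is how I got your desired output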
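
To see that carry-over in isolation, here is a small sketch that applies only that one line to the first column from the question. The helper name fill_forward is made up for this illustration.

```python
def fill_forward(column):
    """Replace each empty entry with the most recent non-empty one."""
    active = None
    filled = []
    for x in column:
        active = x if x else active  # an empty x leaves `active` unchanged
        filled.append(active)
    return filled


print(fill_forward(['A', None, None, None, 'D', None, None]))
# ['A', 'A', 'A', 'A', 'D', 'D', 'D']
```

The fourth entry is still 'A' even though that row of the table is blank, which is why the blank row does not come out as (None, None, None) in the original joinfoo.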

def joinfoo(items):
   if len(items) == 1:
      return items[0]

   result = []
   active_counter=0
   count=0
   active = None
   for x, y in zip(items[0], joinfoo(items[1:])):
      count=len(y) if type(y) is tuple else 0
      if active_counter >count:
          active_counter=0
          active=None
      else:
          active_counter +=1

      active = x if x else active
      if type(y) is tuple:
         result.append((active, y[0], y[1]))
      else:
         result.append((active, y))

   return result

I got the output

[
   ('A', None, None), ('A', 'B', None), ('A', 'B', 'C'),
   (None, None, None),
   ('D', None, None), ('D', 'E', None), ('D', 'E', 'F')
]

Hope it solves your problem

Answer 2: (score: 0)

Here is a version of joinfoo that gives you what you want:

def empty(item):  # added this function
   if item is None:
      return True
   else:
      return not any(item)


def joinfoo(items):
   if len(items) == 1:
      return items[0]

   result = []
   active = None
   y_last = None  # added this
   for x, y in zip(items[0], joinfoo(items[1:])):
      active = x if x else active
      if not empty(y_last) and empty(y):  # added this if statement
         active = None
      y_last = y  # added this
      if type(y) is tuple:
         result.append((active, y[0], y[1]))
      else:
         result.append((active, y))

   return result

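Every time the y entry switches back to None, you want "active" to switch back to None as well.

By the way, as written, joinfoo cannot join more than 3 lists. If you need that,

replace result.append((active, y[0], y[1])) with result.append((active, *y)) on Python 3.5+, or with result.append((active,) + y) on Python 2.7, where the starred form is a syntax error.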
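
Since the question is tagged python-2.7, where starred unpacking inside a tuple is not available, the append step could also be written with tuple concatenation. This is only a sketch of that one step; the helper name prepend is made up here, and it would stand in for the if type(y) is tuple / else branch inside joinfoo.

```python
def prepend(active, y):
    """Build (active, ...) from y, which is either a tuple from a deeper
    level of the recursion or a single label from the last list."""
    if isinstance(y, tuple):
        return (active,) + y   # works for a tuple of any length
    return (active, y)


print(prepend('A', ('B', 'C', 'D')))  # ('A', 'B', 'C', 'D')
print(prepend('A', None))             # ('A', None)
```

Inside joinfoo the two branches would then collapse to result.append(prepend(active, y)).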