Splitting a string on whitespace according to a template

Time: 2015-10-22 03:47:21

Tags: python regex string strip

I have the following header string (the template):

Port          Name               Status    Vlan      Duplex  Speed   Type

and the string str:

Eth1/2        trunk to dg-qwu-29 connected trunk     full    1000    1/10g

Using the header as a template, how can I split str into the following list?

[Eth1/2, trunk to dg-qwu-29, connected, trunk, full, 1000, 1/10g]

2 answers:

Answer 0 (score: 1)

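The following assumes that the row and the header follow the same whitespace mask. That is, each header title starts at the same character offset as the corresponding column in the row.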

import re
header = "Port          Name               Status    Vlan      Duplex  Speed   Type"
row    = "Eth1/2        trunk to dg-qwu-29 connected trunk     full    1000    1/10g"
# retrieve the indices where each header title begins and ends
matches = [(m.group(0), (m.start(), m.end() - 1)) for m in re.finditer(r'\S+', header)]
titles, spans = zip(*matches)
# each field in the row begins at its header title's start index and ends just before
# the start index of the next header title; strip() removes the padding spaces
items = [row[span[0]:(spans[i + 1][0] if i < len(spans) - 1 else len(row))].strip()
         for i, span in enumerate(spans)]
print(items)

The above outputs:

['Eth1/2', 'trunk to dg-qwu-29', 'connected', 'trunk', 'full', '1000', '1/10g']

Edit: the index retrieval is taken from https://stackoverflow.com/a/13734572/1847471
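
The column-offset idea in this answer can also be wrapped in a small function so it can be applied to many rows. This is only a minimal sketch and not part of the original answer: the name split_by_header is hypothetical, and it assumes every row is padded so that each field starts at the same offset as its header title.

```python
import re

def split_by_header(header, row):
    """Split row into fields using the column start offsets found in header."""
    # Start offset of every header title
    starts = [m.start() for m in re.finditer(r'\S+', header)]
    # Each column runs up to the next start; None lets the last column run to the end of the row
    ends = starts[1:] + [None]
    return [row[s:e].strip() for s, e in zip(starts, ends)]

header = "Port          Name               Status    Vlan      Duplex  Speed   Type"
row    = "Eth1/2        trunk to dg-qwu-29 connected trunk     full    1000    1/10g"

fields = split_by_header(header, row)
print(fields)
```

Because the slices come only from the header, a field that overflows into the next column's offset would be cut in the wrong place, so this is suitable only for strictly aligned output.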

Answer 1 (score: 0)

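You have not said how the column values are formatted, i.e. which delimiters, escape characters and string quoting are used. Judging from your example, the tricky part is the Name column, which can contain spaces, so you will have to extract it by exclusion. Here is a quick way to do it that you can use as a starting point: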

# str is the row string from the question (note that this name shadows the built-in str)
# Get the first element
first = str.split()[0]
# Get the last five fields (Status through Type), none of which contain spaces
last = str.split()[-5:]
# Get the name column by excluding the two parts computed above
middle = str.split(first)[1].split(last[0])[0].strip()
result = [first, middle] + last
print(result)
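
The exclusion approach can also be written with the maxsplit argument of split and rsplit, which avoids splitting on the field values themselves (that step can misfire if the name happens to contain the same text as the Port or Status value). This is a hedged sketch rather than the answerer's code: the function name is hypothetical, and it assumes that only the second column (Name) may contain spaces and that it is never empty.

```python
def split_with_free_text_column(header, row):
    """Split row by whitespace, letting only the second column contain spaces."""
    # Assumption: every column after the second is a single whitespace-free token
    n_trailing = len(header.split()) - 2
    # Peel off the first field; the remainder starts at the free-text column
    first, rest = row.split(None, 1)
    # Peel the trailing single-token fields off from the right;
    # whatever is left at the front is the free-text column
    parts = rest.rsplit(None, n_trailing)
    return [first] + parts

header = "Port          Name               Status    Vlan      Duplex  Speed   Type"
row    = "Eth1/2        trunk to dg-qwu-29 connected trunk     full    1000    1/10g"

result = split_with_free_text_column(header, row)
print(result)
```

Unlike the offset-based approach, this does not need the row to be aligned with the header, but it gives wrong results as soon as a second column can contain spaces or the Name column is blank.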