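In Python, how do I parse a table with empty fields?
For example, I want to print the list of children names in this table (I receive it as a txt file):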
NAME  CHILDREN_NAME  PHONE
A     A1             11
      A2             22
      A3
B     B1
      B2             33
C     C1             44
      C2
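The problem is that the table has blank fields, and I don't know how to split it into chunks so that each children name refers to the correct name.
The result should be: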
"A" children names: A1, A2, A3
"B" children names: B1, B2
"C" children names: C1, C2
I tried using split() and converting the file to csv, but neither helped. Any ideas?
Answer 0: (score: 1)
Using csv and regex, you can get the following result:
import csv
import re

with open('input.txt', 'r') as f:
    reader = csv.reader(f)
    thelist = list(reader)

# Split each row on whitespace; index 1 is the CHILDREN_NAME column, [1:] drops the header row
print([re.split(r'\s+', line[0])[1] for line in thelist][1:])
Output:
['A1', 'A2', 'A3', 'B1', 'B2', 'C1', 'C2']
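As far as I can tell, this works where a plain split() does not because re.split keeps an empty string in front when a line starts with whitespace, while str.split() with no arguments throws the leading whitespace away. A small check of that behaviour, assuming the rows without a NAME really do begin with spaces in your file (the two sample rows below are made up to match the table):

```python
import re

row_with_name = "A     A1             11"
row_without_name = "      A2             22"

# str.split() with no arguments discards leading whitespace, so the
# child name lands at index 0 and looks like a NAME.
plain = row_without_name.split()

# re.split keeps an empty string in front when the line starts with
# whitespace, so CHILDREN_NAME stays at index 1 in both kinds of rows.
regex_named = re.split(r'\s+', row_with_name)
regex_blank = re.split(r'\s+', row_without_name)

print(plain)
print(regex_named)
print(regex_blank)
```

The empty first field is what marks a row as belonging to the previous NAME.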
[Update] The following approach satisfies your second requirement:
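import csv
import re
from collections import OrderedDict

with open('input.txt', 'r') as f:
    reader = csv.reader(f)
    thelist = list(reader)

result = OrderedDict()
# Keep only the first two fields (NAME, CHILDREN_NAME) and drop the header row
parsed = [re.split(r'\s+', line[0])[:2] for line in thelist][1:]
for x, y in parsed:
    if x:
        # A non-empty NAME starts a new group
        temp = x
        result[x] = y
    else:
        # An empty NAME means the row belongs to the last NAME seen
        result[temp] = ','.join([result[temp], y])
print(list(result.items()))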
Output:
[('A', 'A1,A2,A3'), ('B', 'B1,B2'), ('C', 'C1,C2')]
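If you want the exact format from the question, the same idea can be sketched without csv, collecting the children into lists. This is only a rough sketch: group_children is a name I made up, it assumes rows without a NAME start with whitespace, that every data row has a CHILDREN_NAME, and that the first data row has a NAME. The sample list stands in for the lines you would read from the txt file.

```python
import re
from collections import OrderedDict


def group_children(lines):
    """Map each NAME to the list of CHILDREN_NAME values listed under it."""
    groups = OrderedDict()
    current = None
    for line in lines[1:]:              # skip the header row
        if not line.strip():
            continue                    # ignore blank lines
        fields = re.split(r'\s+', line.rstrip())
        name, child = fields[0], fields[1]
        if name:                        # a non-empty first field starts a new group
            current = name
            groups[current] = []
        groups[current].append(child)   # otherwise reuse the last NAME seen
    return groups


# Stand-in for the lines of the txt file (e.g. f.read().splitlines())
sample = [
    "NAME  CHILDREN_NAME  PHONE",
    "A     A1             11",
    "      A2             22",
    "      A3",
    "B     B1",
    "      B2             33",
    "C     C1             44",
    "      C2",
]

for name, children in group_children(sample).items():
    print('"{}" children names: {}'.format(name, ', '.join(children)))
```

With the sample rows this prints the three lines in the format asked for in the question. Keeping lists rather than joined strings also makes it easier to count or filter the children later.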