Question

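I have some configuration data in the following format. What is the best way to parse this data in Python? I looked at the csv module and, briefly, at this module, but could not figure out how to use either. The existing parser is hacked together in Perl.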

|------------+-----------------+--------|
| ColHead1   | Col_______Head2 | CH3    |
|------------+-----------------+--------|
| abcdefg000 | *               | somev1 |
| abcdefg001 | *               | somev2 |
| abcdefg002 | *               |        |
| abcdefg003 | *               |        |
| abcdefg004 | *               |        |
| abcdefg005 | *               |        |
| abcdefg006 | *               |        |
| abcdefg007 | *               |        |
| abcdefg008 | *               |        |
| abcdefg009 | *               |        |
| abcdefg010 | *               |        |
|------------+-----------------+--------|

Answer 1

You could try something like this:

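def parse(ascii_table):
    header = []
    data = []
    # filter(None, ...) drops empty lines
    for line in filter(None, ascii_table.split('\n')):
        # Skip separator rows such as |-----+-----|
        if '-+-' in line:
            continue
        # Split on whitespace and drop the '|' column delimiters
        # (a list, not filter(), so that len() also works on Python 3)
        cells = [x for x in line.split() if x != '|']
        if not header:
            header = cells
            continue
        # Pad short rows with '' so every row has len(header) entries;
        # blank cells are only recovered when they trail the row
        row = [''] * len(header)
        row[:len(cells)] = cells
        data.append(row)
    return header, data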
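
Splitting on whitespace works for the sample, where blank cells only appear at the end of a row, but it would shift values to the left if a middle column were blank or a value contained spaces. A minimal sketch of a variant that splits on '|' instead, assuming every row starts and ends with '|' as in the question; `parse_table` is a hypothetical name, not part of the answer:

```python
def parse_table(ascii_table):
    """Return (header, rows); each row is a dict keyed by column name."""
    header, rows = None, []
    for line in ascii_table.splitlines():
        line = line.strip()
        # Skip blank lines and separator rows made only of |, - and +
        if not line or set(line) <= set('|-+'):
            continue
        # Assumption: the line starts and ends with '|', so drop one
        # character from each end, split on the inner pipes, trim padding
        cells = [cell.strip() for cell in line[1:-1].split('|')]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return header, rows
```

Blank cells come back as empty strings in their own column, so a row can be read as `rows[0]['CH3']` without relying on position.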

Answer 2

Here is another (similar) approach, if the data is in a file:

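with open(filepath) as f:
    for line in f:
        # remove the trailing newline so the final '|' can be stripped
        line = line.strip()
        # skip blank lines, separator rows and the header row
        if not line or '-+-' in line or 'Head' in line:
            continue
        # strip '|' off the ends, split on '|', then trim the cell padding
        c1, c2, c3 = [c.strip() for c in line.strip('|').split('|')]
        print('Col1: {}\tCol2: {}\tCol3: {}'.format(c1, c2, c3))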

Or, if it is in a string variable:

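for line in ascii_table.split('\n'):
    line = line.strip()
    # skip blank lines, separator rows and the header row
    if not line or '-+-' in line or 'Head' in line:
        continue
    # strip '|' off the ends, split on '|', then trim the cell padding
    c1, c2, c3 = [c.strip() for c in line.strip('|').split('|')]
    print('Col1: {}\tCol2: {}\tCol3: {}'.format(c1, c2, c3))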
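
Since the question mentions the csv module, here is a hedged sketch of how it might be applied once the separator rows and outer pipes are removed. `read_table` is a hypothetical helper, not something from either answer, and it assumes no cell contains a '|' or a double quote:

```python
import csv

def read_table(lines):
    """Yield one dict per data row from an iterable of table lines."""
    stripped = (ln.strip() for ln in lines)
    # Keep only lines that carry cells (separator rows contain just
    # |, - and +), and drop the outer pipes from each kept line
    content = (ln.strip('|') for ln in stripped
               if ln and not set(ln) <= set('|-+'))
    reader = csv.reader(content, delimiter='|')
    header = next(reader, None)
    if header is None:
        return  # no header row found, nothing to yield
    header = [cell.strip() for cell in header]
    for row in reader:
        yield dict(zip(header, (cell.strip() for cell in row)))
```

It accepts either an open file object or `ascii_table.splitlines()`, so the same function covers both cases shown in this answer.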

How to read an ASCII-formatted table in Python
