Question

I have a small problem. I have a txt file containing table rows (for example, line 1):

id1-a1-b1-c1

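I want to load it into a pandas DataFrame whose index is the id, whose columns are named 'A', 'B', 'C', and whose values are the corresponding ai, bi, ci.

In the end I want the DataFrame to look like this: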

    'A'   'B'  'C'
id1  a1    b1   c1
id2  a2    b2   c2
...   ...   ...  ...

The file is large, so I may want to read it in chunks, but let's assume I read it all at once:

import pandas as pd

with open('file.txt') as f:
    table = pd.read_table(f, sep='-', index_col=0, header=None, lineterminator='\n')

and then rename the columns:

table.columns = ['A','B','C']

My current output looks like this:

    'A'   'B'  'C'
0
id1  a1    b1   c1
id2  a2    b2   c2
...   ...   ...  ...

There is an extra line (the "0") that I cannot explain.

Thanks

Edit

When I try to add the argument

chunksize=20

to the read_table call and then run:

for chunk in table:
    print(chunk)

I get the following error:

pandas.parser.CParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.

Answer 1

If you know the column names before reading the file, pass them as a list through the names parameter of read_table:

with open('file.txt') as f:
    table = pd.read_table(f, sep='-', index_col=0, header=None, names=['A','B','C'],
                          lineterminator='\n')

Which outputs:

      A   B   C
id1  a1  b1  c1
id2  a2  b2  c2
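
As a complement, here is a minimal sketch of the question's original approach (header=None plus renaming afterwards) that also covers the two loose ends. It relies on two points the thread does not confirm explicitly: the extra "0" line is most likely the index name, since with header=None the columns are labelled 0, 1, 2, 3 and the index column keeps its label 0; and the "Calling read(nbytes) on source failed" error is probably caused by iterating over the chunks after the with block has already closed the file. The helper names load_table and load_in_chunks and the in-memory SAMPLE data are hypothetical, used only so the example runs without file.txt.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the contents of file.txt.
SAMPLE = "id1-a1-b1-c1\nid2-a2-b2-c2\nid3-a3-b3-c3\n"
COLUMNS = ["A", "B", "C"]


def load_table(source):
    """Read the whole table at once, using the first field as the index."""
    table = pd.read_table(source, sep="-", header=None, index_col=0)
    table.columns = COLUMNS
    # With header=None the index column is labelled 0, and that label is
    # kept as the index name. Clearing it removes the extra "0" line.
    table.index.name = None
    return table


def load_in_chunks(source, chunksize):
    """Read the table chunk by chunk while the source is still open."""
    reader = pd.read_table(
        source, sep="-", header=None, index_col=0, chunksize=chunksize
    )
    chunks = []
    for chunk in reader:  # consume the reader before the source is closed
        chunk.columns = COLUMNS
        chunk.index.name = None
        chunks.append(chunk)
    return pd.concat(chunks)


if __name__ == "__main__":
    print(load_table(io.StringIO(SAMPLE)))
    print(load_in_chunks(io.StringIO(SAMPLE), chunksize=2))
```

With a real file, the for loop over the chunks would have to stay inside the with open(...) block (or the file name could be passed to read_table directly), so that the reader can still read from the file while iterating.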
