"Bus error: 10" when trying to set pandas column names

Time: 2016-07-11 04:31:47

Tags: python python-3.x pandas

Here is the code I'm running to rename a pandas column by index:

import pandas as pd

df = pd.read_csv('input.csv', dtype='unicode', delim_whitespace=True)
df.columns.values[2] = "id"
print(df)

I'm pretty sure this isn't the best way to do it, but when I run it with Python 3.5, I get:

$ python3.5 test.py
Bus error: 10

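This is the first time I've seen an error like this. There is no traceback, only this output string.

What does Bus error: 10 mean?

Here are the contents of input.csv: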

visitIp        userId   idSite
128.227.50.161   a        35
24.222.206.154   a        35
10.12.0.1        a        35
10.12.0.1        a        35
10.12.0.1        a        35
24.222.206.154   a        35

(Using pandas 0.17.1)

1 answer:

Answer 0: (score: 2)

A bus error occurs when a process tries to access memory in a way the hardware cannot satisfy, for example through an invalid or misaligned address.

df.columns is an instance of Index, which is an immutable object in pandas. Any operation that appears to change it in fact returns a new object. Modifying its elements is illegal; for example, df.columns[2] = 'id' would raise a TypeError.
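To illustrate the immutability described here, the following is a minimal sketch. The small DataFrame is made up to mirror the question's columns, and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical one-row frame with the same column labels as input.csv.
df = pd.DataFrame(
    [["10.12.0.1", "a", "35"]],
    columns=["visitIp", "userId", "idSite"],
)

# Item assignment on an Index is rejected with a TypeError.
try:
    df.columns[2] = "id"
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

# Index methods return a new Index and leave the original untouched.
original = df.columns
upper = original.str.upper()

print(type(assignment_error).__name__)
print(list(original))
print(list(upper))
```

The failed assignment leaves df.columns unchanged, which is why going around it through .values is not a supported way to rename.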

You were accessing and modifying the underlying data of the index. Actually, not the data directly but a numpy view of that data, which could have been a temporary object. (Internally, Index.values is a property that returns self._data.view(ndarray).)
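Rather than writing into the array returned by .values, a column can be renamed by position through the public API. The sketch below shows two common ways, assuming the column labels are unique; the sample rows are taken from the question's input.csv.

```python
import pandas as pd

df = pd.DataFrame(
    [["128.227.50.161", "a", "35"], ["24.222.206.154", "a", "35"]],
    columns=["visitIp", "userId", "idSite"],
)

# Option 1: look up the current label by position and rename via a mapping.
# This returns a new DataFrame; df itself is not modified.
renamed = df.rename(columns={df.columns[2]: "id"})

# Option 2: build a new list of labels and assign it as a whole,
# which replaces the Index object instead of mutating it.
new_columns = list(df.columns)
new_columns[2] = "id"
df.columns = new_columns

print(list(renamed.columns))
print(list(df.columns))
```

With duplicate labels, Option 1 would rename every column that shares the label, so Option 2 is the safer choice in that case.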

I couldn't reproduce this behaviour either, and I don't know exactly what happened or why it works now. It may well be undefined behaviour in the numpy C/Cython code.