Question

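I am trying to open a CSV file whose header spans multiple rows. To avoid dealing with a MultiIndex, I use the header parameter to skip some of the rows, but then all the values come back as NaN.

An example that reproduces the problem: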

,,
x,a,c
y,b,d
labels,l1,l2
2016-01-01,1,6
2016-01-02,2.0,7.0
2016-01-03,3.0,8

test.csv

t = pandas.read_csv('test.csv', skiprows=3, header=[0], index_col=[0])

or

t = pandas.read_csv('test.csv', header=[3], index_col=[0] )

Both produce the same output:

labels       l1   l2
2016-01-01  NaN  NaN
2016-01-02  NaN  NaN
2016-01-03  NaN  NaN

[3 rows x 2 columns]

When I use all 3 header rows

t = pandas.read_csv('test.csv', header=[1,2,3], index_col=[0] )

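it works, and I can access the data.

Am I missing something, or is this a bug?

PS: I am using the MultiIndex for now. I ran into this problem because I forgot one argument (the header has 8 rows...).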

Answer 1

How about:

import pandas as pd

my_file = 'test.csv'
df = pd.read_csv(my_file, sep=',', names=['labels', 'l1', 'l2'], skiprows=4, header=None)

This ignores the first 4 rows entirely and specifies the column names yourself.
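
For anyone who wants to try this without creating a file, here is a minimal self-contained sketch of the same idea. The CSV text is the sample from the question, fed through io.StringIO. Using index_col="labels" is my own assumption about the desired result (dates as the index, as in the question's calls) and is not part of the answer above.

```python
import io

import pandas as pd

# Same content as test.csv in the question, inlined so the sketch runs as-is.
CSV_TEXT = """,,
x,a,c
y,b,d
labels,l1,l2
2016-01-01,1,6
2016-01-02,2.0,7.0
2016-01-03,3.0,8
"""

# Skip all four header rows (including the "labels,l1,l2" row)
# and supply the column names by hand.
df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    skiprows=4,
    header=None,
    names=["labels", "l1", "l2"],
    index_col="labels",  # assumption: the date column should become the index
)

print(df)
```

The downside of this approach is that the column names are hard-coded, so it is probably only convenient when the file layout is fixed and known in advance.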

Answer 2

Try this:

In [20]: pd.read_csv(filename, skiprows=3)
Out[20]:
       labels   l1   l2
0  2016-01-01  1.0  6.0
1  2016-01-02  2.0  7.0
2  2016-01-03  3.0  8.0
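
A self-contained sketch of this approach, using the sample data from the question. It also shows a second variant that passes header as a plain integer instead of a one-element list. The pandas documentation describes a list of integers for header as specifying rows for a MultiIndex on the columns, so my assumption is that the scalar form avoids the multi-row header handling the question ran into. The index_col=0 argument is added here to match the question's intent and is not part of the answer above.

```python
import io

import pandas as pd

# Same content as test.csv in the question, inlined so the sketch runs as-is.
CSV_TEXT = """,,
x,a,c
y,b,d
labels,l1,l2
2016-01-01,1,6
2016-01-02,2.0,7.0
2016-01-03,3.0,8
"""

# Variant 1: skip the three rows above the label row; the first remaining
# row ("labels,l1,l2") is then inferred as the header.
by_skiprows = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=3, index_col=0)

# Variant 2: point directly at the label row with a scalar header index
# (header=3, not header=[3]); rows above it are ignored.
by_header = pd.read_csv(io.StringIO(CSV_TEXT), header=3, index_col=0)

print(by_skiprows)
print(by_header)
```

Both variants keep the column names from the file itself, which may be preferable when the header has many rows and only the last one is needed.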