Question

I have a CSV file formatted as follows:

somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature
0,0,0,1,1,2,4,5

I am trying to read it as a pandas Series (using a pandas daily snapshot for Python 2.7). I tried the following:

import pandas as pd
types = pd.Series.from_csv('csvfile.txt', index_col=False, header=0)

and

types = pd.read_csv('csvfile.txt', index_col=False, header=0, squeeze=True)

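But neither works: the first gives random results, and the second just imports a DataFrame without squeezing it.

It seems pandas only recognizes a CSV as a Series when it is formatted like this: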

f1, value
f2, value2
f3, value3

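But when the feature keys are in the first row rather than in the first column, pandas does not squeeze it.

Is there anything else I can try? Is this behavior intended?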

Answer 1

This is the way I found:

import pandas
df = pandas.read_csv('csvfile.txt', index_col=False, header=0)
serie = df.iloc[0, :]  # originally df.ix[0,:]; .ix was removed in pandas 1.0

This seems a bit silly to me, since squeeze should already do this. Is this a bug, or am I missing something?

/Edit: the best way:

df = pandas.read_csv('csvfile.txt', index_col=False, header=0)
serie = df.transpose()[0]  # here we convert the DataFrame into a Series

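This is the most stable way to get a row-oriented CSV line into a pandas Series.

BTW, the squeeze=True argument is useless for now, because as of today (April 2013) it only works with row-oriented CSV files; see the official documentation: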
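
As a minimal sketch of the row-to-Series idea above, assuming a current pandas release where `.ix` is no longer available: the sample data from the question is read from an in-memory string (the `csv_text` name and the use of `io.StringIO` are illustrative stand-ins for `csvfile.txt`), and the single data row is selected by position.

```python
import io

import pandas as pd

# Sample data from the question, held in memory instead of csvfile.txt
# (an assumption made so the sketch is self-contained).
csv_text = (
    "somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature\n"
    "0,0,0,1,1,2,4,5\n"
)

df = pd.read_csv(io.StringIO(csv_text), header=0)

# Selecting the single data row by position returns a Series
# whose index is made of the column names.
row = df.iloc[0]

# Transposing and then taking the column labelled 0 gives the same values.
row_via_transpose = df.transpose()[0]

print(type(row).__name__)
print(row["lastfeature"])
```

Both variants should give a Series keyed by feature name; `.iloc[0]` avoids building the transposed frame.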

http://pandas.pydata.org/pandas-docs/dev/io.html#returning-series

Answer 2

In [28]: df = pd.read_csv('csvfile.csv')

In [29]: df.iloc[0]  # originally df.ix[0]; .ix was removed in pandas 1.0
Out[29]: 
somefeature       0
anotherfeature    0
f3                0
f4                1
f5                1
f6                2
f7                4
lastfeature       5
Name: 0, dtype: int64

Answer 3

This works. Squeeze still works, but it won't work on its own. index_col needs to be set to zero, as follows:

series = pd.read_csv('csvfile.csv', header = None, index_col = 0, squeeze = True)
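
For readers on a recent pandas, here is a hedged sketch of the same idea. To my knowledge the `squeeze` keyword of `read_csv` was deprecated in pandas 1.4 and removed in 2.0, so the sketch calls `DataFrame.squeeze("columns")` after reading instead. The `csv_text` data is a hypothetical column-oriented file (one key and one value per line), not the file from the question.

```python
import io

import pandas as pd

# Hypothetical column-oriented file: one "key,value" pair per line, no header row.
csv_text = "f1,10\nf2,20\nf3,30\n"

# Use the first column as the index so that only one data column remains.
df = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)

# A DataFrame with a single column squeezes to a Series along the columns axis.
series = df.squeeze("columns")

print(series)
```

This only yields a Series because exactly one data column is left after `index_col=0`; with the row-oriented file from the question, squeezing along the columns would still leave a DataFrame.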

Answer 4

import pandas
ds = pandas.read_csv('csvfile.csv', index_col=False, header=0)
X = ds.iloc[:, :10]  # .ix is deprecated, so use .iloc

Answer 5

The pandas value selection logic is:

DataFrame -> Series=DataFrame[Column] -> Values=Series[Index]

So I suggest:

import pandas
df = pandas.read_csv("csvfile.csv")
s = df[df.columns[0]]
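
The selection chain described above can be sketched as follows, using the question's sample data from an in-memory string (the `csv_text`, `column`, and `value` names are illustrative assumptions):

```python
import io

import pandas as pd

# Sample data from the question, read from memory instead of csvfile.csv.
csv_text = (
    "somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature\n"
    "0,0,0,1,1,2,4,5\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# DataFrame[column label] -> Series holding that column
column = df[df.columns[0]]

# Series[index label] -> a single value (row label 0 here)
value = column[0]

print(column.name, len(column), value)
```

Note that with the question's file this produces a one-element Series for the first column, not the row of features, so it fits best when the data of interest is stored down a column.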

How to read a pandas Series from a CSV file