How do I read data from a CSV file if all the values are in the same column?

Asked: 2017-08-05 09:52:05

Tags: python list pandas csv

I have a CSV file in the following format:

"age","job","marital","education","default","balance","housing","loan"
58,"management","married","tertiary","no",2143,"yes","no"
44,"technician","single","secondary","no",29,"yes","no"

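However, instead of being separated by tabs (into different columns), the values all sit in the same first column. When I try to read the file with pandas, the output gives all the values in a single list instead of a list of lists.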

My code:

import pandas as pd

# Read the comma-separated file; the first row holds the column names
dataframe = pd.read_csv("marketing-data.csv", header=0, sep=",")
dataset = dataframe.values
print(dataset)

Output:

[[58 'management' 'married' ..., 2143 'yes' 'no']
 [44 'technician' 'single' ..., 29 'yes' 'no']]

What I need:

[[58, 'management', 'married', ..., 2143, 'yes', 'no'],
 [44, 'technician', 'single', ..., 29, 'yes', 'no']]

What am I missing?

2 answers:

Answer 0 (score: 2)

I think you are confused by the print() output, which separates array elements with spaces rather than commas.

Demo:

In [1]: df = pd.read_csv(filename)

Pandas representation:

In [2]: df
Out[2]:
   age         job  marital  education default  balance housing loan
0   58  management  married   tertiary      no     2143     yes   no
1   44  technician   single  secondary      no       29     yes   no

Numpy representation:

In [3]: df.values
Out[3]:
array([[58, 'management', 'married', 'tertiary', 'no', 2143, 'yes', 'no'],
       [44, 'technician', 'single', 'secondary', 'no', 29, 'yes', 'no']], dtype=object)

Numpy string representation (result of print(numpy_array)):

In [4]: print(df.values)
[[58 'management' 'married' 'tertiary' 'no' 2143 'yes' 'no']
 [44 'technician' 'single' 'secondary' 'no' 29 'yes' 'no']]

Conclusion: your CSV file has been parsed correctly.
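
If a plain Python list of lists is what is actually needed (rather than a NumPy array), one possible approach is to call tolist() on the array. The following is only a sketch: it assumes the three sample lines from the question, and uses io.StringIO as a stand-in for the real file so that it runs on its own.

```python
import io

import pandas as pd

# Sample data copied from the question; io.StringIO stands in for the real file.
csv_text = '''"age","job","marital","education","default","balance","housing","loan"
58,"management","married","tertiary","no",2143,"yes","no"
44,"technician","single","secondary","no",29,"yes","no"
'''

df = pd.read_csv(io.StringIO(csv_text))

# tolist() turns the NumPy array into nested Python lists,
# and printing a list shows commas between the elements.
rows = df.values.tolist()
print(rows)

# repr() of the array also shows commas; only str(), which print() uses, omits them.
print(repr(df.values))
```

The data is the same in every case; only the textual representation differs.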

Answer 1 (score: 1)

I don't really see a difference between what you want and what you get, but parsing the CSV file with the built-in csv module gives your desired result:

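import csv

# Python 3: open in text mode with newline='', as the csv module recommends
with open('file.csv', newline='') as csvfile:
    # Fields in this file are wrapped in double quotes, so quotechar is '"'
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    print(list(spamreader))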

[['age', 'job', 'marital', 'education', 'default', 'balance', 'housing', 'loan'],
 ['58', 'management', 'married', 'tertiary', 'no', '2143', 'yes', 'no'],
 ['44', 'technician', 'single', 'secondary', 'no', '29', 'yes', 'no']]
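
Note that the csv module returns every field as a string, so 58 and 2143 come back as '58' and '2143'. If integers are wanted, as in the question's desired output, the fields can be converted after parsing. Below is a minimal sketch of that idea; the convert helper is a hypothetical name introduced here for illustration, and io.StringIO again stands in for the real file.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the real file.
csv_text = '''"age","job","marital","education","default","balance","housing","loan"
58,"management","married","tertiary","no",2143,"yes","no"
44,"technician","single","secondary","no",29,"yes","no"
'''


def convert(value):
    """Hypothetical helper: return an int if the field parses as one, else the string."""
    try:
        return int(value)
    except ValueError:
        return value


reader = csv.reader(io.StringIO(csv_text), delimiter=',', quotechar='"')
header = next(reader)  # the first record holds the column names
rows = [[convert(field) for field in record] for record in reader]

print(header)
print(rows)
```

pandas performs this kind of type inference automatically, which is why the pandas result in the other answer already contains integers.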