Python: Imported csv not being split into proper columns

Time: 2017-12-19 22:01:11

Tags: python pandas csv delimiter

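I am importing a csv file into Python with pandas, but the DataFrame ends up with only one column. I copied the comma-separated data from The Player Standing Field table at this link (second one), pasted it into an Excel file, and saved it as csv (first as MS-DOS, then as regular csv and as UTF-8, following AllthingsGo42's suggestion). It still returns a single-column DataFrame.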

Examples of what I have tried:

dataset=pd.read_csv('MLB2016PlayerStats2.csv')

dataset=pd.read_csv('MLB2016PlayerStats2.csv', delimiter=',')

dataset=pd.read_csv('MLB2016PlayerStats2.csv',encoding='ISO-8859-9', 
delimiter=',')

Each of the lines of code above returned:

Rk,Name,Age,Tm,Lg,G,GS,CG,Inn,Ch,PO,A,E,DP,Fld%,Rtot,Rtot/yr,Rdrs,Rdrs/yr,RF/9,RF/G,Pos Summary
1,Fernando Abad\abadfe01,30,TOT,AL,57,0,0,46.2 ...
2,Jose Abreu\abreujo02,29,CHW,AL,152,152,150,1 ...
3,A.J. Achter\achteaj01,27,LAA,AL,27,0,0,37.2,...
4,Dustin Ackley\ackledu01,28,NYY,AL,23,16,10,1 ...
5,Cristhian Adames\adamecr01,24,COL,NL,69,43,3 ...

I also tried:

dataset=pd.read_csv('MLB2016PlayerStats2.csv',encoding='ISO-8859-9', 
delimiter=',',quoting=3)

Which returned:

"Rk                        Name  Age   Tm  Lg    G   GS   CG     Inn    Ch  
\
0  "1      Fernando Abad\abadfe01   30  TOT  AL   57    0    0    46.2     4
1  "2        Jose Abreu\abreujo02   29  CHW  AL  152  152  150  1355.2  1337
2  "3       A.J. Achter\achteaj01   27  LAA  AL   27    0    0    37.2     6
3  "4     Dustin Ackley\ackledu01   28  NYY  AL   23   16   10   140.1    97
4  "5  Cristhian Adames\adamecr01   24  COL  NL   69   43   38   415.0   212

   E   DP   Fld%  Rtot  Rtot/yr  Rdrs  Rdrs/yr  RF/9  RF/G  \
0      ...        0    1  1.000   NaN      NaN   NaN      NaN  0.77  0.07   
1      ...       10  131  0.993  -2.0     -2.0  -5.0     -4.0  8.81  8.73   
2      ...        0    0  1.000   NaN      NaN   0.0      0.0  1.43  0.22   
3      ...        0    8  1.000   1.0      9.0   3.0     27.0  6.22  4.22   
4      ...        6   24  0.972  -4.0    -12.0   1.0      3.0  4.47  2.99   

Pos Summary"  
0            P"  
1           1B"  
2            P"  
3     1B-OF-2B"  
4     SS-2B-3B"  

Here is the data as it appears in Notepad++:

"Rk,Name,Age,Tm,Lg,G,GS,CG,Inn,Ch,PO,A,E,DP,Fld%,Rtot,Rtot/yr,Rdrs,Rdrs/yr,RF/9,RF/G,Pos Summary"
"1,Fernando Abad\abadfe01,30,TOT,AL,57,0,0,46.2,4,0,4,0,1,1.000,,,,,0.77,0.07,P"
"2,Jose Abreu\abreujo02,29,CHW,AL,152,152,150,1355.2,1337,1243,84,10,131,.993,-2,-2,-5,-4,8.81,8.73,1B"
"3,A.J. Achter\achteaj01,27,LAA,AL,27,0,0,37.2,6,2,4,0,0,1.000,,,0,0,1.43,0.22,P"
"4,Dustin Ackley\ackledu01,28,NYY,AL,23,16,10,140.1,97,89,8,0,8,1.000,1,9,3,27,6.22,4.22,1B-OF-2B"
"5,Cristhian Adames\adamecr01,24,COL,NL,69,43,38,415.0,212,68,138,6,24,.972,-4,-12,1,3,4.47,2.99,SS-2B-3B"
"6,Austin Adams\adamsau01,29,CLE,AL,19,0,0,18.1,1,0,0,1,0,.000,,,0,0,0.00,0.00,P"

Sorry for the earlier confusion in my question. I hope this edit clears things up. Thanks to those who have answered so far.

2 Answers:

Answer 0: (score: 0)

Running it quickly myself, I was able to get what I understand to be the desired output.


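The only difference I can think of is that I did not specify a delimiter for the csv, since a csv is a comma-separated values file, but that should not matter. I think something is wrong with your actual data file, so I would make sure it was saved correctly. I would echo the earlier comment and make sure the csv is saved as UTF-8 rather than MS-DOS or Macintosh (all of which are options when saving in Excel).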
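One way to check whether the file itself is the problem: the Notepad++ view in the question shows every line wrapped in double quotes, which would make a CSV parser treat each line as a single quoted field. The following is a minimal sketch under that assumption. The sample is shortened to the first five columns and held in memory rather than read from MLB2016PlayerStats2.csv, and `strip_wrapping_quotes` is a hypothetical helper name, not a pandas function.

```python
import io

import pandas as pd

# Sample lines modeled on the question's Notepad++ view (shortened to five
# columns). Assumption: each line in the saved file is wrapped in quotes.
raw = (
    '"Rk,Name,Age,Tm,Lg"\n'
    '"1,Fernando Abad\\abadfe01,30,TOT,AL"\n'
    '"2,Jose Abreu\\abreujo02,29,CHW,AL"\n'
)


def strip_wrapping_quotes(text):
    """Hypothetical helper: drop the double quotes around each line."""
    return "\n".join(line.strip().strip('"') for line in text.splitlines())


# Default parsing: the wrapping quotes turn each line into one field.
one_col = pd.read_csv(io.StringIO(raw))
print(one_col.shape)

# After stripping the quotes, the same text parses as ordinary CSV.
fixed = pd.read_csv(io.StringIO(strip_wrapping_quotes(raw)))
print(fixed.columns.tolist())
```

For the real file, the same helper could be applied to the text returned by `open(...).read()` before handing it to `pd.read_csv`; re-saving the file without the wrapping quotes would achieve the same thing.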

Good luck!

Answer 1: (score: 0)

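There is no need to specify a delimiter in the csv call. You only need to change the separator in the file from ";" to ",". To do that, open the csv file in Notepad and use the Replace tool.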
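As an alternative to editing the file by hand, pandas can be told which separator to use through the `sep` argument of `read_csv`. The sketch below assumes a semicolon-separated file, which is hypothetical here (the Notepad++ view in the question shows commas). `guess_delimiter` is an illustrative helper that simply counts candidate characters in the header line; it is not part of pandas.

```python
import io

import pandas as pd


def guess_delimiter(header_line, candidates=(",", ";", "\t")):
    """Illustrative helper: pick the candidate seen most often in the header."""
    return max(candidates, key=header_line.count)


# Hypothetical semicolon-separated version of the first rows and columns.
sample = "Rk;Name;Age\n1;Fernando Abad;30\n2;Jose Abreu;29\n"

sep = guess_delimiter(sample.splitlines()[0])
df = pd.read_csv(io.StringIO(sample), sep=sep)
print(repr(sep), df.columns.tolist())
```

Counting characters in the header is a rough heuristic and can be fooled by separators that appear inside field values, so it is worth looking at the raw file before relying on it.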