How do I convert a text file (with unwanted double quotes) into a pandas DataFrame?

Time: 2015-10-27 01:41:46

Tags: python parsing pandas

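I need to import web-based data (shown below) into Python. I used urllib2.urlopen (data available here). However, the data is imported as lines of strings. How can I convert them into a pandas DataFrame while removing the double quotes (")? Thanks for your help.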

"country","country isocode","year","POP","XRAT","tcgdp","cc","cg"
"Argentina","ARG","2000","37335.653","0.9995","295072.21869","75.716805379","5.5788042896"
"Australia","AUS","2000","19053.186","1.72483","541804.6521","67.759025993","6.7200975332"
"India","IND","2000","1006300.297","44.9416","1728144.3748","64.575551328","14.072205773"
"Israel","ISR","2000","6114.57","4.07733","129253.89423","64.436450847","10.266688415"
"Malawi","MWI","2000","11801.505","59.543808333","5026.2217836","74.707624181","11.658954494"
"South Africa","ZAF","2000","45064.098","6.93983","227242.36949","72.718710427","5.7265463933"
"United States","USA","2000","282171.957","1","9898700","72.347054303","6.0324539789"
"Uruguay","URY","2000","3219.793","12.099591667","25255.961693","78.978740282","5.108067988"

1 answer:

Answer 0 (score: 1)

You can do it like this:

>>> import pandas as pd
>>> df=pd.read_csv('https://raw.githubusercontent.com/QuantEcon/QuantEcon.py/master/data/test_pwt.csv')
>>> df
         country country isocode  year          POP       XRAT  \
0      Argentina             ARG  2000    37335.653   0.999500   
1      Australia             AUS  2000    19053.186   1.724830   
2          India             IND  2000  1006300.297  44.941600   
3         Israel             ISR  2000     6114.570   4.077330   
4         Malawi             MWI  2000    11801.505  59.543808   
5   South Africa             ZAF  2000    45064.098   6.939830   
6  United States             USA  2000   282171.957   1.000000   
7        Uruguay             URY  2000     3219.793  12.099592   

            tcgdp         cc         cg  
0   295072.218690  75.716805   5.578804  
1   541804.652100  67.759026   6.720098  
2  1728144.374800  64.575551  14.072206  
3   129253.894230  64.436451  10.266688  
4     5026.221784  74.707624  11.658954  
5   227242.369490  72.718710   5.726546  
6  9898700.000000  72.347054   6.032454  
7    25255.961693  78.978740   5.108068
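
If the data has already been fetched and is sitting in memory as a list of quoted string lines (the situation described in the question), a similar approach should work without fetching it again. The sketch below is not part of the original answer. It assumes Python 3, where `urllib2` no longer exists and `urllib.request` takes its place, and it hard-codes the first few sample rows in a hypothetical `lines` variable instead of downloading them. `read_csv` treats `"` as the quote character by default, so the quotes are stripped and the numeric columns are inferred as numbers.

```python
import io

import pandas as pd

# Hypothetical stand-in for the lines already read from the web
# (header plus the first two rows of the sample data in the question).
lines = [
    '"country","country isocode","year","POP","XRAT","tcgdp","cc","cg"',
    '"Argentina","ARG","2000","37335.653","0.9995","295072.21869","75.716805379","5.5788042896"',
    '"Australia","AUS","2000","19053.186","1.72483","541804.6521","67.759025993","6.7200975332"',
]

# Join the lines into one text buffer and let read_csv parse it.
# The default quotechar is '"', so the surrounding quotes are dropped.
df = pd.read_csv(io.StringIO("\n".join(lines)))

# Inspect the inferred column types.
print(df.dtypes)
```

If the lines come back as bytes rather than text, they would need to be decoded (for example with `.decode("utf-8")`) before being joined.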