Cannot import the entire JSON file into Google Colab

Time: 2020-02-04 02:41:19

Tags: python pandas dataframe google-colaboratory

I am trying to import a JSON file from GitHub into Google Colab. The import works, but not all of the columns in the file are treated as data. Here is my code:

import pandas as pd
url = 'https://raw.githubusercontent.com/lequanngo/WorldHappiness/master/WorldHappiness.json'
df = pd.read_json(url, orient='columns')
df.head(10)

This is the result:

country  ladder  ladderSD  Positive_affect  Negative_affect  SocialSupport  Freedom
Finland  1       4         41               10               2              5
Denmark
Norway
etc

',country,ladder,ladder_sd,positive_affect,negative_affect,social_support,freedom,corruption,generosity,gdp_per_capita,healthy_life_expectancy,continent\n0,Finland,1,4,41,10,2,5,4,47,22,27,Europe\n1'

All 11 columns are displayed (country, ladder, ladder SD, positive_affect, negative_affect, etc.). But when I compute descriptive statistics using:

df.describe()
       ladder  ladderSD
count  156     156
mean   78.5    78.5
std
min
25%

Statistics are computed only for ladder and ladderSD. positive_affect, negative_affect, and all the other continuous data columns are left out.

Can anyone help me?

1 Answer:

Answer 0: (score: 0)

Is this the output you expect?

>>> url = 'https://raw.githubusercontent.com/lequanngo/WorldHappiness/master/WorldHappiness.json'
>>> df = pd.read_json(url, orient='records', dtype='dict')
>>> df.head()                                                                                                                                                   

  Country (region)  Ladder  SD of Ladder Positive affect Negative affect  ... Freedom Corruption Generosity Log of GDP\nper capita Healthy life\nexpectancy
0          Finland       1             4              41              10  ...       5          4         47                     22                       27
1          Denmark       2            13              24              26  ...       6          3         22                     14                       23
2           Norway       3             8              16              29  ...       3          8         11                      7                       12
3          Iceland       4             9               3               3  ...       7         45          3                     15                       13
4      Netherlands       5             1              12              25  ...      19         12          7                     12                       18

[5 rows x 11 columns]

>>> df.describe()                                                                                                                                               

           Ladder  SD of Ladder
count  156.000000    156.000000
mean    78.500000     78.500000
std     45.177428     45.177428
min      1.000000      1.000000
25%     39.750000     39.750000
50%     78.500000     78.500000
75%    117.250000    117.250000
max    156.000000    156.000000

>>> df.describe(include='all')                                                                                                                                  

       Country (region)      Ladder  SD of Ladder  Positive affect  Negative affect  ...  Freedom  Corruption Generosity  Log of GDP\nper capita Healthy life\nexpectancy
count               156  156.000000    156.000000            156.0            156.0  ...    156.0         156      156.0                     156                      156
unique              156         NaN           NaN            156.0            156.0  ...    156.0         149      156.0                     153                      151
top               Nepal         NaN           NaN            155.0            155.0  ...    155.0                  155.0                                                 
freq                  1         NaN           NaN              1.0              1.0  ...      1.0           8        1.0                       4                        6
mean                NaN   78.500000     78.500000              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN
std                 NaN   45.177428     45.177428              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN
min                 NaN    1.000000      1.000000              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN
25%                 NaN   39.750000     39.750000              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN
50%                 NaN   78.500000     78.500000              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN
75%                 NaN  117.250000    117.250000              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN
max                 NaN  156.000000    156.000000              NaN              NaN  ...      NaN         NaN        NaN                     NaN                      NaN

[11 rows x 11 columns]
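
In the include='all' output above, the remaining columns report unique/top/freq instead of mean/std, and Corruption has a blank top value with freq 8. That suggests those columns are not stored as numeric dtypes, probably because of blank entries, and describe() summarises only numeric columns by default. A minimal sketch of one possible fix follows. It assumes the blanks are empty strings and uses a small made-up DataFrame in place of the real file, so the values shown are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the real data: the empty string mimics the
# blank "top" value reported for the Corruption column above.
df = pd.DataFrame({
    "Country (region)": ["Finland", "Denmark", "Norway"],
    "Ladder": [1, 2, 3],
    "Freedom": [5, 6, 3],
    "Corruption": [4, 3, ""],  # the blank entry makes this column object dtype
})

# Inspect the dtypes first: non-numeric columns are skipped by describe().
print(df.dtypes)

# Convert every column except the country name to a numeric dtype.
# errors="coerce" turns blanks and other unparseable values into NaN.
numeric_cols = df.columns.drop("Country (region)")
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

# describe() now covers the converted columns as well.
summary = df.describe()
print(summary)
```

With the real file, the same conversion would be applied to the DataFrame returned by pd.read_json. Coerced blanks become NaN, so count for those columns will be lower than 156.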