How do I convert the table on this website into a DataFrame?

Time: 2017-01-27 16:38:20

Tags: python pandas web-scraping data-analysis

I have this link from Forbes: http://www.forbes.com/global2000/list/. I need to get the table of the Global 2000 companies into a DataFrame for analysis. How can I do that?

1 answer:

Answer 0: (score: 3)

You can use pd.read_json directly, because the underlying table is generated from a JSON response.

Hint: check the Network tab in your browser's developer tools and look at the URL of the XHR request.

In [38]: df = pd.read_json('http://www.forbes.com/ajax/list/data?year=2016&uri=global2000&type=organization')

In [40]: df.shape
Out[40]: (2001, 16)

In [41]: df.head(2)
Out[41]:
    assets            ceo         country    headquarters  imageUri  \
0  32718.0    Inge Thulin   United States       Minnesota        3m
1   7454.0  Simon Borrows  United Kingdom  United Kingdom  3i-group

              industry  marketValue      name  position  profits  rank  \
0        Conglomerates     102175.0        3M       200   4833.0   200
1  Investment Services       6685.0  3i Group      1562    925.0  1562

   revenue squareImage      state thumbnail       uri
0  30274.0         NaN  Minnesota       NaN        3m
1    485.0         NaN        NaN       NaN  3i-group
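
If the endpoint is unreachable, the same parsing step can be tried offline. The following is a minimal sketch, not part of the original answer: it assumes the response is a JSON array of records, and it uses two sample records copied from the output above, trimmed to a few columns. The names SAMPLE_JSON and load_companies are hypothetical and exist only for illustration.

```python
import io

import pandas as pd

# Two records copied from the output above, trimmed to a few columns.
# The real response carries 16 columns per company.
SAMPLE_JSON = """[
  {"name": "3i Group", "country": "United Kingdom", "rank": 1562,
   "revenue": 485.0, "profits": 925.0},
  {"name": "3M", "country": "United States", "rank": 200,
   "revenue": 30274.0, "profits": 4833.0}
]"""


def load_companies(json_text):
    """Parse a JSON array of records into a DataFrame ordered by rank."""
    # Wrap the string in StringIO so pandas reads it as a file-like object.
    df = pd.read_json(io.StringIO(json_text))
    return df.sort_values("rank").reset_index(drop=True)


df = load_companies(SAMPLE_JSON)
print(df[["rank", "name", "revenue"]])
```

Sorting by rank is only one possible first step; with the full response, the same DataFrame can be filtered or grouped by columns such as country or industry.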