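Parsing JSON data into Python

Time: 2013-12-09 22:51:58

Tags: python json web-scraping

I have a JSON file that contains some data I need. I want to write a Python program that reads the file and pulls that information out. Can anyone help?

Here is a sample of the JSON: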

var case_data = {
  "cases": {
    "1": {
      "amount": 1500.0, 
      "case_id": "1", 
      "case_name": "US v. Control Systems Specialist, Inc. and Darrold Richard Crites", 
      "country": "br", 
      "sector": "sector-defense"
    }, 
    "10": {
      "amount": 0.0, 
      "case_id": "10", 
      "case_name": "SEC v. Int'l Systems & Controls Corp.", 
      "country": "cl", 
      "sector": "sector-agriculture"
    }
  }, 
  "countries": {
    "ae": {
      "cases": [
        "191", 
        "192", 
        "193", 
        "282", 
        "332"
      ], 
      "sectors": {
        "sector-consulting": {
          "total": 1812113.33
        }, 
        "sector-energy": {
          "total": 6622147.0
        }, 
        "sector-infrastructure": {
          "total": 4694551.0
        }
      }, 
      "total": 13128811.33, 
      "tree": {
        "children": [
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 4550000.0, 
              "$area": 3
            }, 
            "id": "191", 
            "name": "Control Components Inc. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-infrastructure", 
            "data": {
              "$amount": 140551.0, 
              "$area": 2
            }, 
            "id": "192", 
            "name": "Textron Inc."
          }, 
          {
            "children": [], 
            "classname": "sector-infrastructure", 
            "data": {
              "$amount": 4554000.0, 
              "$area": 3
            }, 
            "id": "193", 
            "name": "York International Corp."
          }, 
          {
            "children": [], 
            "classname": "sector-consulting", 
            "data": {
              "$amount": 1812113.33, 
              "$area": 3
            }, 
            "id": "282", 
            "name": "Aon Corporation"
          }, 
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 2072147.0, 
              "$area": 3
            }, 
            "id": "332", 
            "name": "Tyco Int\u2019l Ltd. et al."
          }
        ], 
        "data": {
          "$amount": 0, 
          "$area": 14
        }, 
        "id": "ae", 
        "name": "UAE"
      }
    }, 
    "ao": {
      "cases": [
        "5", 
        "9", 
        "207", 
        "208", 
        "209"
      ], 
      "sectors": {
        "sector-consulting": {
          "total": 12350000.0
        }, 
        "sector-energy": {
          "total": 18097043.0
        }, 
        "sector-telecom": {
          "total": 7080000.0
        }
      }, 
      "total": 37527043.0, 
      "tree": {
        "children": [
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 302043.0, 
              "$area": 2
            }, 
            "id": "5", 
            "name": "ABB Ltd. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 16335000.0, 
              "$area": 6
            }, 
            "id": "9", 
            "name": "Baker Hughes Inc. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-energy", 
            "data": {
              "$amount": 1460000.0, 
              "$area": 3
            }, 
            "id": "207", 
            "name": "GlobalSanteFe Corp."
          }, 
          {
            "children": [], 
            "classname": "sector-telecom", 
            "data": {
              "$amount": 7080000.0, 
              "$area": 3
            }, 
            "id": "208", 
            "name": "Alcatel-Lucent S.A. et al."
          }, 
          {
            "children": [], 
            "classname": "sector-consulting", 
            "data": {
              "$amount": 12350000.0, 
              "$area": 6
            }, 
            "id": "209", 
            "name": "Panalpina World Transport (Holding) Ltd. et al."
          }
        ], 
        "data": {
          "$amount": 0, 
          "$area": 20
        }, 
        "id": "ao", 
        "name": "Angola"
      }
    }, 
  };

I want to extract the number that follows "sector-energy" for each country. Note that this sample file contains two countries, 'ae' and 'ao'.

2 answers:

Answer 0 (score: 3)

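If you actually have JSON, all you have to do is use the json module to decode it into a native dict, just as you would use the JSON object in JavaScript to decode it into a native object. Compare this JS:

var case_data = JSON.parse(data);

...with the equivalent Python:

case_data = json.loads(data)

Once you've done that, there is no more JSON to worry about; it's just plain native objects, which you can access like any other combination of dicts, lists, strings and numbers. For example:

sector_energy = [country["sectors"]["sector-energy"] for country in case_data["countries"].values()]

However, what you've shown us isn't JSON; it's JavaScript source code that assigns a complex object to a variable. You can't parse that as JSON in any language, because it isn't JSON.

Of course, the part of the source between the "=" and the ";" happens to be not only valid JavaScript but also valid JSON. For that matter, it is also valid Python and valid Ruby. But if you want to parse this file and others like it, you need to come up with a rule for which fragment of the source represents the JSON you want to parse. Is each file just a single JS variable assignment? Or something else?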
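If every file really is a single assignment of the form "var NAME = <object>;", one possible rule is to keep whatever lies between the first "=" and the trailing ";" and hand that to json.loads. The sketch below makes that assumption explicit. The helper name extract_json and the inline sample are hypothetical, and the sample is a small well-formed stand-in because the excerpt in the question is cut off before its closing brace. The "br" entry is invented to show a country without an energy sector.

```python
import json

# Hypothetical, well-formed stand-in for the file contents (not the asker's real file).
js_source = '''var case_data = {
  "countries": {
    "ae": {"sectors": {"sector-energy": {"total": 6622147.0}}},
    "ao": {"sectors": {"sector-energy": {"total": 18097043.0}}},
    "br": {"sectors": {"sector-defense": {"total": 1500.0}}}
  }
};'''


def extract_json(source):
    """Return the text after the first '=' with any trailing ';' removed.

    Assumes the whole file is one 'var NAME = <JSON>;' statement.
    """
    _, sep, rest = source.partition("=")
    if not sep:
        raise ValueError("no assignment found in source")
    return rest.strip().rstrip(";")


case_data = json.loads(extract_json(js_source))

# Map each country code to its sector-energy total, skipping countries without one.
sector_energy = {
    code: country["sectors"]["sector-energy"]["total"]
    for code, country in case_data["countries"].items()
    if "sector-energy" in country["sectors"]
}
print(sector_energy)
```

This only works while the object literal is strict JSON. A trailing comma, a comment, or an unquoted key is legal JavaScript but will make json.loads raise an error, which is one more reason the rule has to be decided per source.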

In any case, it is almost always much better to use real JSON for interchange between languages than to use something JSON-like and hope that it works.
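As a rough illustration of that difference, the sketch below round-trips a record through real JSON and then shows the strict parser rejecting the same payload once it is wrapped in a JavaScript assignment. The record is hypothetical; its field names simply mirror the question's sample.

```python
import json

# Hypothetical record; field names mirror the sample in the question.
record = {"case_id": "1", "amount": 1500.0, "country": "br"}

# Real JSON: what a producer writes is exactly what any consumer can read back.
payload = json.dumps(record, sort_keys=True)
restored = json.loads(payload)

# JSON-like: the same payload wrapped in a JS assignment is rejected outright.
try:
    json.loads("var case_data = " + payload + ";")
    rejected = False
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    rejected = True

print(restored == record, rejected)
```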

Answer 1 (score: 0)

Python already has a library for this:

http://docs.python.org/3.3/library/json.html
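A minimal sketch of using that library on a file, assuming the file holds plain JSON rather than a JavaScript assignment. The temporary file and its contents are made up here only so the example is self-contained; with a real file you would just open its path.

```python
import json
import os
import tempfile

# Hypothetical contents, written to a temporary file so the example runs on its own.
sample = {"cases": {"1": {"amount": 1500.0, "country": "br"}}}
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as handle:
    json.dump(sample, handle)
    path = handle.name

# json.load reads from an open file object; json.loads reads from a string.
with open(path, encoding="utf-8") as handle:
    case_data = json.load(handle)

os.remove(path)  # clean up the temporary file
print(case_data["cases"]["1"]["amount"])
```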