Question

I'm at my wits' end and can't solve this problem.

The numbers and names are made up, but the idea is this:

I read a link like 'https://graph.facebook.com/123'

which returns this source:

{
   "id": "123",
   "name": "John Doe",
   "first_name": "John",
   "last_name": "Doe",
   "link": "http://www.facebook.com/people/John-Doe/123",
   "gender": "male",
   "locale": "en_US"
}

I want to extract all of the information: id, name, and so on.

I tried this, but it failed:

    link = 'https://graph.facebook.com/123'
    result = browser.open(link)
    text = result.read()
    result.close()
    id = re.search('"id": "(.*?)",', cont)

The regex '"id": "(.*?)",' seems to be correct, but it returns nothing. Why?

Answer 1

This looks like JSON, and you don't want to parse that with a regular expression. Use the json module instead:

    import json

    link = 'https://graph.facebook.com/123'
    result = browser.open(link)   # browser: the same object used in the question
    data = json.load(result)      # parse the response body into a dict
    print(data['id'])
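To get every field rather than just the id, something like the following sketch might help. It works on the sample payload from the question as a string, so it runs without a network connection or the `browser` object. The helper name `parse_profile` is made up for illustration. The last part shows one reason the regex approach is fragile: the pattern only matches while the spacing in the response is exactly as written, and `re.search` returns a match object (or `None`), so you still need `.group(1)` to get the value. Note also that the code in the question reads the body into `text` but then searches `cont`, which may be part of the problem.

```python
import json
import re

# Sample payload copied from the question (not fetched from the network).
SAMPLE = '''{
   "id": "123",
   "name": "John Doe",
   "first_name": "John",
   "last_name": "Doe",
   "link": "http://www.facebook.com/people/John-Doe/123",
   "gender": "male",
   "locale": "en_US"
}'''


def parse_profile(text):
    """Hypothetical helper: decode the JSON text into a dict."""
    return json.loads(text)


profile = parse_profile(SAMPLE)

# Every field is available by key; no pattern matching needed.
for key, value in sorted(profile.items()):
    print("%s = %s" % (key, value))

# The regex route depends on the exact spacing of the response.
pattern = r'"id": "(.*?)",'

# Same data, serialized without spaces after ':' and ','.
compact = json.dumps(profile, separators=(",", ":"))

match_pretty = re.search(pattern, SAMPLE)    # spacing matches the pattern
match_compact = re.search(pattern, compact)  # spacing differs, so no match

if match_pretty is not None:
    print("regex on pretty text: %s" % match_pretty.group(1))
print("regex on compact text matched: %s" % (match_compact is not None))
```

With the parsed dict, `profile['name']`, `profile['gender']` and the rest can be read the same way as `profile['id']`, regardless of how the server formats the whitespace.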
