Question

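I am parsing patient metadata scraped from a URL and trying to access the 'PatientID' field. However, there is also an 'OtherPatientIDs' field, which my search picks up as well.

I have tried using regular expressions, but I am unclear on how to match an EXACT string or how to incorporate that into my code.

Here is what I have so far: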

import requests
from bs4 import BeautifulSoup

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

PatientID = "PatientID"

lines = soup.decode('utf8').split("\n")
for line in lines:
    # Substring test: also true for lines containing "OtherPatientIDs"
    if "PatientID" in line:
        PatientID = line.split(':')[1].split('\"')[1].split('\"')[0]
        print(PatientID)

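This finds the values of both the PatientID and OtherPatientIDs fields. How do I specify that I only want the PatientID field?

EDIT: I was asked to give an example of what response.text returns. It has the form: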

{
    "ID" : "shqowihdojcoughwoeh",
    "LastUpdate" : "20190507",
    "MainTags" : {
        "OtherPatientIDs" : "0304992098",
        "PatientBirthDate" : "29/04/1803",
        "PatientID" : "92879837",
        "PatientName" : "LASTNAME^FIRSTNAME"
    },
    "Type" : "Patient"
}

Answer 1

Why not use the json library?

import json
import requests

response = requests.get(url)
data = json.loads(response.text)

print(data['MainTags']['PatientID'])
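
For reference, here is a minimal self-contained sketch of the same idea that runs without a network call. It assumes the response body is valid JSON shaped like the example in the question; SAMPLE, patient_id_from_json and patient_id_from_regex are hypothetical names used only for illustration. It also shows one way to do the exact match with a regular expression, if parsing the JSON is not an option: anchoring on the opening quote of the key keeps "OtherPatientIDs" from matching.

```python
import json
import re

# Hypothetical sample payload, modelled on the example in the question.
SAMPLE = """
{
    "ID" : "shqowihdojcoughwoeh",
    "LastUpdate" : "20190507",
    "MainTags" : {
        "OtherPatientIDs" : "0304992098",
        "PatientBirthDate" : "29/04/1803",
        "PatientID" : "92879837",
        "PatientName" : "LASTNAME^FIRSTNAME"
    },
    "Type" : "Patient"
}
"""


def patient_id_from_json(text):
    """Parse the payload and look the key up by exact name.

    Returns None if "MainTags" or "PatientID" is missing.
    """
    data = json.loads(text)
    return data.get("MainTags", {}).get("PatientID")


# The key must be the whole quoted string "PatientID", so the substring
# inside "OtherPatientIDs" does not match.
PATIENT_ID_RE = re.compile(r'"PatientID"\s*:\s*"([^"]*)"')


def patient_id_from_regex(text):
    """Regex fallback: return the quoted value after the exact key, or None."""
    match = PATIENT_ID_RE.search(text)
    return match.group(1) if match else None


print(patient_id_from_json(SAMPLE))
print(patient_id_from_regex(SAMPLE))
```

The dictionary lookup is usually the safer choice, since it does not depend on how the server formats whitespace or line breaks.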
