Querying JSON data from the CMS NPI registry

Time: 2018-07-05 01:27:43

Tags: python-2.7 api

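I've been banging my head against this for a few days and could use a wake-up call! CMS (the Centers for Medicare & Medicaid Services) provides an API for looking up provider information by an individual's NPI (National Provider Identifier).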

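There is a lot of information available there, including large monthly download files, but I don't need any of that. I only need to query single NPIs that have already been prescreened (low volume) and return a few values from the retrieved record.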

Here is a sample query for a randomly selected NPI: https://npiregistry.cms.hhs.gov/api/resultsDemo2/?number=1881761864&pretty=on
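For reference, a minimal Python 3 sketch of how that query string could be assembled for an arbitrary NPI. The helper name `build_npi_url` is hypothetical; the endpoint and the `number` and `pretty` parameters are copied from the sample query, and nothing here contacts the server.

```python
from urllib.parse import urlencode

# Endpoint copied from the sample query; not verified against any other API version.
BASE_URL = "https://npiregistry.cms.hhs.gov/api/resultsDemo2/"


def build_npi_url(npi, pretty=True):
    """Return the lookup URL for a single NPI (hypothetical helper)."""
    params = {"number": npi}
    if pretty:
        params["pretty"] = "on"
    # urlencode escapes the values and joins the pairs with '&'
    return BASE_URL + "?" + urlencode(params)


print(build_npi_url("1881761864"))
```

Building the URL this way keeps the NPI properly escaped if it ever comes from user input.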

If you run this in a browser window, you will see the resulting JSON data wrapped in some header/footer HTML.

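I can dump the entire query result and print it in several different ways, but I have not been able to pick out and print specific data elements such as name, address, or phone number. If you run the query in a browser you can see the raw output; the snippet below prints the cleaned-up result. Thoughts?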

import urllib
from bs4 import BeautifulSoup
import json

def main():

    url = "https://npiregistry.cms.hhs.gov/api/resultsDemo2/?number=1881761864&pretty=on"
    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html,"lxml")

    for script in soup(["script", "style"]):
        script.extract()

    practitioner_rec = soup.get_text()

    # strip out the html to retain the JSON record
    lines = (line.strip() for line in practitioner_rec.splitlines())
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    practitioner_rec = '\n'.join(chunk for chunk in chunks if chunk)

    # get a count of lines generated by the query, valid queries are greater than 3 lines long
    linect = practitioner_rec.count('\n') +1

    if linect == 3:
        VALID_NPI="FALSE"
        VALID_MD="FALSE"
    else: # approx. 69 lines of output here
        #   possible issue with JSON formatting here
        #   In particular, the line
        #   "result_count":1, "results":[
        #   since result count will always be 1, discard it
        practitioner_rec = practitioner_rec.replace('"result_count":1, ', '')
        print(practitioner_rec)

        practitioner_data = json.loads(practitioner_rec)
        VALID_NPI="TRUE"
        VALID_MD="TRUE"

        '''
        none of these constructs works to print the provider name
        print ['result_count']['results']['basic']['name'],"name"
        print result_count['results']['basic']['name'],"name"
        print practitioner_data['results']['basic']['name'],"name"
        print results['basic']['name'],"name"
        print ['basic']['name'],"name"
        print basic['name'],"name"
        print results[2]['basic']['name'],"name"
        print results['basic']['name'],"name"

        this works, but not useful if I can't pick values out
        print(json.dumps(practitioner_data))

        print "VALID_NPI is ",VALID_NPI
        print "VALID_MD is  ",VALID_MD
        return [VALID_NPI,VALID_MD]
        '''


if __name__ == '__main__':
    main()

3 Answers:

Answer 0: (score: 0)

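The headbanging is over. I was a JSON newbie, and now I'm at least at the introductory level. Here is a short snippet in case anyone else wants to query the CMS NPI JSON data and pull out results. No commercial API is needed. The one I looked at, Bloom, doesn't seem to be active, and the others require registration and track you just to access public data.

Here is the code for accessing individual fields: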

import urllib
from bs4 import BeautifulSoup
import json

def main():

    '''
    NOTES:
    1.  the pretty switch works when set to either true OR on
    2.  a failed NPI search produces 3-line output like this --
        {
        "result_count":0, "results":[
        ]}
    '''

    # valid NPI
    url = "https://npiregistry.cms.hhs.gov/api/resultsDemo2/?number=1881761864&pretty=on"
    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html,"lxml")
    # remove HTML from output, producing just a JSON record
    practitioner_rec = soup.text

    # count lines generated by the query; valid queries are more than 3 lines long
    linect = practitioner_rec.count('\n') +1
    #print "there are ", linect," lines in the input file" # only for testing

    if linect == 3:
        VALID_NPI="FALSE"
        VALID_MD="FALSE"
    else:
        '''
        query produces a single result, with approx. 60+ lines of output
        JSON data a little squirrelly, so we have to drop the result_count field first
        '''
        practitioner_rec = practitioner_rec.replace('"result_count":1, ', '')
        # print(practitioner_rec) # only for testing

        provider_dict = json.loads(practitioner_rec)
        # results is a list, so index its first element before reading 'basic'
        provider_info = provider_dict['results'][0]['basic']
        print "name:", str(provider_info['name']) # str() strips the unicode tag

        VALID_NPI="TRUE"
        VALID_MD="TRUE"

    print "VALID_NPI is ",VALID_NPI
    print "VALID_MD is  ",VALID_MD
    return [VALID_NPI,VALID_MD]

if __name__ == '__main__':
    main()
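A variation on the snippet above, as a Python 3 sketch rather than a tested replacement: the three-line "not found" output shown in the notes is already valid JSON, so `json.loads` may be able to parse the payload with `result_count` left in place, and that field can then drive the valid/invalid check instead of counting lines. The sample strings below are made-up placeholders shaped like the output described above, not real registry data, and `provider_name` is a hypothetical helper.

```python
import json

# Made-up sample payloads for illustration only; values are placeholders.
SAMPLE_FOUND = '''{
"result_count":1, "results":[
{"number": 1234567890, "basic": {"name": "DOE JANE", "last_name": "DOE"}}
]}'''

SAMPLE_NOT_FOUND = '''{
"result_count":0, "results":[
]}'''


def provider_name(raw_json):
    """Return the provider's name, or None when the lookup found nothing."""
    data = json.loads(raw_json)
    if data.get("result_count", 0) == 0:
        return None
    # 'results' is a list; the single match is its first element
    return data["results"][0]["basic"]["name"]


print(provider_name(SAMPLE_FOUND))
print(provider_name(SAMPLE_NOT_FOUND))
```

Checking `result_count` avoids depending on how many lines the pretty-printed output happens to contain.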

Answer 1: (score: 0)

There are several ways to do this described on Red Gate's Simple Talk site:

https://www.red-gate.com/simple-talk/blogs/consuming-hierarchical-json-documents-sql-server-using-openjson/

Answer 2: (score: 0)

The following code works for me in Python 3.8:

import requests
import json

test_npi = ['1003849050', '1114119187', None, '1316935836', '1649595216', '666', '555']
vval = 0  # running count of valid NPIs
nval = 0  # running count of invalid NPIs
invalidnpi = []
validnpi = []
for n in test_npi:
    r = requests.get(f'https://npiregistry.cms.hhs.gov/api/resultsDemo2/?version=2.1&number={n}&pretty=on')
    results_text = json.loads(r.text)
    try:
        print(results_text['result_count'])  # This will always be 1 if the NPI is valid
        print(f'Created epoch: {results_text["results"][0]["created_epoch"]}')
        print(f'Name         : {results_text["results"][0]["basic"]["name"]}')
        print(f'Last name    : {results_text["results"][0]["basic"]["last_name"]}')
        print(f'Phone number : {results_text["results"][0]["addresses"][1]["telephone_number"]}')  # From Primary Practice Address
        vval = vval + results_text['result_count']
        validnpi.append(n)
    except (KeyError, IndexError):
        # a missing key or an empty results list means the lookup failed
        nval = nval + 1
        invalidnpi.append(n)
        print(f'{n} is invalid NPI')
print(f'Number of invalid NPIs: {nval}\nNumber of valid NPIs: {vval}')
print(f'List of invalid NPIs: {invalidnpi}')
print(f'List of valid NPIs: {validnpi}')
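The phone lookup above relies on the practice address sitting at index 1 of `addresses`. A possible alternative, sketched here under the assumption that each address entry carries an `address_purpose` field with values such as `LOCATION` and `MAILING`, is to select the address by purpose instead of by position. The helper `practice_phone` and the sample record are hypothetical and use made-up values.

```python
def practice_phone(record, purpose="LOCATION"):
    """Return the phone number of the first address with the given purpose, or None."""
    for address in record.get("addresses", []):
        # 'address_purpose' is assumed here; check the actual response before relying on it
        if address.get("address_purpose") == purpose:
            return address.get("telephone_number")
    return None


# Made-up record for illustration only; not real registry data.
sample_record = {
    "addresses": [
        {"address_purpose": "MAILING", "telephone_number": "555-000-0001"},
        {"address_purpose": "LOCATION", "telephone_number": "555-000-0002"},
    ]
}

print(practice_phone(sample_record))
print(practice_phone({"addresses": []}))
```

In the loop above this would be called as `practice_phone(results_text["results"][0])`, and a record with only one address would then return `None` rather than raising `IndexError`.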
