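This is my first time using a web scraper. I'm using Beautiful Soup to parse a JSON file and return a few attributes, which I then write to a CSV.
The status variable in the JSON array is a binary value (0/1). I only want to return the entries whose status is 0. Is that possible?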
"""soup = BeautifulSoup(html)
table = soup.find()
print soup.prettify()"""
import json

# html is assumed to hold the raw JSON text of the response
js_data = json.loads(html)
Attraction = []
event = []
status = []
for doc in js_data["response"]["docs"]:
    Attraction.append(doc["Attraction"])
    event.append(doc["PostProcessedData"]["Onsales"]["event"]["date"])
    status.append(doc["PostProcessedData"]["Onsales"]["status"])

with open("out.csv", "w") as f:
    datas = zip(Attraction, event, status)
    keys = ["Attraction", "event", "status"]
    # Header uses the same "," delimiter as the rows and ends with a newline
    f.write(",".join(keys) + "\n")
    for data in datas:
        # Commas inside a value become ";" so they do not split the field
        f.write(",".join([str(k).replace(",", ";").replace("<br>", " ") for k in data]))
        f.write("\n")
Answer 0 (score: 0)
I may be missing something, but maybe this helps:
for doc in js_data["response"]["docs"]:
    # Keep only docs whose status is "0"; if the JSON stores status as a
    # number rather than a string, compare against 0 instead
    if doc["PostProcessedData"]["Onsales"]["status"] == "0":
        Attraction.append(doc["Attraction"])
        event.append(doc["PostProcessedData"]["Onsales"]["event"]["date"])
        status.append(doc["PostProcessedData"]["Onsales"]["status"])
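
The filter above can also be combined with the standard csv module, which handles quoting so commas inside a value do not need to be replaced by hand. The following is only a rough sketch, assuming Python 3: the SAMPLE payload and the helper names rows_with_status_zero and write_csv are made up for illustration, and the str() conversion is there because it is not clear from the question whether status arrives as 0 or "0".

```python
import csv
import io
import json

# Hypothetical sample payload mirroring the structure used in the question.
SAMPLE = """
{"response": {"docs": [
  {"Attraction": "Band A",
   "PostProcessedData": {"Onsales": {"status": 0, "event": {"date": "2016-01-01"}}}},
  {"Attraction": "Band B",
   "PostProcessedData": {"Onsales": {"status": "1", "event": {"date": "2016-02-01"}}}},
  {"Attraction": "Band, C",
   "PostProcessedData": {"Onsales": {"status": "0", "event": {"date": "2016-03-01"}}}}
]}}
"""


def rows_with_status_zero(js_data):
    """Yield (Attraction, event date, status) for docs whose status is 0."""
    for doc in js_data["response"]["docs"]:
        onsales = doc["PostProcessedData"]["Onsales"]
        # str() lets the check pass whether status is the number 0 or the string "0".
        if str(onsales["status"]) == "0":
            yield doc["Attraction"], onsales["event"]["date"], onsales["status"]


def write_csv(rows, f):
    """Write a header line followed by the given rows to the file object f."""
    writer = csv.writer(f, lineterminator="\n")
    writer.writerow(["Attraction", "event", "status"])
    writer.writerows(rows)


rows = list(rows_with_status_zero(json.loads(SAMPLE)))
buf = io.StringIO()  # stands in for open("out.csv", "w", newline="")
write_csv(rows, buf)
print(buf.getvalue())
```

To write to disk instead, pass a file opened with open("out.csv", "w", newline="") to write_csv. Filtering while iterating also avoids keeping three parallel lists in sync.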