I need to build several datasets from JSON files fetched from a list of URLs.
I managed to import one of them in the format I need:
url = "https://cws01.worldstores.co.uk/api/product.php?product_sku=125T:FT0111"
data = urllib2.urlopen(url).read()
data = json.loads(data)
data = pd.DataFrame(data.items())
data = data.transpose()
data.columns = data.iloc[0]
data = data.drop(data.index[[0]])
Since I have a long list of URLs, what I need is a for loop that repeats this code for each of them. My attempt was:
for i in urls:
    data = urllib2.urlopen(str(i)).read()
    data = json.loads(data)
    data = pd.DataFrame(data.items())
    data = test.transpose()
    data.columns = data.iloc[0]
    data = data.drop(data.index[[0]])
    df.append(data)
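where urls is a list of strings containing the addresses, e.g.
"https://cws01.worldstores.co.uk/api/product.php?product_sku=125T:FT0111"
and df is an empty DataFrame with the same columns as the DataFrame produced for each URL inside the for loop.
When I run it, I keep getting the following error: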
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
This error does not occur when I run the first snippet on a single URL. What am I doing wrong?
Edit:
My new attempt changes the for loop as follows:
for i in urls:
    data = urllib2.urlopen(str(i)).read()
    try:
        data = json.loads(data)
    except:
        print(data)
        print(i)
        exit(-1)
    data = pd.DataFrame(data.items())
    data = data.transpose()
    data.columns = data.iloc[0]
    data = data.drop(data.index[[0]])
    df.append(data)
Now I get this error:
data = pd.DataFrame(data.items())
AttributeError: 'str' object has no attribute 'items'
Answer 0 (score: 1)
Alternatively, you can use pandas' native read_json function:
import urllib2
import pandas as pd

url_base = "https://cws01.worldstores.co.uk/api/product.php?product_sku={}"
products = ["125T:FT0111", "125T:FT0111", "125T:FT0111"]

raw_data_list = []
for sku in products:
    url = url_base.format(sku)
    try:
        raw_data = urllib2.urlopen(url).read()
        if raw_data != "":
            raw_data_list.append(raw_data)
    except:
        pass

data = "[" + (",".join(raw_data_list)) + "]"
data = pd.read_json(data, orient='records')
data
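One caveat with joining the raw bodies by hand: a single response that is not a JSON object (an empty body, or a bare quoted message) will either break the combined array or add a row that is not a record. Below is a minimal sketch of the same idea with each body validated first. It is written for Python 3, and `combine_bodies` is a hypothetical helper name, not part of pandas or of the API being queried.

```python
import json


def combine_bodies(bodies):
    """Build one JSON array string from the bodies that decode to JSON objects."""
    records = []
    for body in bodies:
        try:
            decoded = json.loads(body)
        except ValueError:
            # Empty or malformed response: leave it out of the array.
            continue
        # json.loads may also return a str, list or number; keep only objects.
        if isinstance(decoded, dict):
            records.append(decoded)
    return json.dumps(records)
```

The returned string can take the place of the hand-built `"[" + ... + "]"` value passed to `pd.read_json` above.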
Answer 1 (score: 0)
Because you are missing the json.loads() line in your for loop:
url = "https://cws01.worldstores.co.uk/api/product.php?product_sku=125T:FT0111"
data = urllib2.urlopen(url).read()
data = json.loads(data)
data = pd.DataFrame(data.items())
data = data.transpose()
data.columns = data.iloc[0]
data = data.drop(data.index[[0]])
for i in urls:
    data = urllib2.urlopen(str(i)).read()
    data = json.loads(data) # <- ADDED
    data = pd.DataFrame(data.items())
    data = data.transpose()  # was test.transpose(); test is not defined
    data.columns = data.iloc[0]
    data = data.drop(data.index[[0]])
    df = df.append(data)  # append returns a new DataFrame, so reassign it
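The AttributeError from the edited question suggests that, for at least one URL, json.loads succeeded but returned a str rather than a dict. That happens when the response body is a quoted JSON string instead of an object. Here is a rough sketch of a loop that tolerates this, assuming Python 3 (where `urllib.request.urlopen` replaces `urllib2.urlopen`). The names `fetch` and `collect_records` are made up for illustration.

```python
import json
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def fetch(url):
    """Download the response body for one URL as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def collect_records(urls, fetch=fetch):
    """Return (records, skipped): decoded dicts, and the URLs that gave none."""
    records, skipped = [], []
    for url in urls:
        try:
            decoded = json.loads(fetch(url))
        except (OSError, ValueError):
            # Network failure, empty body, or invalid JSON.
            skipped.append(url)
            continue
        if isinstance(decoded, dict):
            records.append(decoded)
        else:
            # Valid JSON, but not an object (e.g. a bare string message).
            skipped.append(url)
    return records, skipped
```

Calling `records, skipped = collect_records(urls)` and then `pd.DataFrame(records)` should give one row per product with the JSON keys as columns, without the transpose and drop steps. The `skipped` list shows which URLs need a closer look.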