I am trying to scrape data from this filter. The page uses a lot of JavaScript, and I am struggling to get the href and the odds.
My current output is as follows:
I would like my output to look similar to ... (green): website
import json

import scrapy


class BlueBet(scrapy.Spider):
    name = "BlueBet"
    start_urls = ['https://www.bluebet.com.au/api/sports/SportsMasterCategory?withLevelledMarkets=true&id=100']

    custom_settings = {
        'FEED_FORMAT': 'csv',
        'FEED_URI': 'odds.csv',
        'FEED_EXPORT_ENCODING': 'utf-8',
    }

    def parse(self, response):
        data = json.loads(response.body)
        for master_category in data['MasterCategories']:
            for category in master_category['Categories']:
                for event in category['MasterEvents']:
                    item = {}
                    item['Event_name'] = event.get('MasterEventName')
                    item['Outcomes'] = {}
                    try:
                        for market in event['Markets']:
                            item['Outcomes'][market.get('OutcomeName')] = market.get('Price')
                    except TypeError:
                        continue
                    yield item
Answer 0 (score: 0)
You get data like this:
{
    'Event_name': 'Melbourne Victory v Adelaide United',
    'Outcomes': {
        'Melbourne Victory': 2.05,
        'Draw': 3.5,
        'Adelaide United': 3.4
    }
}
and you want to split Outcomes into separate columns. But the columns need names in the scrapy Item, so I will use the names key1, val1, key2, val2, key3, val3.
data = {'Event_name': 'Melbourne Victory v Adelaide United',
        'Outcomes': {'Melbourne Victory': 2.05, 'Draw': 3.5, 'Adelaide United': 3.4}}

# ---

item = {'Event_name': data['Event_name']}

# number the outcomes starting from 1 so the columns become key1/val1, key2/val2, ...
for number, (key, val) in enumerate(data['Outcomes'].items(), 1):
    number = str(number)
    print(number, key, val)
    item["key" + number] = key
    item["val" + number] = val

print(item)
which gives the item
{
    'Event_name': 'Melbourne Victory v Adelaide United',
    'key1': 'Melbourne Victory',
    'val1': 2.05,
    'key2': 'Draw',
    'val2': 3.5,
    'key3': 'Adelaide United',
    'val3': 3.4
}
This should give you the data in separate columns in Excel.
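A minimal sketch of how the same flattening could be folded back into the spider's parse method, so the CSV feed writes one pair of columns per outcome. The spider name BlueBetFlat is just an illustrative rename; the JSON structure is assumed to be the one the original spider already walks:

import json

import scrapy


class BlueBetFlat(scrapy.Spider):
    # illustrative variant of the original spider that yields flat items
    name = "BlueBetFlat"
    start_urls = ['https://www.bluebet.com.au/api/sports/SportsMasterCategory?withLevelledMarkets=true&id=100']

    custom_settings = {
        'FEED_FORMAT': 'csv',
        'FEED_URI': 'odds.csv',
        'FEED_EXPORT_ENCODING': 'utf-8',
    }

    def parse(self, response):
        data = json.loads(response.body)
        for master_category in data['MasterCategories']:
            for category in master_category['Categories']:
                for event in category['MasterEvents']:
                    item = {'Event_name': event.get('MasterEventName')}
                    try:
                        # flatten the outcomes into key1/val1, key2/val2, ... columns
                        for number, market in enumerate(event['Markets'], 1):
                            item['key' + str(number)] = market.get('OutcomeName')
                            item['val' + str(number)] = market.get('Price')
                    except TypeError:
                        continue
                    yield item

Keep in mind that Scrapy's CSV exporter takes its column set from the first item it sees, so if events can have different numbers of outcomes you may want to set FEED_EXPORT_FIELDS explicitly.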