Extracting data from a string with Scrapy

Date: 2018-02-12 19:16:21

Tags: python scrapy

<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">
{"InstrumentID":85,"BuyPrice":24677.0,"SellPrice":24671.0,"HighPrice":24671.0,"LowPrice":24212.0,"ChangePercent":2.1,"ChangePercentText":"2.10%","UsersBuyPercentage":56.0,"UsersSellPercentage":44.0,"IsValid":true}
</string>

I need to extract BuyPrice and SellPrice with Scrapy, but I don't know how. Can anyone help?

1 answer:

Answer 0 (score: 1)

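It looks like you have JSON wrapped in XML, so extraction is a two-part task:

  1. Extract the JSON string from the XML element
  2. Load the required information with the json module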

An example of how to do this (using scrapy shell here):

    >>> import json
    >>> sel = scrapy.Selector(text='''<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">
    ... {"InstrumentID":85,"BuyPrice":24677.0,"SellPrice":24671.0,"HighPrice":24671.0,"LowPrice":24212.0,"ChangePercent":2.1,"ChangePercentText":"2.10%","UsersBuyPercentage":56.0,"UsersSellPercentage":44.0,"IsValid":true}
    ... </string>''')
    >>> sel.remove_namespaces()
    >>> json.loads(sel.xpath('//string/text()').get())
    {'InstrumentID': 85, 'BuyPrice': 24677.0, 'SellPrice': 24671.0, 'HighPrice': 24671.0, 'LowPrice': 24212.0, 'ChangePercent': 2.1, 'ChangePercentText': '2.10%', 'UsersBuyPercentage': 56.0, 'UsersSellPercentage': 44.0, 'IsValid': True}
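
The question asks for BuyPrice and SellPrice specifically, so one more step remains after parsing: indexing the resulting dict. The sketch below is a standalone approximation that runs without Scrapy, assuming the standard library's xml.etree.ElementTree as a stand-in for the selector. The helper name extract_prices is hypothetical and not part of Scrapy. Inside a spider, the JSON text would come from the sel.xpath('//string/text()').get() call shown above instead.

```python
import json
import xml.etree.ElementTree as ET

# Sample payload copied from the question.
XML_TEXT = '''<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">
{"InstrumentID":85,"BuyPrice":24677.0,"SellPrice":24671.0,"HighPrice":24671.0,"LowPrice":24212.0,"ChangePercent":2.1,"ChangePercentText":"2.10%","UsersBuyPercentage":56.0,"UsersSellPercentage":44.0,"IsValid":true}
</string>'''


def extract_prices(xml_text):
    """Hypothetical helper: return (BuyPrice, SellPrice) from the XML-wrapped JSON."""
    # Step 1: the root element is <string>; its text content is the JSON document.
    root = ET.fromstring(xml_text)
    # Step 2: json.loads tolerates the surrounding newlines left by the XML layout.
    data = json.loads(root.text)
    return data["BuyPrice"], data["SellPrice"]


buy_price, sell_price = extract_prices(XML_TEXT)
print(buy_price, sell_price)
```

Reading the root element's text directly sidesteps the namespace, which is the same problem sel.remove_namespaces() handles in the Scrapy version.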