Extracting data from a BeautifulSoup ResultSet

Time: 2016-12-29 12:50:47

Tags: beautifulsoup resultset

I filtered for this tag with data = soup.findAll('div',{'id':'responseDiv'}) and got this:

{"valid":"true","isinCode":null,"lastUpdateTime":"29-DEC-2016 12:19:23","ocLink":"/marketinfo/sym_map/symbolMapping.jsp?symbol=NIFTY&instrument=-&date=-&segmentLink=17&symbolCount=2","tradedDate":"29DEC2016","data":[{"change":"18.65","sellPrice1":"8,050.90","buyQuantity3":"75","sellPrice2":"8,050.95","buyQuantity4":"225","buyQuantity1":"750","ltp":"-","buyQuantity2":"150","sellPrice5":"8,051.15","sellPrice3":"8,051.00","buyQuantity5":"675","sellPrice4":"8,051.05","underlying":"NIFTY","bestSell":"-","annualisedVolatility":"16.61","optionType":"-","prevClose":"8,031.35","pChange":"0.23","lastPrice":"8,050.00","lowPrice":"8,025.00","strikePrice":"-","premiumTurnover":"-","numberOfContractsTraded":"54112","underlyingValue":"8,055.20","openInterest":"1,03,46,700","impliedVolatility":"-","vwap":"8,046.98","totalBuyQuantity":"5,20,350","openPrice":"8,028.00","closePrice":"0.00","bestBuy":"-","changeinOpenInterest":"-2,11,050","clientWisePositionLimits":"29320076","totalSellQuantity":"9,75,675","dailyVolatility":"0.87","sellQuantity5":"225","marketLot":"75","expiryDate":"29DEC2016","marketWidePositionLimits":"-","sellQuantity2":"150","sellQuantity1":"75","buyPrice1":"8,050.00","sellQuantity4":"150","buyPrice2":"8,049.80","sellQuantity3":"450","buyPrice4":"8,049.30","buyPrice3":"8,049.35","buyPrice5":"8,049.15","turnoverinRsLakhs":"3,26,578.64","pchangeinOpenInterest":"-2.00","settlementPrice":"8031.35","instrumentType":"FUTIDX","highPrice":"8,060.00"}],"companyName":"Nifty 50","eqLink":""}

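I want to extract the text in bold. So far I have just converted the whole thing to a string and pulled the value out by index. I'm sure there is a proper way to convert the result set.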

1 Answer:

Answer 0: (score: 0)

Your question is a bit unclear and needs editing, but that response looks like JSON. You can load it with the json module:
import json

...
data = soup.findAll('div',{'id':'responseDiv'})

This assumes that what you actually get back from findAll is a list of elements, the first of which contains the JSON text.

extracted = json.loads(data[0].getText())
print(extracted['data'][0]['vwap'])
  

8,046.98

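The 'vwap' value you are trying to extract can be accessed like that, for example. extracted is a dictionary whose key 'data' holds a list; the 0th element of that list is another dictionary, and the information you want is stored under its key 'vwap'.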
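To make the nesting described above concrete, here is a minimal sketch that uses only the standard library. The response_text string is a trimmed stand-in for what data[0].getText() would return, and to_number is a hypothetical helper (not part of the original answer) for the case where you need the value as a number rather than as a comma-grouped string.

```python
import json

# Trimmed stand-in for the text inside <div id="responseDiv">; in the real
# script this string would come from data[0].getText().
response_text = (
    '{"valid":"true","tradedDate":"29DEC2016",'
    '"data":[{"vwap":"8,046.98","lastPrice":"8,050.00",'
    '"openInterest":"1,03,46,700"}],'
    '"companyName":"Nifty 50"}'
)

extracted = json.loads(response_text)  # top level is a dict
quote = extracted["data"][0]           # "data" is a list; element 0 is a dict


def to_number(text):
    """Hypothetical helper: drop grouping commas and convert to float."""
    return float(text.replace(",", ""))


print(quote["vwap"])                      # the raw string from the JSON
print(to_number(quote["vwap"]))           # the same value as a float
print(to_number(quote["openInterest"]))   # Indian-style grouping also works
```

All values in this response are strings, so any arithmetic needs a conversion step of this kind; fields that hold "-" would raise ValueError here and need separate handling.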