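How to retrieve a table from a dynamic website with Python Selenium

Time: 2019-03-16 05:23:47

Tags: python json selenium web-scraping selenium-chromedriver

I want to retrieve all the information from a table on a dynamic website, and I have the following code: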

/home

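However, I keep getting an error and have not been able to make it work so far. How can I retrieve the table information? Many thanks!

Error: RowsOfTable = table.get_attribute("tr") AttributeError: 'list' object has no attribute 'get_attribute'

3 Answers:

Answer 0: (score: 1)

Here is the code to get the product details: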

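from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Assumes `driver` is an open WebDriver and `wait` is a WebDriverWait bound to it.
# Wait for the accordion panels, then expand "Product Details".
tableloadwait = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".panel-body")))
driver.find_element(By.XPATH, "//span[contains(.,'Product Details')]").click()
rows = driver.find_elements(By.XPATH, "//span[contains(.,'Product Details')]/ancestor::div[@class='accordion-top-border']//tr[(@ng-repeat='attr in attributes' or @ng-repeat='field in fields') and @class='visible-xs']")

# Print the text of each matching row.
for row in rows:
    print(row.get_attribute('innerText'))
driver.quit()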

You will need to trim or split the values to fit your requirements.

If you want to get the data based on the row text, use the following.

upcData = driver.find_element(By.XPATH, "//strong[.='UPC']/parent::td").get_attribute('innerText').replace('UPC','').replace('\n','').replace('    ','')
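The trimming mentioned above can also be done in one small helper instead of chained `replace` calls. This is a minimal sketch, not the answerer's code: `split_row_text` is a hypothetical name, and it assumes the label (for example `UPC`) and its value sit on separate lines of the row's `innerText`.

```python
def split_row_text(inner_text):
    """Split one row's innerText into (label, value), ignoring blank lines.

    Assumption: the first non-blank line is the label and the rest is the value.
    """
    parts = [part.strip() for part in inner_text.splitlines() if part.strip()]
    if not parts:
        return ("", "")
    label, *rest = parts
    return (label, " ".join(rest))
```

For each element in `rows`, it could be called as `split_row_text(row.get_attribute('innerText'))`.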

Answer 1: (score: 1)

First expand the accordion using the appropriate + button, then select the table. Add waits for the items to be present. If you want the other table, change the expandSigns index to 2.

expandSigns
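Once a single table element has been selected, its rows have to be read one element at a time. The error in the question comes from calling `get_attribute` on the list that `find_elements` returns. The following is a minimal sketch of that step only, not the answerer's original code: `table_to_rows` is a hypothetical helper, and the locator strategies are written as plain strings (the values of Selenium's `By.TAG_NAME` and `By.CSS_SELECTOR`) so that the helper itself needs no imports.

```python
# Locator strategy strings; these equal selenium's By.TAG_NAME and
# By.CSS_SELECTOR and are spelled out so the helper has no dependencies.
TAG_NAME = "tag name"
CSS_SELECTOR = "css selector"


def table_to_rows(table):
    """Return the stripped text of every cell in `table`, row by row.

    `table` must be a single element (not a list) exposing find_elements().
    """
    rows = []
    for tr in table.find_elements(TAG_NAME, "tr"):
        cells = tr.find_elements(CSS_SELECTOR, "td, th")
        rows.append([cell.text.strip() for cell in cells])
    return rows
```

Because `find_elements` returns a list, either index it (`tables[0]`) or loop over it before passing an element to the helper. Note that `.text` returns only visible text; for hidden rows, `get_attribute('innerText')` may be the better choice.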

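Answer 2: (score: 1)

If you need scraping rather than testing, you can use requests to get the data. The code below is an example of how to get the data from the page.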

import requests
import re

# Fetch the HTML page to extract the token and the list keys
response = requests.get("http://biggestbook.com/ui/catalog.html#/itemDetail?itemId=HERY4832YER01&uom=CT")

# Get token using regular expression
productRecommToken = re.search("'productRecommToken','(.+)'", response.text)[1]

# Get list of keys using regular expression
listKey = re.search("'listKey',\\['(.*?)'\\]", response.text)[1].split("','")

# Create header with token
headers = {
    'Accept': 'application/json, text/plain, */*',
    'Referer': 'http://biggestbook.com/ui/catalog.html',
    'Origin': 'http://biggestbook.com',
    'DNT': '1',
    'token': productRecommToken,
    'BiggestBook-Handle-Errors-Generically': 'true',
}

# Create parameters with list keys and search values
params = (
    ('listKey', listKey),
    ('uom', 'CT'),
    ('vc', 'n'),
    ('win', 'HERY4832YER01'),
)

# Return json with all details about product
response = requests.get('https://api.essendant.com/digital/digitalservices/search/v1/items',
                       headers=headers,
                       params=params)
data = response.json()

# Get items from json, probably could be more than one
items = data["items"]

# Iterate and get details you need. Check "data" to see all possible details you can get
for i in items:
    print(i["manufacturer"])
    print(i["description"])
    print(i["actualPrice"])

    # Get attributes
    attributes = i["attributes"]

    # Example of how to get one specific attribute.
    thickness = list(filter(lambda d: d['name'] == 'Thickness', attributes))[0]["value"]

    # Print all attributes as name = value
    for a in attributes:
        print(f"{a['name']} = {a['value']}")
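The `list(filter(...))[0]` lookup in the code above raises `IndexError` when no attribute has the requested name. A more forgiving variant might look like the sketch below; `get_attribute_value` is a hypothetical helper, and it assumes each attribute is a dict with `name` and `value` keys, as in the code above.

```python
def get_attribute_value(attributes, name, default=None):
    """Return the value of the first attribute called `name`, else `default`."""
    return next((a["value"] for a in attributes if a["name"] == name), default)
```

Inside the loop it could replace the filter line, for example `thickness = get_attribute_value(attributes, 'Thickness')`, which yields `None` when the product has no such attribute.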