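I am at the very early stages of building a web scraper, and I am still new to Python. I am trying to extract the star ratings from a web page. The code below is meant to find all the img alt text on the page and print it to the console.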
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = 'https://www.nhtsa.gov/vehicle/2017/FORD/ESCAPE/SUV/AWD#safety-ratings-frontal'  # URL to retrieve data from
html = '<div class="col-sm-6"><img src="/sites/nhtsa.dot.gov/themes/nhtsa_gov/images/star-rating/5.png" alt="5 star" class="vehicle-base-details--rating"></div>'  # temporary, for testing
page = urlopen(url)
soup = BeautifulSoup(page, "html.parser")
for div in soup.find_all('div'):  # list all image alt text
    for img in div.find_all('img', alt=True):
        print(img['alt'])
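When I replace `page` with `html` in the `BeautifulSoup(...)` call, BeautifulSoup extracts what I need and prints "5 star". The problem appears when I try to get the HTML directly from the web page. I also tried searching by the element's class, and when fetching directly from the site I end up with an empty list.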
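One way to narrow this down is to separate parsing from fetching, so the same extraction can be run on the test string and on whatever the server actually returns. The sketch below is only an illustration: it uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without extra packages, and the names `AltTextCollector` and `extract_alt_texts` are made up for this example.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects the alt attribute of every <img> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt is not None:
                self.alt_texts.append(alt)


def extract_alt_texts(html_text):
    """Return the alt text of all <img> tags found in an HTML string."""
    collector = AltTextCollector()
    collector.feed(html_text)
    collector.close()
    return collector.alt_texts


# The test markup from the question
sample = '<div class="col-sm-6"><img src="/sites/nhtsa.dot.gov/themes/nhtsa_gov/images/star-rating/5.png" alt="5 star" class="vehicle-base-details--rating"></div>'
print(extract_alt_texts(sample))
```

If the same function returns an empty list for the text read from `urlopen(url)`, the `img` tags are probably not in the HTML the server sends. One possible reason, which would need checking against the raw response, is that the page fills them in with JavaScript after it loads in a browser.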
Answer 0 (score: 0)
const canvas = new fabric.Canvas('c')
const box1 = new fabric.Rect({
  left: 50,
  top: 50,
  width: 100,
  height: 100,
  fill: 'green'
})
const box2 = new fabric.Rect({
  left: 250,
  top: 250,
  width: 100,
  height: 100,
  fill: 'red'
})
const box1point = box1.getPointByOrigin('center', 'bottom')
const box2point = box2.getPointByOrigin('center', 'top')
const connector = new fabric.Line(
  [box1point.x, box1point.y, box2point.x, box2point.y],
  {
    stroke: "black",
    strokeWidth: 3,
    lockScalingX: true,
    lockScalingY: true,
    lockRotation: true,
    hasControls: true,
    hasBorders: true,
    lockMovementX: true,
    lockMovementY: true
  }
)
box1.on('moving', function() {
  const connectPoint = this.getPointByOrigin('center', 'bottom')
  connector.set({
    x1: connectPoint.x,
    y1: connectPoint.y
  })
})
box2.on('moving', function() {
  const connectPoint = this.getPointByOrigin('center', 'top')
  connector.set({
    x2: connectPoint.x,
    y2: connectPoint.y
  })
})
canvas.add(box1, box2, connector)