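I am scraping images from https://www.open2study.com/courses. I have all of the image sources, but I don't know how to display the images themselves (rather than their links) in a two-column table (one column for the title, one for the image) in an HTML file. Can anyone help?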
# Python 2 code (urllib.urlopen, str-based encode)
import urllib
from bs4 import BeautifulSoup

titles = []
images = []
r = urllib.urlopen('https://www.open2study.com/courses').read()
soup = BeautifulSoup(r, 'html.parser')
# Collect the course titles
for i in soup.find_all('div', {'class': "courses_adblock_rollover"}):
    titles.append(i.h2.text)
# Collect the image URLs
for i in soup.find_all(
        'img', {
            'class': "image-style-course-logo-subjects-block"}):
    images.append(i.get('src'))
# Write each title and its image URL on separate lines, with a blank line between pairs
with open('test.txt', "w") as f:
    for i in zip(titles, images):
        f.write(i[0].encode('ascii', 'ignore') +
                '\n'+i[1].encode('ascii', 'ignore') +
                '\n\n')
header = '<!DOCTYPE html><html><head><title>My Title</title></head><body>'
body = '<table><thead><tr><th></th><th></th></tr></thead>'
footer = '</table></body></html>'
img_tag = '<img src="{}">'
with open('test.txt', 'r') as input, open('test.html', 'w') as output:
    output.write(header)
    output.write(body)
    for line in input:
        col1 = line.rstrip().split()
        col2 = line.rstrip().split()
        output.write('<tr><td>{}</td><td>{}</td></tr>\n'.format(col1, col2))
    output.write(footer)
Answer 0 (score: 1)
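This is a fairly simple problem. The issue is in the loop that reads test.txt: each title is followed by its image URL on the next line, so the loop should read the two lines as a pair and put the URL inside an img tag: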
for line in input:
    # ignore blank lines
    if line == '\n':
        continue
    # the whole line is the title, so there is no need to split it
    col1 = line.rstrip()
    # the next line holds the image URL
    col2 = next(input).rstrip()
    output.write('<tr><td>{}</td><td><img src="{}" style="width: 160px; height: 100px"></td></tr>\n'.format(col1, col2))
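If the intermediate test.txt file is not needed for anything else, the same table can probably be built straight from the two lists. The following is only a sketch under some assumptions: it targets Python 3 rather than the Python 2 used in the question, the function name `build_table_html` and its `width`/`height` parameters are made up for illustration, and the column headings "Title" and "Image" are placeholders. It uses `html.escape` so that a title containing `&` or `<` does not break the markup.

```python
from html import escape


def build_table_html(titles, images, width=160, height=100):
    """Return a full HTML page with one table row per (title, image URL) pair.

    Hypothetical helper: the name and parameters are not from the original post.
    """
    rows = []
    for title, src in zip(titles, images):
        # Escape both values so quotes or angle brackets cannot break the markup.
        rows.append(
            '<tr><td>{}</td><td><img src="{}" '
            'style="width: {}px; height: {}px"></td></tr>'.format(
                escape(title), escape(src), width, height))
    return (
        '<!DOCTYPE html><html><head><title>My Title</title></head><body>'
        '<table><thead><tr><th>Title</th><th>Image</th></tr></thead><tbody>\n'
        + '\n'.join(rows)
        + '\n</tbody></table></body></html>'
    )
```

With the `titles` and `images` lists from the question, the page could then be written with `open('test.html', 'w', encoding='utf-8').write(build_table_html(titles, images))`. Pairing the lists with `zip` assumes, as the question's code does, that the titles and images come back from the page in the same order.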