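I'm using a Python script to web-scrape the "Show Notes" and an mp3. When I encounter a page that has no show notes, it means the show was a "Best Of", so I want to skip downloading both the notes and the mp3. I'm not sure where the best place to insert the test would be. The excerpt is as follows: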
function allPosts()
{
    // Join each post with the name of the user who wrote it.
    $all = DB::table('posts')
        ->select('posts.post', 'posts.id', 'posts.user_id', 'users.name')
        ->join('users', 'users.id', '=', 'posts.user_id')
        ->get();

    return view('/posts', [
        'all' => $all,
    ]);
}
Here is the loop I have in mind:
for show_html in showpage_htmls:
    try:
        p_html = s.get(show_html)
        p_soup = BeautifulSoup(p_html.content, 'html.parser')
        # set title for SHOW NOTES
        title = ''
        title = p_soup.title.contents[0]
        # get SHOW NOTES chunk and remove unwanted characters (original mp3notes not changed)
        mp3notes = ''
        mp3notes = p_soup.find('div', {'class': 'module-text'}).find('div')
        mp3notes = str(title) + str('\n') + str(mp3notes).replace('<div>','').replace('<h2>','').replace('</h2>','\n').replace('<p>','').replace('<br/>\n','\n').replace('<br/>','\n').replace('</p>','').replace('</div>','').replace('\u2032','')
        # FIXME need to skip d/l if no notes
        # set basename, mp3named and mp3showtxt
        mp3basename = '{0}{1}{2}'.format(show_html.split('/')[3],show_html.split('/')[4],show_html.split('/')[5])
        if (os.name == 'nt'):
            mp3showtxt = mp3dir + '\\' + mp3basename + '.txt'
            mp3named = mp3dir + '\\' + mp3basename + '.mp3'
        else:
            mp3showtxt = mp3dir + '/' + mp3basename + '.txt'
            mp3named = mp3dir + '/' + mp3basename + '.mp3'
        # save show notes to local
        with open(mp3showtxt, 'w') as f:
            try:
                f.write(mp3notes)
                print("Show notes " + mp3basename + " saved.")
            except UnicodeEncodeError:
                print("A charmap encoding ERROR occurred.")
                print("Show notes for " + mp3basename + ".mp3 FAILED, but continuing")
            finally:
                f.close()
        # FIXME need eyed3 to set mp3 tags since B&T are lazy
        # get Full Show mp3 link
        mp3url = p_soup.find('a', href = True, string = 'Full Show').get('href')
        # get and save mp3
        r = requests.get(mp3url)
        with open(mp3named, 'wb') as f:
            f.write(r.content)
        print("Downloaded " + mp3basename + ".mp3.")
    except AttributeError:
        print(show_html + " did not exist as named.")
I think a test on mp3notes would work; I'm just not sure where to put it, or whether there is a better (more Pythonic) way.
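Ideally, if mp3notes comes back with less than expected, neither the notes nor the mp3 for that show_html would be saved, and the script would move on to the next show_html page.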
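To make the desired behavior concrete, here is a minimal sketch of that skip logic, written against stand-in data rather than real scraped pages. The 50-character threshold, the helper names `has_usable_notes` and `pages_to_download`, and the example.com URLs are all assumptions made up for illustration; they are not taken from the script or the site.

```python
MIN_NOTES_LENGTH = 50  # assumed threshold; adjust to what a real notes block looks like


def has_usable_notes(notes_text, min_length=MIN_NOTES_LENGTH):
    """Return True when the extracted text looks like real show notes."""
    return notes_text is not None and len(notes_text.strip()) >= min_length


def pages_to_download(pages):
    """Yield (url, notes_text) only for pages whose notes pass the check.

    `pages` is an iterable of (url, notes_text) pairs (hypothetical shape);
    notes_text may be None when the notes <div> was not found.
    """
    for url, notes_text in pages:
        if not has_usable_notes(notes_text):
            # Guard clause: bail out before any file is opened or mp3 requested.
            print(url + " has no show notes, skipping.")
            continue
        yield url, notes_text


# Stand-in data instead of real scraped pages.
sample_pages = [
    ("http://example.com/2017/01/02", "Hour 1: guest interview. " * 5),
    ("http://example.com/2017/01/03", None),       # notes <div> missing
    ("http://example.com/2017/01/04", "Best Of"),  # too short to be real notes
]

kept = [url for url, _ in pages_to_download(sample_pages)]
print(kept)
```

In the loop above, the equivalent check would presumably sit at the first FIXME comment, before `mp3showtxt` is opened, so that a skipped page writes nothing to disk. One thing to watch: because the title is prepended and `str(None)` is the string 'None', the combined `mp3notes` string is never empty, so the check is probably safer on the raw result of the inner `find('div')` or on a length threshold.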
Since I'm new to Python, feel free to offer suggestions that would make this more Pythonic; I'm here to learn! Thanks.