I am trying to scrape the href of one element, the YouTube watermark link, but I can't seem to grab it.
If I try
def youtube_link(url):
    youtube_page = requests.get(url, headers=headers)
    soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
    video_row = soupdata.find_all('a', {'class': 'ytp-watermark'})
    entries = video_row.get('href')
    return entries
I get
'ResultSet' object has no attribute 'get'
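This error is what happens when a `find_all` result is treated like a single element: `find_all` returns a list-like `ResultSet`, so `.get('href')` has to be called on each item. A minimal sketch, assuming the markup is actually present in the parsed text; the inline `HTML` string is a stand-in built from the two anchors quoted later in this question, not the real fetched page.

```python
from bs4 import BeautifulSoup

# Stand-in markup (assumption): the two anchors as seen in the browser inspector.
HTML = """
<div>
  <a class="ytp-watermark yt-uix-sessionlink" target="_blank"
     href="https://www.youtube.com/watch?v=Xjww1pgKgnU"></a>
  <a class="ytp-title-link yt-uix-sessionlink" target="_blank"
     href="https://www.youtube.com/watch?v=Xjww1pgKgnU"><span>title</span></a>
</div>
"""

soup = BeautifulSoup(HTML, 'html.parser')

# find_all returns a ResultSet (a list of Tag objects), not a Tag.
links = soup.find_all('a', {'target': '_blank'})

# Call .get on each Tag rather than on the ResultSet itself.
hrefs = [link.get('href') for link in links]
print(hrefs)
```

If only the first match is wanted, `links[0].get('href')` works too, provided the list is not empty.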
If I try
def youtube_link(url):
    youtube_page = requests.get(url, headers=headers)
    soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
    video_row = soupdata.find('a', {'class': 'ytp-watermark'})
    entries = video_row.get('href')
    return entries
I get
'NoneType' object has no attribute 'get'
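`find` returns `None` when nothing matches, which is where the `'NoneType'` message comes from. A small sketch of guarding against that case; `first_href` is a hypothetical helper name, and the two HTML strings in the usage lines are made-up stand-ins rather than real pages.

```python
from bs4 import BeautifulSoup


def first_href(html_text, css_class):
    """Return the href of the first <a> carrying css_class, or None if absent."""
    soup = BeautifulSoup(html_text, 'html.parser')
    tag = soup.find('a', class_=css_class)
    if tag is None:
        # No such element in this document: report that instead of crashing.
        return None
    return tag.get('href')


# Stand-in documents (assumptions), one without and one with the anchor.
missing = first_href('<p>no player markup here</p>', 'ytp-watermark')
present = first_href(
    '<a class="ytp-watermark yt-uix-sessionlink" href="https://www.youtube.com/watch?v=Xjww1pgKgnU"></a>',
    'ytp-watermark',
)
print(missing, present)
```

A `None` result here means the element is not in the text that was parsed, which is worth checking before changing the selector.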
If I try
def youtube_link(url):
    youtube_page = requests.get(url, headers=headers)
    soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
    video_row = soupdata.find('a', {'target': '_blank'})
    entries = video_row.get('href')[24]
    return entries
I get a single character
's'
If I try
def youtube_link(url):
    youtube_page = requests.get(url, headers=headers)
    soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
    video_row = soupdata.find('a', {'target': '_blank'})[24]
    entries = video_row.get('href')
    return entries
I get
24
If I try
def youtube_link(url):
    youtube_page = requests.get(url, headers=headers)
    soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
    video_row = soupdata.find('a', {'target': '_blank'})[24:]
    entries = video_row.get('href')
    return entries
I get
unhashable type: 'slice'
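Square brackets on a BeautifulSoup `Tag` look up an attribute by name, so `[24]` and `[24:]` are treated as attribute keys rather than positions; indexing only does what was intended once it is applied to the href string itself. The sketch below uses only the standard library and the watch URL quoted in this question; `video_id` is a hypothetical helper name, and parsing the query string is offered as an alternative to counting characters, not as what the original code did.

```python
from urllib.parse import urlparse, parse_qs


def video_id(watch_url):
    """Return the value of the v= query parameter, or None if there is none."""
    query = parse_qs(urlparse(watch_url).query)
    values = query.get('v')
    return values[0] if values else None


href = "https://www.youtube.com/watch?v=Xjww1pgKgnU"

one_char = href[24]   # a single character of the string
tail = href[24:]      # everything from position 24 onward
vid = video_id(href)  # does not depend on the length of the prefix
print(one_char, tail, vid)
```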
If I try
def panties():
    from lxml import html
    pan_url = 'http://www.panvideos.com'
    shtml = requests.get(pan_url, headers=headers)
    soup = BeautifulSoup(shtml.text, 'html5lib')
    video_row = soup.find_all('div', {'class': 'video'})
    def youtube_link(url):
        youtube_page = requests.get(url, headers=headers)
        soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
        video_row = soupdata.find('a', {'target': '_blank'})
        entries = [{'text': div.get('href'),
                    } for div in video_row][24]
        return entries
I get
'NavigableString' object has no attribute 'get'
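Iterating over a single `Tag` (the result of `find`) walks its children, and those children include plain text nodes, which have no `.get` method. A short sketch of that behaviour on a stand-in anchor (the markup is an assumption modelled on the title link quoted in this question):

```python
from bs4 import BeautifulSoup

# Stand-in markup (assumption): an anchor containing a line break and a <span>.
HTML = ('<a class="ytp-title-link" href="https://www.youtube.com/watch?v=Xjww1pgKgnU">\n'
        '<span>Title</span></a>')

soup = BeautifulSoup(HTML, 'html.parser')
link = soup.find('a')

# Looping over a Tag yields its children: here a text node, then the <span>.
child_types = [type(child).__name__ for child in link]

# The href belongs to the Tag itself, so no loop is needed.
href = link.get('href')
print(child_types, href)
```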
If I try
def youtube_link(url):
    youtube_page = requests.get(url, headers=headers)
    soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
    video_row = soupdata.find_all('a', {'class': 'ytp-title-link'})
    entries = [{'text': div.get('href'),
                } for div in video_row]
    return entries
I get
[]
If I inspect the page in Chrome and hover over the watermark, I see
<a class="ytp-watermark yt-uix-sessionlink" target="_blank" aria-label="Watch on www.youtube.com" data-sessionlink="feature=player-watermark" href="https://www.youtube.com/watch?v=Xjww1pgKgnU" data-layer="7">
  <svg xmlns:xlink="http://www.w3.org/1999/xlink" height="100%" version="1.1" viewBox="0 0 77 34" width="100%">
  ........
  </svg>
</a>
But if I use the inspector's search function and type _blank, I get
<a class="ytp-title-link yt-uix-sessionlink" target="_blank" data-sessionlink="feature=player-title" href="https://www.youtube.com/watch?v=Xjww1pgKgnU">
  <span class="ytp-title-playlist-icon" style="display: none;">
  .....
  </span>
  <span>Packer Luther King Feat Mgp the Saw -BIEN MALA (Video Oficial)</span></a>
None of these attempts returns a result. Is my syntax wrong? Any help would be greatly appreciated.
My full function (shown at the end) takes a URL, uses it to fetch the detail page, grabs the information from that page and returns it. For some reason the link comes back as None. Neither find_all nor find returns a single element, but if I look for h1 it works.
Edit: I tried different parsers: html.parser, lxml and html5lib.
Edit:
I think the data cannot be scraped because it comes from the media player. When I run my full function
def panties():
    from lxml import html
    pan_url = 'http://www.panvideos.com'
    shtml = requests.get(pan_url, headers=headers)
    soup = BeautifulSoup(shtml.text, 'html5lib')
    video_row = soup.find_all('div', {'class': 'video'})
    def youtube_link(url):
        youtube_page = requests.get(url, headers=headers)
        soupdata = BeautifulSoup(youtube_page.text, 'html5lib')
        video_row = soupdata.find('a', {'class': 'ytp-title-link yt-uix-sessionlink'})
        entries = [{'text': div.get('href'),
                    } for div in video_row]
        return entries
    entries = [{'text': div.h4.text,
                'href': div.a.get('href'),
                'tube': youtube_link(div.a.get('href')),
                } for div in video_row][:1]
    return entries
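The data I am looking for does not show up. So it is not my code, and I don't think it is a bug; it just isn't obtainable through normal means. I can't scrape the link tag, the meta tags and a few other tags either.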
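A likely explanation, consistent with this edit, is that the `ytp-*` elements are built by the player's JavaScript inside an embedded frame, so they never appear in the text that `requests.get` returns. One possible workaround is to read the video ID from the embed URL instead. This is a sketch under a loud assumption: that the detail page embeds the player through an `<iframe>` whose `src` looks like `https://www.youtube.com/embed/<id>`. The question does not show that markup, so `SAMPLE` below is hypothetical, as are the names `IframeCollector` and `embed_to_watch_url`.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class IframeCollector(HTMLParser):
    """Collect the src attribute of every <iframe> in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == 'iframe':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)


def embed_to_watch_url(src):
    """Turn .../embed/<id> into a watch URL; return None for anything else."""
    parts = urlparse(src)
    segments = [s for s in parts.path.split('/') if s]
    if 'youtube' not in parts.netloc or len(segments) != 2 or segments[0] != 'embed':
        return None
    return 'https://www.youtube.com/watch?v=' + segments[1]


# Hypothetical detail-page markup, standing in for youtube_page.text.
SAMPLE = ('<div class="video">'
          '<iframe src="https://www.youtube.com/embed/Xjww1pgKgnU?rel=0"></iframe>'
          '</div>')

# The player chrome is not part of the static markup in this stand-in.
has_watermark = 'ytp-watermark' in SAMPLE

collector = IframeCollector()
collector.feed(SAMPLE)
watch_urls = [embed_to_watch_url(src) for src in collector.sources]
print(has_watermark, watch_urls)
```

The same `'ytp-watermark' in youtube_page.text` check on the real response is a quick way to confirm whether the element is in the fetched HTML at all before trying further selectors.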
Answer 0 (score: 0)
When I use the full value of the class attribute, I get the href:
video_row = soupdata.find('a', {'class': 'ytp-watermark yt-uix-sessionlink'})
If you want to use findAll, you have to iterate over the results. For example, create a separate list of your own, entries_final, and do this:
video_rows = soupdata.findAll('a', {'class': 'ytp-watermark yt-uix-sessionlink'})
entries_final = []
for row in video_rows:
    entries_final.append(row.get('href'))
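For comparison, here is a sketch of three ways to match an element that carries several classes, followed by the loop above written as a list comprehension. It runs against a stand-in string copied from the anchor quoted in the question, so it only shows how the matching behaves once the markup is in the parsed text; it does not show that the live page contains it.

```python
from bs4 import BeautifulSoup

# Stand-in markup (assumption): the watermark anchor as seen in the inspector.
HTML = ('<a class="ytp-watermark yt-uix-sessionlink" target="_blank" '
        'href="https://www.youtube.com/watch?v=Xjww1pgKgnU"></a>')

soup = BeautifulSoup(HTML, 'html.parser')

# 1. One class name matches any element that has that class among others.
by_one_class = soup.find_all('a', class_='ytp-watermark')

# 2. The full attribute value matches only this exact string, in this order.
by_full_value = soup.find_all('a', {'class': 'ytp-watermark yt-uix-sessionlink'})

# 3. A CSS selector requires both classes, in any order.
by_selector = soup.select('a.ytp-watermark.yt-uix-sessionlink')

# Same result as the entries_final loop, skipping anchors without an href.
entries_final = [row['href'] for row in by_selector if row.has_attr('href')]
print(entries_final)
```

The selector form is the least sensitive to the order in which the classes happen to be written.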