I am trying to extract the URLs from a YouTube playlist using the WebClient class. Here is what I have tried:
    Dim wc As New WebClient
    Dim html As String = wc.DownloadString("https://www.youtube.com/playlist?list=PL4_Dx88dpu7epfH6ybwqJpf9uL2tAl368")
    Dim links As MatchCollection = Regex.Matches(html, "<a.*?href=""(.*?)"".*?>(.*?)</a>")
    For Each match As Match In links
        Dim matchUrl As String = match.Groups(1).Value
        If matchUrl.StartsWith("/watch?v=") Then
            RichTextBox1.AppendText(matchUrl)
        End If
    Next
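Unfortunately, the RichTextBox stays empty. What am I doing wrong? Thanks.
Answer 0 (score: 0)
Your regular expression does not match anything in the downloaded source.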
If you look at the source downloaded from YouTube, you will see that the links are not written as <a> tags but are embedded as JSON.
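One way to confirm this is to run the pattern from the question against a piece of JSON-shaped text. The sketch below is only an illustration: the sample string is a hypothetical reduction of the page source, and the module and function names are made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorCheck
    ' Counts anchor-tag hrefs that start with /watch?v=, using the pattern from the question.
    Function CountAnchorWatchLinks(html As String) As Integer
        Dim count As Integer = 0
        For Each m As Match In Regex.Matches(html, "<a.*?href=""(.*?)"".*?>(.*?)</a>")
            If m.Groups(1).Value.StartsWith("/watch?v=") Then count += 1
        Next
        Return count
    End Function

    Sub Main()
        ' Hypothetical sample: a watch URL stored as a JSON string value, with no <a> tag around it.
        Dim sample As String = "{""webCommandMetadata"": {""url"": ""/watch?v=j6QkVTx2d88\u0026list=PL4_Dx88dpu7epfH6ybwqJpf9uL2tAl368\u0026index=1""}}"
        ' The anchor pattern finds nothing because the sample contains no <a> element.
        Console.WriteLine(CountAnchorWatchLinks(sample))
        ' A plain text search still finds the watch path inside the JSON.
        Console.WriteLine(sample.Contains("/watch?v="))
    End Sub
End Module
```

If the first number is zero while the second line reports True, the URLs are present in the page but not in the form the original pattern expects.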
Here is a fragment of the page behind the URL you linked:
"title": {
"runs": [
{
"text": "Peter Tosh - Legalize It"
}
],
"accessibility": {
"accessibilityData": {
"label": "Peter Tosh - Legalize It by Bondade é Nosso Hábito 8 years ago 4 minutes, 46 seconds"
}
}
},
"index": {
"simpleText": "1"
},
"shortBylineText": {
"runs": [
{
"text": "Bondade é Nosso Hábito",
"navigationEndpoint": {
"clickTrackingParams": "CD0QxjQYACITCMn2heSTz-wCFUmw1Qod1hMDdw==",
"commandMetadata": {
"webCommandMetadata": {
"url": "/user/Bruno12170",
"webPageType": "WEB_PAGE_TYPE_CHANNEL",
"rootVe": 3611
}
},
"browseEndpoint": {
"browseId": "UCY1IJY2IYNVfD7R-JbgQGbQ",
"canonicalBaseUrl": "/user/Bruno12170"
}
}
}
]
},
"lengthText": {
"accessibility": {
"accessibilityData": {
"label": "4 minutes, 46 seconds"
}
},
"simpleText": "4:46"
},
"navigationEndpoint": {
"clickTrackingParams": "CD0QxjQYACITCMn2heSTz-wCFUmw1Qod1hMDdzIKcGxwcF92aWRlb1okVkxQTDRfRHg4OGRwdTdlcGZINnlid3FKcGY5dUwydEFsMzY4mgEDEPos",
"commandMetadata": {
"webCommandMetadata": {
"url": "/watch?v=j6QkVTx2d88&list=PL4_Dx88dpu7epfH6ybwqJpf9uL2tAl368&index=1",
"webPageType": "WEB_PAGE_TYPE_WATCH",
"rootVe": 3832
}
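Knowing what the video URLs look like, you can match them directly with a regular expression:
    Dim links As MatchCollection = Regex.Matches(html, "\/watch\?v=[\w*\\u0026=]*")
You may also want to add a line that converts each \u0026 in the matched text back into an ampersand (&), so that the URLs are usable.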
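Putting the two steps together might look like the sketch below. It is an illustration rather than a tested solution for the live page: the function name ExtractWatchUrls and the sample string are hypothetical, the character class is a slightly different one from the pattern above (it also accepts "-", which can appear in video IDs), and it assumes the raw source writes the ampersand as the literal text \u0026.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PlaylistLinks
    ' Returns the distinct /watch?v=... paths found in the raw page source.
    Function ExtractWatchUrls(html As String) As List(Of String)
        Dim urls As New List(Of String)
        ' The character class allows word characters, "-", "=", and the backslash
        ' that begins each \u0026 escape, so a match stops at the closing quote.
        For Each m As Match In Regex.Matches(html, "/watch\?v=[\w\-\\=]*")
            ' VB string literals have no backslash escapes, so "\u0026" here is the
            ' literal six-character sequence, which is replaced with "&".
            Dim url As String = m.Value.Replace("\u0026", "&")
            If Not urls.Contains(url) Then urls.Add(url)
        Next
        Return urls
    End Function

    Sub Main()
        ' Hypothetical sample shaped like the "url" entry in the page source.
        Dim sample As String = """url"": ""/watch?v=j6QkVTx2d88\u0026list=PL4_Dx88dpu7epfH6ybwqJpf9uL2tAl368\u0026index=1"""
        For Each url As String In ExtractWatchUrls(sample)
            Console.WriteLine(url)
        Next
    End Sub
End Module
```

In the form from the question, each result could be written with RichTextBox1.AppendText(url & Environment.NewLine) so that the URLs end up on separate lines.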