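I'm trying to scrape usernames from a website, following this tutorial:
https://www.youtube.com/watch?v=FpAvBOhDrYk (part 1)
https://www.youtube.com/watch?src_vid=FpAvBOhDrYk (part 2)
I followed every step but could not get it to work. Here is the VB.NET code I'm using: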
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim Request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://statigr.am/tag/anime")
        Dim response As System.Net.HttpWebResponse = Request.GetResponse
        Dim rs As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
        Dim rssourcecode As String = rs.ReadToEnd
        '<a href="/hannahotaku">hannahotaku</a>
        Dim r As New System.Text.RegularExpressions.Regex("<a href=""/.*"">hannahotaku</a>")
        Dim matches As MatchCollection = r.Matches(rssourcecode)
        For Each itemcode As Match In matches
            ListBox1.Items.Add(itemcode.Value.Split("""").GetValue(1))
        Next
    End Sub
End Class
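As you can see, the site I'm using is Statigram. The markup I'm trying to scrape is:
<a href="/hannahotaku">hannahotaku</a>
Please let me know what I'm doing wrong. I want to extract the username part of (<a href="/**whatever username here**"></a>).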
Answer 0 (score: 0)
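If you want to capture the whole link:
(<a href="\/.+?">hannahotaku<\/a>)
If you want to capture the username:
<a href="\/(.+?)">hannahotaku<\/a>
From what I can see, the VB.NET version would probably be:
<a href=""/(.+?)"">hannahotaku</a>
The lazy quantifier (+?) makes sure the match covers only what is needed and nothing extra, while the plus sign requires at least one character, so a one-letter username still matches and an empty one does not.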
P.S. I'm not very familiar with VB.NET, so let me know if anything needs adapting.
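To illustrate the point about lazy matching, here is a minimal console sketch that runs a greedy and a lazy pattern over the same string. The sample markup, the module name and the `Captures` helper are assumptions made up for this illustration; the real Statigram page may be laid out differently.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LazyVersusGreedy
    ' Hypothetical sample markup with two profile links on one line.
    Public Const Sample As String = "<a href=""/hannahotaku"">hannahotaku</a> <a href=""/otheruser"">otheruser</a>"

    ' Hypothetical helper: returns capture group 1 of every match of the pattern.
    Public Function Captures(ByVal pattern As String, ByVal input As String) As String()
        Dim found As MatchCollection = Regex.Matches(input, pattern)
        Dim result(found.Count - 1) As String
        For i As Integer = 0 To found.Count - 1
            result(i) = found(i).Groups(1).Value
        Next
        Return result
    End Function

    Sub Main()
        ' Greedy: .* extends to the last "> it can reach, so both links
        ' end up inside a single match.
        For Each name As String In Captures("<a href=""/(.*)"">", Sample)
            Console.WriteLine("greedy: " & name)
        Next
        ' Lazy: .+? stops at the first "> it finds, so each link is
        ' matched separately.
        For Each name As String In Captures("<a href=""/(.+?)"">", Sample)
            Console.WriteLine("lazy:   " & name)
        Next
    End Sub
End Module
```

On this sample the greedy pattern produces one long match that swallows both links, while the lazy pattern produces one match per link. Dropping the literal hannahotaku from the end of the pattern, as done here, is what lets it match any username rather than a single one.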
Answer 1 (score: 0)
Use this regular expression instead:
"<div><div>([^<]+)</div>"
In the For loop, use itemcode.Groups(1).Value instead of itemcode.Value.Split("""").GetValue(1). This gives you the part between the div tags.
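The difference between the two expressions is easiest to see on a fixed string. The sketch below uses the anchor markup from the question rather than the div pattern, purely as an illustration; the module and function names are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupVersusSplit
    ' Sample link copied from the question.
    Public Const Html As String = "<a href=""/hannahotaku"">hannahotaku</a>"
    ' Pattern with one capture group around the username (an assumption for this demo).
    Public Const Pattern As String = "<a href=""/(.+?)"">"

    ' Splits the matched text on the quote character and takes the second piece.
    Public Function ViaSplit(ByVal input As String) As String
        Dim m As Match = Regex.Match(input, Pattern)
        If Not m.Success Then Return Nothing
        Return m.Value.Split(""""c)(1)
    End Function

    ' Reads only what the parentheses in the pattern captured.
    Public Function ViaGroup(ByVal input As String) As String
        Dim m As Match = Regex.Match(input, Pattern)
        If Not m.Success Then Return Nothing
        Return m.Groups(1).Value
    End Function

    Sub Main()
        Console.WriteLine(ViaSplit(Html)) ' keeps the leading slash from the href
        Console.WriteLine(ViaGroup(Html)) ' the username on its own
    End Sub
End Module
```

Reading the capture group avoids depending on where the quotes fall in the matched text, which is why it tends to be the more robust choice.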
To retrieve the matches, try writing them to a file:
Imports System.IO
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Download the page source.
        Dim Request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://statigr.am/tag/anime")
        Dim response As System.Net.HttpWebResponse = Request.GetResponse
        Dim rs As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
        Dim rssourcecode As String = rs.ReadToEnd
        ' Capture the text between the inner div tags.
        Dim r As New System.Text.RegularExpressions.Regex("<div><div>([^<]+)</div>")
        Dim matches As MatchCollection = r.Matches(rssourcecode)
        ' Write one captured value per line; Using closes the file when done.
        Using addInfo As StreamWriter = File.CreateText("c:\Textfile.txt")
            For Each itemcode As Match In matches
                addInfo.WriteLine(itemcode.Groups(1).Value)
            Next
        End Using
    End Sub
End Class