我有一个文本文件,它看起来像......
Blade Runner 2049 http://www.imdb.com/title/tt1856101
Kingsman: The Golden Circle http://www.imdb.com/title/tt4649466
The Crucifixion http://www.imdb.com/title/tt4181782/
我的代码在文本文件中找到所有行“http://www.imdb.com/title”,并复制“http://www.imdb.com/title”之前的前一个单词(电影名称)并将其粘贴到Excel单元格中。
Sub GetText()
Dim fName As String, Word1 As String, Word2 As String, i As Long, s As String, st As String
fName = "C:\Test\test1.txt"
st = "http://www.imdb.com/title"
Open fName For Input As #1
Do Until EOF(1)
Word1 = Word2
Input #1, Word2
If (Word2 = st & ">") Or (Word2 Like st & "/*") Then
If Trim$(Word1) <> "" Then i = i + 1: Cells(i, 1) = Word1
ElseIf Word2 Like "* " & st & "/*" Then
Word1 = Trim$(Split(Word2)(1))
If Trim$(Word1) <> "" Then i = i + 1: Cells(i, 1) = Word1
End If
Loop
Close #1
End Sub
但是此代码仅粘贴电影名称中的第一个单词。如何更改以粘贴完整的电影名称?
答案 0 :(得分:3)
一种非常简单的方法是使用Split()
function。
Sub Test()
Dim OrigStr$, YourMovie$
OrigStr = "Kingsman: The Golden Circle http://www.imdb.com/title/tt4649466"
YourMovie = Split(OrigStr, " http:")(0)
MsgBox YourMovie
End Sub
函数末尾的(0)
表示您希望整个字符串 previous 到找到的单词。相反,使用(1)
表示在找到的单词(“http:”)的第一次迭代之后需要字符串,(2)
表示在该工作的第二次迭代之后的字符串等等。
请注意:您仍然可以使用Split()
而不使用(i)
,(Split()
,而不是Split()(i)
)。当您使用此方法时,实际上是将值返回到数组而不是字符串。
如果您要将值返回到数组,则上面是另一个示例:
Sub Test()
Dim OrigStr$, OrigStrArr$(), YourMovie$
OrigStr = "Kingsman: The Golden Circle http://www.imdb.com/title/tt4649466"
OrigStrArr = Split(OrigStr, " http:")
YourMovie = OrigStrArr(0)
MsgBox YourMovie
End Sub
答案 1 :(得分:1)
好像你可以在Excel中打开文件并删除URL部分(未测试):
Workbooks.Open "C:\Test\test1.txt"
Cells.Replace " http://www.imdb.com/title/*", "", xlPart
同样,只获取网址:
Cells.Replace "* http://www.imdb.com/title/", "http://www.imdb.com/title/", xlPart
答案 2 :(得分:1)
我会像在工作表中那样解析它:
Sub dural()
Dim st As String, s As String, MovieName As String
st = "http://www.imdb.com/title"
s = "Blade Runner 2049 http://www.imdb.com/title/tt1856101"
MovieName = ""
If InStr(1, s, st) > 0 Then
With Application.WorksheetFunction
MovieName = Left(s, .Find(st, s) - 1)
End With
End If
MsgBox MovieName
End Sub
答案 3 :(得分:1)
这个使用正则表达式:
Sub GetText()
Dim fName As String
Dim i As Long
Dim FileContents As String
Dim collMatches As Collection
fName = "C:\Test\test1.txt"
Open fName For Input As #1
FileContents = Input(LOF(1), 1)
Close 1
Set collMatches = GetRegexMatches(FileContents, "^.*(?=http)")
Debug.Print collMatches.Count
For i = 1 To collMatches.Count
Cells(i, 1) = collMatches(i)
Next i
End Sub
Function GetRegexMatches(inputstring As String, SearchPattern As String, _
Optional boolIgnoreCase As Boolean = True, Optional boolGlobal As Boolean = True, Optional boolMultiline As Boolean = True, _
Optional UniqueMatches As Boolean = False) As Collection
Dim Regex As Object
Dim rgxMatch As Object
Dim rgxMatches As Object
Dim collMatches As Collection
Dim collUniqueMatches As Collection
Set Regex = CreateObject("vbscript.regexp")
With Regex
'search for any integer matches
'"\d+" is the same as "[0-9]+"
.Pattern = SearchPattern
.IgnoreCase = boolIgnoreCase
'Find all matches, not just the first
.Global = boolGlobal
'^ and $ work per-line, not just at begin and end of file
.MultiLine = boolMultiline
'built-in test for matches
Set collMatches = New Collection
Set collUniqueMatches = New Collection
If .test(inputstring) Then
'if matches, create a collection of them
Set rgxMatches = .Execute(inputstring)
For Each rgxMatch In rgxMatches
collMatches.Add rgxMatch
On Error Resume Next
collUniqueMatches.Add rgxMatch, rgxMatch
On Error GoTo 0
Next rgxMatch
End If
End With
If UniqueMatches Then
Set GetRegexMatches = collUniqueMatches
Else
Set GetRegexMatches = collMatches
End If
Set Regex = Nothing
End Function