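I'm trying to assign the number 11101973 from this HTML file to a variable, but I need a way to get just that number, without any of the surrounding information: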
Sub ListFiles()
    Dim FSO As Object
    Dim FSO_Folder As Object
    Dim myPath$
    Dim Obj
    Dim Str$
    Dim k1 As Long
    myPath$ = "C:\Users\jim\Desktop\UIAutomation_VBA-master"
    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set FSO_Folder = FSO.GetFolder(myPath)
    For Each Obj In FSO_Folder.Files
        Str$ = Obj.Path
    Next Obj
End Sub

Sub ReadFiles()
    Dim FSO As Object
    Dim FSO_Folder As Object
    Dim myPath$
    Dim Obj
    Dim Str$
    Dim k1 As Long
    myPath$ = "C:\Users\jim\Desktop\UIAutomation_VBA-master"
    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set FSO_Folder = FSO.GetFolder(myPath)
    Do
        k1 = 0
        For Each Obj In FSO_Folder.Files
            k1 = k1 + AccessRight(Obj.Path)
        Next Obj
        DoEvents
    Loop Until k1 = FSO_Folder.Files.Count
End Sub

Function AccessRight(ByVal FilePath As String) As Long
    On Error GoTo The_end
    AccessRight = 0
    Open FilePath For Binary Lock Read Write As #1
    Close #1
    AccessRight = 1
The_end:
End Function
If more information is needed, the page source is here: view-source:https://www.kickz.com/uk/jordan-basketball-retro-air-jordan-1-retro-high-og-black_varsity_red_sail_university_blue-107840036 Any help is appreciated!
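Answer 0 (score: 2)
BeautifulSoup is meant for parsing HTML elements, not JavaScript variables. There are a few JavaScript parsers out there, but for a simple task like this I prefer a regex.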
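import re
import requests

url = 'https://www.kickz.com/uk/jordan-basketball-retro-air-jordan-1-retro-high-og-black_varsity_red_sail_university_blue-107840036'
page = requests.get(url).text
# Capture the digits passed to collectAskInput(...)
theNumber = re.search(r'collectAskInput\((\d+)\)', page).group(1)
print(theNumber)
# 11101973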
It searches for the number in:
onclick="return ProductDetails.collectAskInput(11101973)
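The regex step can be tried without hitting the network. The following is a minimal sketch, not the answerer's exact code: the helper name `extract_ask_id` and the `<a>` wrapper in the sample string are made up for illustration, and only the `onclick` fragment comes from the page source quoted above. It returns `None` instead of raising when the pattern is missing.

```python
import re
from typing import Optional

# Matches the numeric argument of collectAskInput(...),
# allowing optional whitespace around the digits.
ASK_INPUT_RE = re.compile(r'collectAskInput\(\s*(\d+)\s*\)')


def extract_ask_id(html: str) -> Optional[int]:
    """Return the number passed to collectAskInput, or None if absent."""
    match = ASK_INPUT_RE.search(html)
    return int(match.group(1)) if match else None


# Hypothetical sample markup built around the onclick fragment from the page.
sample = '<a onclick="return ProductDetails.collectAskInput(11101973)">Ask</a>'

print(extract_ask_id(sample))             # 11101973
print(extract_ask_id('<p>no match</p>'))  # None
```

Checking for `None` before using the result avoids the `AttributeError` that `.group(1)` raises when the page layout changes and the search finds nothing.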
Answer 1 (score: 0)
The value sits inside a script tag in the page source, so you can pull out the dictionary-like string and parse it.
import requests
import bs4
import json

url = 'https://www.kickz.com/uk/jordan-basketball-retro-air-jordan-1-retro-high-og-black_varsity_red_sail_university_blue-107840036'
response = requests.get(url)
soup = bs4.BeautifulSoup(response.text, 'html.parser')
scripts = soup.find_all('script')

jsonObj = None
for script in scripts:
    if 'ec:addProduct' in script.text:
        jsonStr = script.text
        jsonStr = jsonStr.split("ga('ec:addProduct',")[1]
        jsonStr = jsonStr.split(");")[-4]
        jsonStr = jsonStr.replace("'", '"')
        jsonObj = json.loads(jsonStr)

id_var = jsonObj['id']
print(id_var)
Output:
107840036
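The hard-coded `split(");")[-4]` index depends on how many statements follow the call in that script. A possible alternative, sketched here under assumptions, is to capture the object literal directly with a regex. The helper name `extract_product` and the sample script text are hypothetical; the real page's script layout has not been checked, and the quote swap still breaks if any value contains a quote character.

```python
import json
import re

# Captures the {...} argument of ga('ec:addProduct', {...}).
# Non-greedy, so it stops at the first "}" that is followed by ")".
ADD_PRODUCT_RE = re.compile(r"ga\('ec:addProduct',\s*(\{.*?\})\s*\)", re.DOTALL)


def extract_product(script_text):
    """Return the addProduct payload as a dict, or None if not found."""
    match = ADD_PRODUCT_RE.search(script_text)
    if match is None:
        return None
    # Same single-to-double quote swap as in the answer above.
    return json.loads(match.group(1).replace("'", '"'))


# Hypothetical script content, shaped like a typical analytics snippet.
sample_script = (
    "ga('require', 'ec');\n"
    "ga('ec:addProduct', {'id': '107840036', 'name': 'sample product'});\n"
    "ga('send', 'pageview');"
)

product = extract_product(sample_script)
print(product['id'])  # 107840036
```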