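I'm still a beginner, but I can read simple HTML structures.
However, on the site https://stockrow.com/AAPL/financials/income/annual I tried to use XMLHttpRequest to pull the data into Excel, and the source I get back is missing the important table that contains all the key values. When I inspect the page in the browser, I can see the entire HTML structure.
This is the source data I get: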
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="apple-touch-icon-precomposed" sizes="57x57"
href="/favicons/apple-touch-icon-57x57.png" />
<link rel="apple-touch-icon-precomposed" sizes="114x114"
href="/favicons/apple-touch-icon-114x114.png" />
<link rel="apple-touch-icon-precomposed" sizes="72x72"
href="/favicons/apple-touch-icon-72x72.png" />
<link rel="apple-touch-icon-precomposed" sizes="144x144"
href="/favicons/apple-touch-icon-144x144.png" />
<link rel="apple-touch-icon-precomposed" sizes="60x60"
href="/favicons/apple-touch-icon-60x60.png" />
<link rel="apple-touch-icon-precomposed" sizes="120x120"
href="/favicons/apple-touch-icon-120x120.png" />
<link rel="apple-touch-icon-precomposed" sizes="76x76"
href="/favicons/apple-touch-icon-76x76.png" />
<link rel="apple-touch-icon-precomposed" sizes="152x152"
href="/favicons/apple-touch-icon-152x152.png" />
<link rel="icon" type="image/png" href="/favicons/favicon-196x196.png"
sizes="196x196" />
<link rel="icon" type="image/png" href="/favicons/favicon-96x96.png"
sizes="96x96" />
<link rel="icon" type="image/png" href="/favicons/favicon-32x32.png"
sizes="32x32" />
<link rel="icon" type="image/png" href="/favicons/favicon-16x16.png"
sizes="16x16" />
<link rel="icon" type="image/png" href="/favicons/favicon-128.png"
sizes="128x128" />
<meta name="application-name" content="stockrow.com"/>
<meta name="msapplication-TileColor" content="#FFFFFF" />
<meta name="msapplication-TileImage" content="/favicons/mstile-144x144.png"
/>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<link href="https://code.cdn.mozilla.net/fonts/fira.css" rel="stylesheet" type="text/css" />
<script src="https://www.google.com/recaptcha/api.js"></script>
<script src="https://cdn.ravenjs.com/3.15.0/raven.min.js"></script>
<script>Raven.config('https://3ce523a8252c436f83c6fc423b340c0a@sentry.io/144901').install()</script>
<meta name="csrf-param" content="authenticity_token" />
<link rel="stylesheet" media="screen" href="/packs/stockrow-aa9c6f09f554179248530de2e33baa9b.css" />
<script src="/packs/stockrow-a35b20c51d525016f7c7.js"></script>
<script async id="_ck_381101" src="https://forms.convertkit.com/381101?v=7"></script>
I don't know how to solve this, so I thought I would try Stack Overflow.
Answer 0 (score: 0)
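If you only need the data the website displays, you can use VBA to open an IE instance and have IE look up the data for you. It's a bit of a hack, but it will get the job done.
Basically, inspect the website with your browser and see which elements contain the data you want. In your VBA script, you can then have VBA collect the data contained in those elements.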
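The approach above could be sketched roughly as follows. This is only a minimal illustration, not tested against the site: it assumes Internet Explorer is still available on the machine, that the figures end up in a `<table>` element (inspect the page to confirm the real element), and that the table is filled in by JavaScript after the initial load, which would explain why the raw response lacks it. The routine name `ScrapeWithIE` and the 15-second timeout are arbitrary choices.

```vba
Option Explicit

' Hypothetical helper: opens the page in IE and prints the first table's text.
Public Sub ScrapeWithIE()
    Dim ie As Object
    Dim tables As Object
    Dim deadline As Date

    ' Late binding, so no reference to Microsoft Internet Controls is required.
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.Navigate "https://stockrow.com/AAPL/financials/income/annual"

    ' Wait for the initial document to load (4 = READYSTATE_COMPLETE).
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop

    ' Assumption: the table is rendered by script, so poll for it for up to 15 seconds.
    deadline = Now + TimeSerial(0, 0, 15)
    Do
        DoEvents
        Set tables = ie.Document.getElementsByTagName("table")
    Loop While tables.Length = 0 And Now < deadline

    If tables.Length > 0 Then
        ' Assumption: the first table on the page is the one of interest.
        Debug.Print tables.Item(0).innerText
    Else
        Debug.Print "No table element found before the timeout."
    End If

    ie.Quit
End Sub
```

If the data turns out to live in other elements, swap the `getElementsByTagName("table")` call for whatever selector the browser inspector shows.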
Answer 1 (score: 0)
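Take a closer look at the page's HTML and you will find that an xlsx file is available for download. In fact, you can just copy the URL from that element's href and pass it to URLMon to download the file directly.
Snippet: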
<a class="button hollow expanded" href="/api/companies/AAPL/financials.xlsx?dimension=MRY&section=Income Statement" target="_blank">Export to Excel (.xlsx)</a>
The href is relative, so you need to prepend the host domain (https://stockrow.com).
VBA:
Option Explicit

' PtrSafe and LongPtr are available in every VBA7 host, 32-bit or 64-bit.
#If VBA7 Then
    Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As LongPtr _
        ) As Long
#Else
    Private Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long _
        ) As Long
#End If

Public Const BINDF_GETNEWESTVERSION As Long = &H10
' Despite its name, folderName holds the full path of the output file.
Public Const folderName As String = "C:\Users\HarrisQ\Desktop\info.xlsx" '<=Change as required

' Despite its name, this routine downloads the .xlsx export.
Public Sub downloadPDF()
    Dim ret As Long
    ' The space in "Income Statement" is percent-encoded as %20.
    ret = URLDownloadToFile(0, "https://stockrow.com/api/companies/AAPL/financials.xlsx?dimension=MRY&section=Income%20Statement", folderName, BINDF_GETNEWESTVERSION, 0)
    ' URLDownloadToFile returns 0 (S_OK) on success.
    If ret <> 0 Then Debug.Print "Download failed, HRESULT: &H" & Hex$(ret)
End Sub
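Once the file is on disk, getting the figures into the current workbook might look something like the sketch below. It is not part of the original answer: the routine name `ImportDownloadedWorkbook` is made up for illustration, and it assumes the export keeps its data on the first worksheet.

```vba
' Hypothetical follow-up step: copy the first sheet of the downloaded file
' into the workbook that contains this macro.
Public Sub ImportDownloadedWorkbook(ByVal filePath As String)
    Dim wb As Workbook

    ' Stop early if the download did not produce a file.
    If Len(Dir(filePath)) = 0 Then
        Debug.Print "File not found: " & filePath
        Exit Sub
    End If

    Set wb = Workbooks.Open(Filename:=filePath, ReadOnly:=True)
    ' Assumption: the income statement is on the first worksheet of the export.
    wb.Worksheets(1).Copy After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)
    wb.Close SaveChanges:=False
End Sub
```

It could be called right after the download, for example with `ImportDownloadedWorkbook folderName`.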