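Importing data from an HTML table into an Access database

Time: 2013-05-30 16:04:02

Tags: html sql api ms-access html-parsing

How can I dynamically populate a database from an HTML table (for example, with market data for the S&P 500)?

I have a Yahoo! Finance account. In the account, I can view financial data as HTML.

I need a simple tool to populate a database (Access) from an HTML table. Where can I find such a tool?

3 answers:

Answer 0: (score: 1)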

You can export the historical data from Yahoo as CSV and link the CSV file directly in Access as an MS Access table. http://office.microsoft.com/en-ca/access-help/import-or-link-to-data-in-a-text-file-HA001232227.aspx
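The linking step can also be scripted instead of done through the Access UI. The following is only a minimal sketch using the built-in DoCmd.TransferText method; the procedure name, the file path, and the linked-table name are hypothetical placeholders, and it assumes the exported CSV has a header row.

```vba
Sub LinkYahooCsv()
    ' Hypothetical values: adjust the path and the linked-table name.
    Const CsvPath = "C:\SomeFolder\table.csv"
    Const LinkedTableName = "YahooHistory"

    ' acLinkDelim creates a linked table over a delimited text file.
    ' The final True tells Access that the first row holds field names.
    DoCmd.TransferText acLinkDelim, , LinkedTableName, CsvPath, True
End Sub
```

Because the table is linked rather than imported, overwriting the CSV file with a fresh export should be enough to refresh the data that Access sees.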

If you want to process the HTML page source instead, this link may help:

http://www.access-programmers.co.uk/forums/showthread.php?p=1145646

Answer 1: (score: 0)

ACE/Jet OLEDB can be used to import data directly from an HTML file. For example, given an existing Access table [DataFromHtml]

ID  LastName
--  --------

and an HTML file containing a table

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
    <title>
        Test Data
    </title>
</head>
<body>
<table>
    <tr>
        <th>
            ID
        </th>
        <th>
            LastName
        </th>
    </tr>
    <tr>
        <td>
            1
        </td>
        <td>
            Thompson
        </td>
    </tr>
    <tr>
        <td>
            2
        </td>
        <td>
            O'Rourke
        </td>
    </tr>
</table>
</body>
</html>

the following VBA code clears the Access table (DELETE FROM) and then imports the HTML table data into it.

Sub ImportFromHtml()
    Const LocalTableName = "DataFromHtml"
    Dim con As Object, rstHtml As Object, fld As Object, _
            cdb As DAO.Database, rstAccdb As DAO.Recordset, _
            recCount As Long

    ' Open the HTML file through the ACE OLEDB provider (late-bound ADO).
    Set con = CreateObject("ADODB.Connection")
    con.Open _
            "Provider=Microsoft.ACE.OLEDB.12.0;" & _
            "Data Source=C:\Users\Gord\Documents\table.htm;" & _
            "Extended Properties=""HTML Import;HDR=YES;IMEX=1"";"
    Set rstHtml = CreateObject("ADODB.Recordset")
    ' The HTML table is addressed by the page title, here "Test Data".
    rstHtml.Open "SELECT * FROM [Test Data]", con

    ' Empty the target table, then open it for appending.
    Set cdb = CurrentDb
    cdb.Execute "DELETE FROM [" & LocalTableName & "]", dbFailOnError
    Set rstAccdb = cdb.OpenRecordset(LocalTableName, dbOpenTable)

    ' Copy each HTML row, matching fields by (trimmed) column name.
    recCount = 0
    Do While Not rstHtml.EOF
        recCount = recCount + 1
        rstAccdb.AddNew
        For Each fld In rstHtml.Fields
            rstAccdb.Fields(Trim(fld.Name)).Value = Trim(fld.Value)
        Next
        Set fld = Nothing
        rstAccdb.Update
        rstHtml.MoveNext
    Loop

    ' Release the DAO objects, then the ADO objects.
    rstAccdb.Close
    Set rstAccdb = Nothing
    Set cdb = Nothing

    rstHtml.Close
    Set rstHtml = Nothing
    con.Close
    Set con = Nothing

    Debug.Print recCount & " record(s) imported"
End Sub

Answer 2: (score: 0)

Assuming the HTML structure from Gord Thompson's answer, here is a very quick way to do it using ADO.

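Public Function GetTitle(ByVal HtmlFile As String) As String
    Dim DOM As Object

    ' MSXML is an XML parser, so this works only if the HTML file is well-formed XML.
    Set DOM = CreateObject("MSXML2.DOMDocument")
    DOM.async = False   ' load synchronously so the document is ready on return
    If Not DOM.Load(HtmlFile) Then
        Err.Raise vbObjectError + 1, "GetTitle", _
                  "Cannot parse " & HtmlFile & ": " & DOM.parseError.reason
    End If
    GetTitle = DOM.getElementsByTagName("title")(0).Text
End Function

Public Sub Import(ByVal Filename As String, ByVal Tablename As String)
    Dim SQL As String
    Dim Title As String
    On Error GoTo Import_Error

    Title = GetTitle(Filename)

    ' Drop any previous copy; ignore the error raised when the table does not exist yet.
    On Error Resume Next
    CurrentProject.Connection.Execute "DROP TABLE " & Tablename
    On Error GoTo Import_Error

    ' SELECT ... INTO creates the table from the HTML table named by the page title.
    SQL = "SELECT * INTO " & Tablename & _
          " FROM [HTML Import;HDR=YES;IMEX=1;DATABASE=" & Filename & "].[" & Title & "]"
    CurrentProject.Connection.Execute SQL

    Exit Sub
Import_Error:
    ' Report the failure instead of swallowing it silently.
    MsgBox "Import failed: " & Err.Description, vbExclamation
End Sub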

So, to import the HTML file "C:\SomeFolder\MyFile.html" into the table "MyImport", use:

Import "C:\SomeFolder\MyFile.html", "MyImport"

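One additional tip: if the title of the HTML file contains special characters such as . or :, the import will fail. You will have to experiment to find out which special characters are problematic and which are not.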
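One possible way around title problems, which is a different technique from the ADO query above: the built-in DoCmd.TransferText method also accepts acImportHTML, and its HTMLTableName argument is optional. As far as the Access documentation describes it, leaving that argument empty imports the first table or list in the file, so the title never has to appear in a query. A minimal sketch under that assumption (ImportFirstHtmlTable is a hypothetical helper name):

```vba
Public Sub ImportFirstHtmlTable(ByVal Filename As String, ByVal Tablename As String)
    ' Hypothetical helper, not part of the answers above.
    ' The sixth argument (HTMLTableName) is omitted, so no title is referenced.
    ' The True argument treats the first table row as field names.
    DoCmd.TransferText acImportHTML, , Tablename, Filename, True
End Sub
```

It would be called the same way as Import, for example ImportFirstHtmlTable "C:\SomeFolder\MyFile.html", "MyImport". This has not been checked against files whose titles contain special characters, so treat it as something to try rather than a confirmed fix.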