How do I read dbf / dbt files directly?

Asked: 2017-08-08 11:27:24

Tags: .net vb.net parsing dbase

Does anyone know how to read data directly from a dBase .DBF / .DBT file set?

Details:

I'm trying to write a parser based on the dBase specification for dbf / dbt files.

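The DBF file is relatively simple: the value in a MEMO field is a sequential block number, and that field's data should start at that block in the DBT file.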
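
For the dBase III layout, that lookup can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `MemoSketch` and `ReadMemoText` are hypothetical names, the DBT file is assumed to be loaded fully into a byte array, the block size is assumed to be 512, and the memo text is assumed to end at the first 0x1A byte. dBase IV and FoxPro memo files lay out their blocks differently.

```vbnet
Imports System
Imports System.Text

Module MemoSketch
    ' Hypothetical helper: returns the memo text that starts at blockNumber.
    ' Assumes the dBase III layout, where the text ends at the first 0x1A byte.
    Public Function ReadMemoText(dbt As Byte(), blockNumber As Integer,
                                 Optional blockSize As Integer = 512) As String
        ' Block 0 is the header, so a valid memo never starts there.
        If blockNumber < 1 Then Return String.Empty

        Dim start As Long = CLng(blockNumber) * blockSize
        If start >= dbt.Length Then Return String.Empty

        ' Find the terminator; fall back to the end of the file if it is missing.
        Dim finish As Integer = Array.IndexOf(Of Byte)(dbt, CByte(&H1A), CInt(start))
        If finish < 0 Then finish = dbt.Length

        Return Encoding.ASCII.GetString(dbt, CInt(start), finish - CInt(start))
    End Function
End Module
```

A memo field read from the DBF as text, such as `"         2"`, would first be trimmed and parsed with `Integer.TryParse`; a blank field means the record has no memo.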

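The DBT file is not defined in depth in the specification. When I parse the DBT file (which, according to the spec, consists of sequential blocks of 512 bytes each, with block 0 being the header block), I see extra bytes interspersed with the record data. Some of them look like "garbage" binary data, and some look like the names of other tables in the database. Because some of this extra data contains letters and digits, simply reading a block's record data is nearly impossible. I can't find a clear definition of this odd data anywhere in the specification. I assume it may be some kind of header data, but it doesn't seem to have a fixed byte width, or even to appear at the same position in every block.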
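
Some of these bytes may belong to the memo format rather than being corruption. Commonly cited descriptions of the format say that dBase III does not reuse memo space (an edited memo that no longer fits is appended to the end of the file, leaving stale blocks behind), and that dBase IV stores the block size in bytes 20-21 of the header and starts each memo with the marker FF FF 08 00. The sketch below checks for those signs. `DbtHeaderSketch` and its functions are hypothetical names, and treating a zero at bytes 20-21 as "512-byte blocks" is an assumption, since the reserved header area of a dBase III file is not guaranteed to be zeroed.

```vbnet
Imports System

Module DbtHeaderSketch
    ' Hypothetical helper: reads a little-endian integer of `count` bytes.
    Private Function ReadLittleEndian(data As Byte(), offset As Integer, count As Integer) As Integer
        Dim value As Integer = 0
        For i As Integer = count - 1 To 0 Step -1
            value = (value << 8) Or data(offset + i)
        Next
        Return value
    End Function

    ' Bytes 0-3 of block 0: number of the next free block.
    Public Function NextFreeBlock(dbt As Byte()) As Integer
        Return ReadLittleEndian(dbt, 0, 4)
    End Function

    ' Bytes 20-21 of block 0 hold the block size in dBase IV files.
    ' Assumption: a zero there means the dBase III default of 512.
    Public Function BlockSize(dbt As Byte()) As Integer
        Dim size As Integer = ReadLittleEndian(dbt, 20, 2)
        Return If(size > 0, size, 512)
    End Function

    ' True if the block begins with the dBase IV memo marker FF FF 08 00.
    Public Function HasDBase4Marker(dbt As Byte(), blockNumber As Integer) As Boolean
        If blockNumber < 1 Then Return False
        Dim offset As Integer = blockNumber * BlockSize(dbt)
        If offset + 4 > dbt.Length Then Return False
        Return dbt(offset) = &HFF AndAlso dbt(offset + 1) = &HFF AndAlso
               dbt(offset + 2) = &H8 AndAlso dbt(offset + 3) = 0
    End Function
End Module
```

If `HasDBase4Marker` returns True for the blocks that the DBF points at, the file is probably not in the plain dBase III layout, and reading fixed 512-byte blocks as text would pick up the binary prefix as "garbage".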

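The sequential block numbers in the DBF file's memo field don't always seem to line up with the actual data. For example, record 2 in the DBF says its memo starts at block 2, but in the DBT file it actually starts at block 6.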
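
One way to rule out a misread DBF as the cause is to locate the records from the binary header instead of from line breaks. The sketch below assumes the standard DBF header layout (record count at bytes 4-7, header length at bytes 8-9, record length at bytes 10-11, all little-endian) and that every record starts with a one-byte deletion flag. `DbfSketch` and `ReadRecords` are hypothetical names, and the file is assumed to be loaded into a byte array.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module DbfSketch
    ' Hypothetical helper: returns each non-deleted record as raw text,
    ' without the leading deletion-flag byte.
    Public Function ReadRecords(dbf As Byte()) As List(Of String)
        ' BitConverter follows the host byte order; this assumes a little-endian host.
        Dim recordCount As Integer = BitConverter.ToInt32(dbf, 4)
        Dim headerLength As Integer = BitConverter.ToUInt16(dbf, 8)
        Dim recordLength As Integer = BitConverter.ToUInt16(dbf, 10)

        Dim records As New List(Of String)
        If recordLength < 1 Then Return records

        For i As Integer = 0 To recordCount - 1
            Dim offset As Integer = headerLength + i * recordLength
            ' Stop if the file is shorter than the header claims.
            If offset + recordLength > dbf.Length Then Exit For
            ' "*" marks a deleted record; a space marks a live one.
            If dbf(offset) = AscW("*"c) Then Continue For
            records.Add(Encoding.ASCII.GetString(dbf, offset + 1, recordLength - 1))
        Next
        Return records
    End Function
End Module
```

With the field widths used in this question (6, 10, 8 and 3 bytes), each returned string would be 27 characters long, and the block number would be `record.Substring(6, 10).Trim()`.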

Does anyone know more about the structure of DBT files? Perhaps there's something I'm missing?

Code sample (VB.Net):

Imports System.Collections.Generic
Imports System.IO

' Holds information about data in the header .dbf file.
Public Class HeaderFileClass
    Public Property AccountNo As String     ' 6 bytes
    Public Property BlockNumber As String   '10 bytes
    Public Property DateInfo As String      '8 bytes
    Public Property EditBy As String        '3 bytes
    Public Sub New()
        AccountNo = String.Empty
        BlockNumber = String.Empty
        DateInfo = String.Empty
        EditBy = String.Empty
    End Sub
    Public Sub New(newAcctNo As String, newBlockNo As String, newDateInf As String, newEditBy As String)
        AccountNo = newAcctNo
        BlockNumber = newBlockNo
        DateInfo = newDateInf
        EditBy = newEditBy
    End Sub
End Class
' Strips a byte array of anything but printable ASCII (32-126) or line feed (10).
Private Function CleanBytes(bytes As Byte()) As Byte()
    Dim kept As New List(Of Byte)(bytes.Length)
    For Each b As Byte In bytes
        If (b >= 32 AndAlso b <= 126) OrElse b = 10 Then
            kept.Add(b)
        End If
    Next
    ' Return only the bytes that were kept, with no trailing zero padding.
    Return kept.ToArray()
End Function
Private Sub ParseFile()
    Dim fileName As String = "C:\dbbackup\Schalls\Schalls_CleanLegacy\Schall_Clean_DATA\PATNOTES"       ' data location.
    Const RECORDSIZE As Integer = 28                ' 1 deletion-flag byte + 6 + 10 + 8 + 3.
    Dim BLOCKSIZE As Integer = 512                  ' Default block size.
    Dim bytes As Byte() = Nothing                   ' bytes to be read from dbt file.
    Dim buffer(RECORDSIZE - 1) As Char              ' buffer to use for reading dbf file (exactly one record).
    Dim hList As New List(Of HeaderFileClass)       ' DBF header data storage.
    Dim lstData As New List(Of Byte())              ' DBT block data storage.

    'header file load
    Using inFile As New StreamReader(File.Open(fileName & ".DBF", FileMode.Open))
        ' read DBF header lines.
        inFile.ReadLine()
        inFile.ReadLine()

        ' read DBF data, stopping at the first incomplete record.
        While inFile.ReadBlock(buffer, 0, RECORDSIZE) = RECORDSIZE
            Dim strBuf As New String(buffer)
            Dim acctNo As String = strBuf.Substring(1, 6)           ' char 0 is the record's deletion flag.
            Dim blockNo As String = strBuf.Substring(7, 10).Trim
            Dim dateInfo As String = strBuf.Substring(17, 8)
            Dim editBy As String = strBuf.Substring(25, 3)
            hList.Add(New HeaderFileClass(acctNo, blockNo, dateInfo, editBy))
        End While
    End Using


    'memo file load
    Using inFile As New BinaryReader(File.Open(fileName & ".DBT", FileMode.Open))
        ' read data sequentially by blocksize.
        Do
            bytes = inFile.ReadBytes(BLOCKSIZE)
            If bytes.Length > 0 Then
                lstData.Add(bytes)
            End If
        Loop While bytes.Length > 0
    End Using

    If hList.Count >= 2 Then
        For i As Integer = 0 To hList.Count - 2
            Dim h As HeaderFileClass = hList(i)             ' current record (contains block number to start).
            Dim h2 As HeaderFileClass = hList(i + 1)        ' next record (contains next starting block number).
            Dim intFrom As Integer                          ' starting block number.
            Dim intTo As Integer                            ' next record's starting block number.
            Dim sbStr As New System.Text.StringBuilder      ' output string.

            ' a blank memo field means the record has no memo, so skip the pair.
            If Not Integer.TryParse(h.BlockNumber, intFrom) Then Continue For
            If Not Integer.TryParse(h2.BlockNumber, intTo) Then Continue For

            ' read the bytes, keep only text data, and never index past the last block read.
            For j As Integer = intFrom To Math.Min(intTo, lstData.Count) - 1
                sbStr.Append(System.Text.Encoding.ASCII.GetString(CleanBytes(lstData(j))))
            Next
            Debug.Print(sbStr.ToString)
        Next
    End If
End Sub

1 Answer:

Answer 0 (score: 0)

You need to use OLEDB, for example:

Imports System.Data.OleDb
Public Class Form1
    Private FileName As String = IO.Path.Combine(Application.StartupPath, "Customer.dbf")
    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        If IO.File.Exists(FileName) Then
            Dim Builder As New OleDbConnectionStringBuilder With
                {
                    .DataSource = IO.Path.GetDirectoryName(FileName),
                    .Provider = "Microsoft.Jet.OLEDB.4.0"
                }
            Builder.Add("Extended Properties", "dBase III")
            Using cn As New OleDb.OleDbConnection With {.ConnectionString = Builder.ConnectionString}
                Using cmd As New OleDbCommand With {.Connection = cn}
                    cmd.CommandText = "SELECT * FROM " & IO.Path.GetFileName(FileName)
                    cn.Open()
                    Dim dt As New DataTable
                    dt.Load(cmd.ExecuteReader)
                End Using
            End Using
        End If
    End Sub
End Class