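Hi all. The code below compares the contents of two text files and writes the results to files. It works correctly, but my problem is that the files have a lot of lines (over 80,000) and the code runs unacceptably slowly. Please give me some ideas.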
    Public Class Form1
        Const TEST1 = "D:\a.txt"
        Const TEST2 = "D:\b.txt"
        Public file1 As New Dictionary(Of String, String)
        Public file2 As New Dictionary(Of String, String)
        Public text1 As String()
        Public i As Integer

        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
            'Declare two dictionaries. The key for each will be the text from the input line up to,
            'but not including the first ",". The values for each will be the entire input line.
            'Dim file1 As New Dictionary(Of String, String)
            'Dim file2 As New Dictionary(Of String, String)
            'Dim text1 As String()
            For Each line As String In System.IO.File.ReadAllLines(TEST1)
                Dim part() As String = line.Split(",")
                file1.Add(part(0), line)
            Next
            For Each line As String In System.IO.File.ReadAllLines(TEST2)
                Dim part() As String = line.Split(",")
                file2.Add(part(0), line)
            Next
            ' AddText("The following lines from " & TEST2 & " are also in " & TEST1)
            For Each key As String In file1.Keys
                If file2.ContainsKey(key) Then
                    TextBox1.Text &= (file1(key)) & vbCrLf
                    MsgBox(file2(key))
                    Label1.Text = file1(key)
                Else
                    TextBox2.Text &= (file1(key)) & vbCrLf
                End If
            Next
            text1 = TextBox1.Lines
            IO.File.WriteAllLines("D:\Same.txt", text1)
            text1 = TextBox2.Lines
            IO.File.WriteAllLines("D:\Differrent.txt", text1)
        End Sub
    End Class
Answer 0 (score: 2)
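The first thing I would change is the use of a Dictionary. I would use a HashSet instead, because the comparison only needs to know whether a key exists, not the line stored under it. See HashSet versus Dictionary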
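As a rough illustration of that difference, here is a minimal console sketch. The module name and the sample lines are made up for the example and are not taken from the question. It shows that HashSet.Add simply returns False for a key it already holds, while Dictionary.Add throws an ArgumentException for a repeated key.

```vbnet
Imports System
Imports System.Collections.Generic

Module HashSetVersusDictionarySketch
    Sub Main()
        ' Hypothetical sample lines; the key is the text before the first ",".
        Dim sampleLines As String() = {"1001,apple", "1002,pear", "1001,apple again"}

        ' A HashSet stores only the keys. Add returns False for a key
        ' that is already present instead of throwing.
        Dim keys As New HashSet(Of String)
        For Each line As String In sampleLines
            Dim key As String = line.Split(","c)(0)
            If Not keys.Add(key) Then
                Console.WriteLine("Duplicate key skipped: " & key)
            End If
        Next

        ' A membership test is all the comparison needs.
        Console.WriteLine(keys.Contains("1002"))
        Console.WriteLine(keys.Contains("9999"))

        ' Dictionary.Add, as used in the question, throws on a repeated key.
        Dim byKey As New Dictionary(Of String, String)
        Try
            For Each line As String In sampleLines
                byKey.Add(line.Split(","c)(0), line)
            Next
        Catch ex As ArgumentException
            Console.WriteLine("Dictionary.Add rejected a duplicate key")
        End Try
    End Sub
End Module
```

If the input files can contain the same key more than once, the HashSet version keeps running where the Dictionary version would stop with an exception.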
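Then I would change the ReadAllLines loops. ReadAllLines loads every line into memory before the loop starts, whereas ReadLines does not read all the lines up front, so you can start working on each line immediately.
See What's the fastest way to read a text file line-by-line?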
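A small sketch of the two calls side by side, assuming .NET Framework 4 or later (where File.ReadLines is available). The temporary file and its three lines are placeholders invented for the example. How much the difference matters for a file of a given size is something to measure rather than assume.

```vbnet
Imports System
Imports System.IO

Module ReadLinesSketch
    Sub Main()
        ' Hypothetical temp file standing in for the real input file.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllLines(tempFile, New String() {"1,a", "2,b", "3,c"})

        ' ReadAllLines returns a String() that already holds the whole file.
        Dim allLines As String() = File.ReadAllLines(tempFile)
        Console.WriteLine("Lines loaded up front: " & allLines.Length)

        ' ReadLines returns a lazy IEnumerable(Of String): each line is read
        ' as the loop advances, so the loop can stop early.
        For Each line As String In File.ReadLines(tempFile)
            If line.StartsWith("2,") Then
                Console.WriteLine("Found: " & line)
                Exit For
            End If
        Next

        ' For Each disposes the enumerator on exit, which closes the file.
        File.Delete(tempFile)
    End Sub
End Module
```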
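The third point is to switch the order in which the files are read. Read the TEST2 file first, then TEST1. That way, while you are loading the TEST1 lines, you can immediately check whether the file2 HashSet contains the key. If it does, add the line to the list of found lines; otherwise add it to the list of lines not found.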
    Dim TEST1 = "D:\temp\test3.txt"
    Dim TEST2 = "D:\temp\test6.txt"

    ' Collect only the keys (the text before the first ",") from the second file.
    Dim file2Keys As New HashSet(Of String)
    For Each line As String In System.IO.File.ReadLines(TEST2)
        Dim parts = line.Split(","c)
        file2Keys.Add(parts(0))
    Next

    ' Stream the first file and sort each line into "found" or "not found".
    Dim listFound As New List(Of String)()
    Dim listNFound As New List(Of String)()
    For Each line As String In System.IO.File.ReadLines(TEST1)
        Dim parts = line.Split(","c)
        If file2Keys.Contains(parts(0)) Then
            listFound.Add(line)
        Else
            listNFound.Add(line)
        End If
    Next

    ' Write each result list to its output file in a single call.
    IO.File.WriteAllText("D:\temp\Same.txt", String.Join(Environment.NewLine, listFound.ToArray()))
    IO.File.WriteAllText("D:\temp\Differrent.txt", String.Join(Environment.NewLine, listNFound.ToArray()))