I have the following code, and the first load is very slow. The CSV file is about 4 MB and 16,000 rows.
If Session("tb") Is Nothing Then
    Dim str As String()
    If (IsNothing(Cache("csvdata"))) Then
        str = File.ReadAllLines(Server.MapPath("~/test/feed.csv"))
        Cache.Insert("csvdata", str, Nothing, DateTime.Now.AddHours(12), TimeSpan.Zero)
    Else
        str = CType(Cache("csvdata"), Array)
    End If
    Dim dt As New DataTable
    dt.Columns.Add("Shape", GetType(System.String))
    dt.Columns.Add("Weight", GetType(System.Double))
    dt.Columns.Add("Color", GetType(System.String))
    dt.Columns.Add("Clarity", GetType(System.String))
    dt.Columns.Add("Price", GetType(System.Int32))
    dt.Columns.Add("CutGrade", GetType(System.String))
    For i As Integer = 1 To str.Length - 1
        Dim pattern As String = ",(?=([^""]*""[^""]*"")*[^""]*$)"
        Dim rgx As New Regex(pattern)
        Dim t As String = rgx.Replace(str(i), "\")
        Dim s As String() = t.Split("\"c)
        Dim pr As Int32 = CType(s(5), Int32)
        Dim fpr As Int32
        Dim rate As Double
        Select Case pr
            Case Is < 300
                rate = 2
            Case 301 To 600
                rate = 1.7
            Case Is > 600
                rate = 1.16
        End Select
        fpr = Math.Round(pr * rate)
        Dim a As String() = {s(1), s(2), s(3), s(4), fpr, s(40)}
        dt.Rows.Add(a)
    Next
    Session("tb") = dt
    ListView1.DataSource = dt
    ListView1.DataBind()
Else
    Dim x As DataTable = CType(Session("tb"), DataTable)
    ListView1.DataSource = x
    ListView1.DataBind()
End If
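The CSV file is cached, which I believe means it is shared across all users (so only one person pays the load cost every 12 hours). Once the Session has been created, the page also loads quickly. So building the DataTable seems to be the slow part. This is my first time working with DataTables, and I'm sure someone can point out what I'm doing wrong.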
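Two details in the loop above may be worth a look. A new Regex is constructed for every one of the 16,000 rows, and a price of exactly 300 matches none of the Select Case branches, so its rate stays 0. The following is only a sketch of how both could be addressed while keeping the original splitting approach. The module and function names are hypothetical, and placing 300 in the 1.7 tier is an assumption about the intended pricing.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module RowParsingSketch
    ' Built once and reused for every row. The pattern is the one from the question.
    Private ReadOnly CommaOutsideQuotes As New Regex(
        ",(?=([^""]*""[^""]*"")*[^""]*$)", RegexOptions.Compiled)

    ' Same approach as the question: swap unquoted commas for "\" and split on it.
    ' Like the original, this assumes no field contains a backslash.
    Function SplitCsvLine(line As String) As String()
        Return CommaOutsideQuotes.Replace(line, "\").Split("\"c)
    End Function

    ' Same rates as the question, but every price lands in a branch.
    ' Assumption: a price of exactly 300 belongs to the 1.7 tier.
    Function MarkupRate(price As Integer) As Double
        Select Case price
            Case Is < 300
                Return 2.0
            Case 300 To 600
                Return 1.7
            Case Else
                Return 1.16
        End Select
    End Function

    Sub Main()
        Dim fields As String() = SplitCsvLine("9349,Round,1.74,F,VVS1,13650.00")
        Dim price As Integer = CInt(Double.Parse(fields(5), CultureInfo.InvariantCulture))
        Console.WriteLine(CInt(Math.Round(price * MarkupRate(price))))
    End Sub
End Module
```

Whether the regex construction is the dominant cost here has not been measured; it is simply the most visible per-row allocation in the loop.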
Thanks
Update
I have changed the Cache to store the raw DataTable instead of the CSV file. It now loads quickly, but I wonder whether this is a bad idea.
Cache.Insert("csvdata", dt, Nothing, DateTime.Now.AddHours(12), TimeSpan.Zero)
Once it is stored in the Cache, I can run queries against it with LINQ.
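As a rough illustration of querying such a table with LINQ, here is a minimal sketch. The column layout is the one from the question, and the two prices are the sample rows' prices after the 1.16 markup. The module and function names are hypothetical. It assumes the project can use the DataTable LINQ extensions (AsEnumerable and Field, which live in System.Data.DataSetExtensions on .NET Framework). The DataTable documentation describes the type as safe for multithreaded reads only, so a table shared through the Cache is best treated as read-only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module DataTableLinqSketch
    ' Builds a tiny table with the same columns as the question.
    Function BuildSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("Shape", GetType(String))
        dt.Columns.Add("Weight", GetType(Double))
        dt.Columns.Add("Color", GetType(String))
        dt.Columns.Add("Clarity", GetType(String))
        dt.Columns.Add("Price", GetType(Integer))
        dt.Columns.Add("CutGrade", GetType(String))
        dt.Rows.Add("Round", 1.74, "F", "VVS1", 15834, "Very Good")
        dt.Rows.Add("Round", 1.0, "I", "VVS1", 7028, "Very Good")
        Return dt
    End Function

    ' Read-only query: prices of stones at or above a weight, cheapest first.
    Function PricesAtOrAbove(dt As DataTable, minWeight As Double) As List(Of Integer)
        Return (From row In dt.AsEnumerable()
                Where row.Field(Of Double)("Weight") >= minWeight
                Order By row.Field(Of Integer)("Price")
                Select row.Field(Of Integer)("Price")).ToList()
    End Function

    Sub Main()
        For Each price As Integer In PricesAtOrAbove(BuildSampleTable(), 1.5)
            Console.WriteLine(price)
        Next
    End Sub
End Module
```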
Sample CSV (first 3 rows)
Supplier ID,Shape,Weight,Color,Clarity,Price / Carat,Lot Number,Stock Number,Lab,Cert #,Certificate Image,2nd Image,Dimension,Depth %,Table %,Crown Angle,Crown %,Pavilion Angle,Pavilion %,Girdle Thinnest,Girdle Thickest,Girdle %,Culet Size,Culet Condition,Polish,Symmetry,Fluor Color,Fluor Intensity,Enhancements,Remarks,Availability,Is Active,FC-Main Body,FC- Intensity,FC- Overtone,Matched Pair,Separable,Matching Stock #,Pavilion,Syndication,Cut Grade,External Url
9349,Round,1.74,F,VVS1,13650.00,,IM-95-188-243,ABC,11228,,,7.81|7.85|4.62,59.00,62.00,34.00,13.00,,,Medium,,0,None,,Excellent,Very Good,Blue,Medium,,"",Not Specified,Y,,,,False,True,,,,Very Good,http://www.test/teste.
9949,Round,1.00,I,VVS1,6059.00,,IM-95-189-C021,ABC,212197,,,6.37|6.42|3.96,61.90,54.00,34.50,16.00,,,Thin,Slightly Thick,0,None,,Excellent,Good,,None,,"Additional pinpoints are not shown.",Guaranteed Available,Y,,,,False,True,,,,Very Good,http://www.test/test.
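Answer 0 (score: 0)
Use TextFieldParser to read the CSV instead of splitting the strings yourself.
Also, if you use a List(Of CustomClass), where CustomClass has properties such as Shape, Weight, Color, and so on, you avoid the unnecessary overhead of a DataTable and can still run LINQ queries against the list.
Pardon my C#; I don't have VB.NET installed on this box.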
using System.Collections.Generic;
using System.Globalization;
using Microsoft.VisualBasic.FileIO; // requires a reference to the Microsoft.VisualBasic assembly

public class Gemstone
{
    public string Shape { get; set; }
    public double Weight { get; set; }
    public string Color { get; set; }
}

public static class Program
{
    static void Main(string[] args)
    {
        // allocate the list with your best calculated guess of its final size
        List<Gemstone> list = new List<Gemstone>(16000);

        using (TextFieldParser textFieldParser = new TextFieldParser("data.txt"))
        {
            textFieldParser.Delimiters = new string[] { "," };
            textFieldParser.ReadLine(); // skip header line

            while (!textFieldParser.EndOfData)
            {
                string[] fields = textFieldParser.ReadFields();
                Gemstone gemstone = new Gemstone();
                gemstone.Shape = fields[1];
                // parse with the invariant culture so "1.74" reads the same on any machine
                gemstone.Weight = double.Parse(fields[2], CultureInfo.InvariantCulture);
                gemstone.Color = fields[3];
                list.Add(gemstone);
            }
        }
    }
}
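Since the question is in VB.NET, the same idea might look roughly like this there. This is a sketch, not a drop-in replacement: the ReadGemstones helper and the module name are hypothetical, only the three columns from the C# example are mapped, and the parser is fed from a TextReader so the example runs without a file on disk.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports System.Linq
Imports Microsoft.VisualBasic.FileIO

Public Class Gemstone
    Public Property Shape As String
    Public Property Weight As Double
    Public Property Color As String
End Class

Module GemstoneLoader
    ' Parses CSV text whose first line is a header.
    ' Columns 1 to 3 are Shape, Weight, Color, as in the sample rows.
    Function ReadGemstones(reader As TextReader) As List(Of Gemstone)
        Dim list As New List(Of Gemstone)()
        Using parser As New TextFieldParser(reader)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            If Not parser.EndOfData Then parser.ReadLine() ' skip header line
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                list.Add(New Gemstone With {
                    .Shape = fields(1),
                    .Weight = Double.Parse(fields(2), CultureInfo.InvariantCulture),
                    .Color = fields(3)})
            End While
        End Using
        Return list
    End Function

    Sub Main()
        Dim csv As String = "Supplier ID,Shape,Weight,Color" & Environment.NewLine &
                            "9349,Round,1.74,F" & Environment.NewLine &
                            "9949,Round,1.00,I"
        Dim stones As List(Of Gemstone) = ReadGemstones(New StringReader(csv))
        ' LINQ works directly on the list; no DataTable is involved.
        Dim heavy As List(Of Gemstone) = stones.Where(Function(g) g.Weight > 1.5).ToList()
        Console.WriteLine(heavy.Count)
    End Sub
End Module
```

In the page from the question, the reader would presumably be a StreamReader over the mapped feed.csv path, and the resulting list could be cached in place of the DataTable.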
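Answer 1 (score: 0)
FYI, I had only just found out about this whole TextFieldParser thing, and I do a lot of text file parsing, so I tested it.
The test file was 11 MB, with about 5,200 rows and 300 columns.
When loading into a DataTable, TextFieldParser ran at about 25% of the speed of what I currently use (File.ReadAllLines plus Split). With the DataTable code removed, it ran at about 15% of the speed. In each listing, the first timing is TextFieldParser and the second is ReadAllLines with Split: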
Dim DataTable As New DataTable()
Dim StartTime As Long = Now.Ticks
Dim Reader As New FileIO.TextFieldParser("file.txt")
Reader.TextFieldType = FileIO.FieldType.Delimited
Reader.SetDelimiters(vbTab)
Reader.HasFieldsEnclosedInQuotes = False
Dim Header As Boolean = True
While Not Reader.EndOfData
    Dim Fields() As String = Reader.ReadFields
    If Header Then
        For I As Integer = 1 To 320
            DataTable.Columns.Add("Col" & I)
        Next
        Header = False
    Else
        If Mid(Fields(0), 1, 1) <> "#" Then DataTable.Rows.Add(Fields)
    End If
End While
Debug.Print((Now.Ticks - StartTime) / 10000 & "ms")

Dim DataTable2 As New DataTable()
StartTime = Now.Ticks
For I As Integer = 1 To 320
    DataTable2.Columns.Add("Col" & I)
Next
For Each line As String In System.IO.File.ReadAllLines("file.txt")
    Dim NVP() As String = Split(line, vbTab)
    If Mid(line, 1, 1) <> "#" Then DataTable2.Rows.Add(NVP)
Next
Debug.Print((Now.Ticks - StartTime) / 10000 & "ms")
With the DataTable code removed:
Dim StartTime As Long = Now.Ticks
Dim Reader As New FileIO.TextFieldParser("file.txt")
Reader.TextFieldType = FileIO.FieldType.Delimited
Reader.SetDelimiters(vbTab)
Reader.HasFieldsEnclosedInQuotes = False
Dim Header As Boolean = True
While Not Reader.EndOfData
    Dim Fields() As String = Reader.ReadFields
End While
Debug.Print((Now.Ticks - StartTime) / 10000 & "ms")

StartTime = Now.Ticks
For Each line As String In System.IO.File.ReadAllLines("file.txt")
    Dim NVP() As String = Split(line, vbTab)
Next
Debug.Print((Now.Ticks - StartTime) / 10000 & "ms")
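That surprised me a little, but I suppose the DataTable does have more functionality. Yet another new thing I've found that I'll never use :(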
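A note on the measurement itself: the timings above come from Now.Ticks, whose resolution depends on the system clock, and in both listings the TextFieldParser pass runs first, so it may also be paying for the initial disk read. A sketch of a somewhat more robust comparison could use Stopwatch and repeat each candidate several times. The BestOfMs helper below is hypothetical, and the string-splitting workload in Main is only a stand-in for the real file parsing.

```vbnet
Imports System
Imports System.Diagnostics
Imports Microsoft.VisualBasic

Module TimingSketch
    ' Hypothetical helper: runs the work several times and returns
    ' the best (smallest) elapsed time in milliseconds.
    Function BestOfMs(work As Action, runs As Integer) As Double
        Dim best As Double = Double.MaxValue
        For i As Integer = 1 To runs
            Dim sw As Stopwatch = Stopwatch.StartNew()
            work.Invoke()
            sw.Stop()
            best = Math.Min(best, sw.Elapsed.TotalMilliseconds)
        Next
        Return best
    End Function

    Sub Main()
        ' Stand-in workload: split a small tab-delimited line many times.
        Dim line As String = "a" & ControlChars.Tab & "b" & ControlChars.Tab & "c"
        Dim work As Action =
            Sub()
                For i As Integer = 1 To 100000
                    Dim parts As String() = line.Split(ControlChars.Tab)
                Next
            End Sub
        Console.WriteLine(BestOfMs(work, 5).ToString("F2") & " ms")
    End Sub
End Module
```

Taking the best of several runs reduces the influence of JIT compilation and file-system caching on the first pass, although it does not remove it entirely.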