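I am using the TextFieldParser class to read a comma-separated values (.csv) file. The fields in this file are enclosed in double quotes, like "Field1","Field2".
So, to read the file, I set the HasFieldsEnclosedInQuotes property of the TextFieldParser object to True. But whenever a field contains a double quote (") at its very beginning, I get a MalformedLineException.
Example: ""Field2"with additional"
Here I would expect to see "Field2" with additional as output.
However, if the " appears anywhere other than the first position, it works fine. A line like "Field2 "with" additional" parses without problems and gives me Field2 "with" additional as output.
Has anyone run into the same problem? Is there any way to solve it?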
Here is my code:
Private Sub ReadTextFile(ByVal txtFilePath As String)
    ' The full type name is used here; the original declared the variable with an alias (tfp) that is not shown.
    Dim myReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(txtFilePath)
    myReader.Delimiters = New String() {","}
    myReader.TextFieldType = FileIO.FieldType.Delimited
    myReader.HasFieldsEnclosedInQuotes = True
    myReader.TrimWhiteSpace = True
    Dim currentRow As String()
    Dim headerRow As Integer = 0
    While Not myReader.EndOfData
        Try
            currentRow = myReader.ReadFields()
            'Read Header
            If (headerRow = 0) Then
                'Do work for Header Row
                headerRow += 1
            Else
                'Do work for Data Row
            End If
        Catch ex As Exception
            ' ErrorLine holds the raw text of the line that could not be parsed
            Dim errorline As String = myReader.ErrorLine
        End Try
    End While
End Sub
Here is the data in my csv file:
"Column1","Column2","Column3"
"Value1","Value2",""A" Block in Building 123"
Answer 0 (score: 9)
Your example ""A" Block" is malformed, so TextFieldParser is perfectly within its rights to reject it. The CSV standard (RFC 4180) says:
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
If you encode your data correctly, i.e. ...
"Column1","Column2","Column3"
"Value1","Value2","""A"" Block in Building 123"
... TextFieldParser works fine and correctly returns "A" Block in Building 123.
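On the producing side, the rule quoted above comes down to doubling every double quote inside a field before wrapping the field in quotes. A minimal sketch of that step follows; QuoteField and the module name are hypothetical names chosen for illustration, not part of any library.

```vbnet
Imports System
Imports System.Linq

Module CsvEscapeSketch
    ' Hypothetical helper: doubles every double quote inside the value,
    ' then wraps the result in double quotes (rule 7 of the CSV standard).
    Function QuoteField(value As String) As String
        Return """" & value.Replace("""", """""") & """"
    End Function

    Sub Main()
        ' The third value really contains two double-quote characters.
        Dim fields As String() = {"Value1", "Value2", """A"" Block in Building 123"}
        ' Join the escaped fields with commas to form one CSV line.
        Console.WriteLine(String.Join(",", fields.Select(Function(f) QuoteField(f))))
    End Sub
End Module
```

For the values above, this should write the same data row as the correctly encoded example: "Value1","Value2","""A"" Block in Building 123".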
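So the first step is to tell whoever produces the CSV file to create a valid CSV file rather than something that merely looks like CSV.
If you cannot do that, you may want to make two passes over the file: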
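One way such a two-pass approach could look is sketched below. This is an illustration under stated assumptions, not necessarily what the answer had in mind: RepairLine is a hypothetical helper, and its heuristic assumes that every field is quoted, that no field contains the sequence "," and that the input never escapes quotes already (running it over a valid file would double the quotes a second time). Fields spanning several lines are not handled.

```vbnet
Imports System
Imports System.IO
Imports System.Linq
Imports Microsoft.VisualBasic.FileIO

Module TwoPassSketch
    ' Pass 1 (hypothetical heuristic): strip the outer quotes, split on the
    ' "," sequence between fields, double the quotes left inside each field,
    ' and put the line back together.
    Function RepairLine(line As String) As String
        If line.Length < 2 Then Return line
        Dim inner As String = line.Substring(1, line.Length - 2)
        Dim fields As String() = inner.Split(New String() {""","""}, StringSplitOptions.None)
        For i As Integer = 0 To fields.Length - 1
            fields(i) = fields(i).Replace("""", """""")
        Next
        Return """" & String.Join(""",""", fields) & """"
    End Function

    Sub Main()
        ' Single quotes stand in for double quotes to keep the literals readable.
        Dim broken As String() = {
            "'Column1','Column2','Column3'".Replace("'", """"),
            "'Value1','Value2',''A' Block in Building 123'".Replace("'", """")
        }
        Dim repaired As String = String.Join(Environment.NewLine, broken.Select(Function(l) RepairLine(l)))

        ' Pass 2: hand the repaired text to TextFieldParser as usual.
        Using parser As New TextFieldParser(New StringReader(repaired))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            While Not parser.EndOfData
                Console.WriteLine(String.Join(" | ", parser.ReadFields()))
            End While
        End Using
    End Sub
End Module
```

With a real file, the lines would come from File.ReadAllLines and the repaired lines could be written to a temporary file with File.WriteAllLines before parsing. The in-memory StringReader is used here only to keep the sketch self-contained.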
Answer 1 (score: -1)
[Original answer]
Try this:
using System;
using System.IO;
using System.Linq;

class Test
{
    // Removes exactly one pair of enclosing quotes, leaving inner quotes intact.
    static string Unquote(string s)
    {
        return s.Length >= 2 && s.StartsWith("\"") && s.EndsWith("\"")
            ? s.Substring(1, s.Length - 2)
            : s;
    }

    static void Main()
    {
        var file = "Test.txt";
        // Note: a plain Split(',') also splits on commas inside quoted fields.
        var r = File.ReadAllLines(file)
            .Select((i, index) => new { Line = index, Fields = i.Split(new char[] { ',' }) });

        // header
        var header = r.First();
        // do work for header
        for (int j = 0; j < header.Fields.Length; j++)
        {
            Console.Write("{0} ", Unquote(header.Fields[j]));
        }
        Console.WriteLine();

        var rows = r.Skip(1).ToList();
        // do work for rows
        for (int i = 0; i < rows.Count; i++)
        {
            for (int j = 0; j < rows[i].Fields.Length; j++)
            {
                Console.Write("{0} ", Unquote(rows[i].Fields[j]));
            }
            Console.WriteLine();
        }
    }
}
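Note: I posted this in C# because the question was still tagged C# at the time.
Since the C# tag has been removed, see http://converter.telerik.com/ for help converting the code to VB.
[Updated answer]
Trying a different approach (this time in VB.Net):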
Imports System
Imports System.IO
Imports System.Linq

Class Test
    Public Shared Sub Main()
        Dim filePath = "Test.txt"
        ' Drop the outer quotes of each line, then split on the "," sequence
        ' between fields, so quotes inside a field are left untouched.
        ' Assumes every line is at least two characters long and fully quoted.
        Dim r = File.ReadAllLines(filePath).[Select](Function(i, index) New With { _
            .Line = index, _
            .Fields = i.Substring(1, i.Length - 2).Split(New String() {""","""}, StringSplitOptions.None) _
        })

        ' header
        Dim header = r.First()
        ' do work for header
        For j As Integer = 0 To header.Fields.Length - 1
            Console.Write("{0} ", header.Fields(j))
        Next
        Console.WriteLine()

        Dim rows = r.Skip(1).ToList()
        ' do work for rows
        For i As Integer = 0 To rows.Count - 1
            For j As Integer = 0 To rows(i).Fields.Length - 1
                Console.Write("{0} ", rows(i).Fields(j))
            Next
            Console.WriteLine()
        Next
    End Sub
End Class