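I have a text file that I need to convert to a CSV file. My plan is:
Problem: I need a function that recognizes a comma inside double quotes and replaces it.
Here is a sample line: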
"MRS Brown","4611 BEAUMONT ST","","WARRIOR RUN,PA"
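Answer 0 (score: 4)
Your file already appears to be in a CSV-compliant format. Any good CSV reader will read it correctly.
If your problem is only reading the field values correctly, then you need to read the file the right way.
Here is one way to do it: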
using Microsoft.VisualBasic.FileIO;

private void button1_Click(object sender, EventArgs e)
{
    TextFieldParser tfp = new TextFieldParser("C:\\Temp\\Test.csv");
    tfp.Delimiters = new string[] { "," };
    tfp.HasFieldsEnclosedInQuotes = true;
    while (!tfp.EndOfData)
    {
        string[] fields = tfp.ReadFields();
        // do whatever you want to do with the fields now...
        // e.g. remove the commas and double-quotes from the fields.
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = fields[i].Replace(",", " ").Replace("\"", "");
        }
        // this is to show what we got as the output
        textBox1.AppendText(String.Join("\t", fields) + "\n");
    }
    tfp.Close();
}
Edit:
I just noticed that this question is tagged C# and VB.NET-2010. Here is the VB.NET version, in case you are coding in VB.
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim tfp As New FileIO.TextFieldParser("C:\Temp\Test.csv")
    tfp.Delimiters = New String() {","}
    tfp.HasFieldsEnclosedInQuotes = True
    While Not tfp.EndOfData
        Dim fields() As String = tfp.ReadFields
        ' do whatever you want to do with the fields now...
        ' e.g. remove the commas and double-quotes from the fields.
        For i As Integer = 0 To fields.Length - 1
            fields(i) = fields(i).Replace(",", " ").Replace("""", "")
        Next
        ' this is to show what we got as the output
        TextBox1.AppendText(Join(fields, vbTab) & vbCrLf)
    End While
    tfp.Close()
End Sub
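To see the point about the file already being valid CSV, here is a minimal C# sketch that feeds the sample line from the question to TextFieldParser through a StringReader instead of a file. The class name SampleLineDemo and the helper ParseLine are hypothetical names introduced only for this illustration; they are not part of the answer above.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class SampleLineDemo
{
    // Hypothetical helper (not from the answer): splits one CSV line into fields.
    public static string[] ParseLine(string line)
    {
        // TextFieldParser accepts any TextReader, so no file on disk is needed.
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }

    public static void Main()
    {
        // The sample line from the question.
        string line = "\"MRS Brown\",\"4611 BEAUMONT ST\",\"\",\"WARRIOR RUN,PA\"";
        foreach (string field in ParseLine(line))
        {
            // Brackets make the empty third field visible.
            Console.WriteLine("[" + field + "]");
        }
    }
}
```

The parser should return four fields, with the comma in the last one kept as part of the value rather than treated as a delimiter, so any replacement can then be done per field.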
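Answer 1 (score: 2)
Here is a simple function that removes commas embedded between two double quotes in a string. You can pass in a long string with multiple occurrences such as "abc,123",10/13/12,"some description"... and so on. It also removes the double quotes.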
Private Function ParseCommasInQuotes(ByVal arg As String) As String
    Dim foundEndQuote As Boolean = False
    Dim foundStartQuote As Boolean = False
    Dim output As New StringBuilder()
    '44 = comma
    '34 = double quote
    For Each element As Char In arg
        If foundEndQuote Then
            foundStartQuote = False
            foundEndQuote = False
        End If
        If element.Equals(Chr(34)) And (Not foundEndQuote) And foundStartQuote Then
            foundEndQuote = True
            Continue For
        End If
        If element.Equals(Chr(34)) And Not foundStartQuote Then
            foundStartQuote = True
            Continue For
        End If
        If (element.Equals(Chr(44)) And foundStartQuote) Then
            'skip the comma...it's between double quotes
        Else
            output.Append(element)
        End If
    Next
    Return output.ToString()
End Function
Answer 2 (score: 2)
Thanks to Baz and to Glockster's answer in VB: I just converted it to C#, and it works well. With this code you do not need any third-party parser.
string line = reader.ReadLine();
line = ParseCommasInQuotes(line);

private string ParseCommasInQuotes(string arg)
{
    bool foundEndQuote = false;
    bool foundStartQuote = false;
    StringBuilder output = new StringBuilder();
    //44 = comma
    //34 = double quote
    foreach (char element in arg)
    {
        if (foundEndQuote)
        {
            foundStartQuote = false;
            foundEndQuote = false;
        }
        if (element.Equals((Char)34) && !foundEndQuote && foundStartQuote)
        {
            foundEndQuote = true;
            continue;
        }
        if (element.Equals((Char)34) && !foundStartQuote)
        {
            foundStartQuote = true;
            continue;
        }
        if (element.Equals((Char)44) && foundStartQuote)
        {
            //skip the comma...it's between double quotes
        }
        else
        {
            output.Append(element);
        }
    }
    return output.ToString();
}
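As a quick check of what this function does to the sample line from the question, here is a small self-contained sketch. It is my own condensed variant that tracks a single insideQuotes flag instead of the two flags above; the class name QuoteCommaDemo is hypothetical, and the condensation assumes that toggling one flag on every double quote is equivalent to the start/end bookkeeping in the answer.

```csharp
using System;
using System.Text;

public static class QuoteCommaDemo
{
    // Condensed variant of ParseCommasInQuotes (an assumption, not the answer's exact code):
    // drops every double quote and every comma that appears between a pair of quotes.
    public static string ParseCommasInQuotes(string arg)
    {
        bool insideQuotes = false;
        var output = new StringBuilder();
        foreach (char c in arg)
        {
            if (c == '"')
            {
                insideQuotes = !insideQuotes; // toggle state and drop the quote itself
                continue;
            }
            if (c == ',' && insideQuotes)
            {
                continue; // drop a comma that sits inside a quoted field
            }
            output.Append(c);
        }
        return output.ToString();
    }

    public static void Main()
    {
        string line = "\"MRS Brown\",\"4611 BEAUMONT ST\",\"\",\"WARRIOR RUN,PA\"";
        Console.WriteLine(ParseCommasInQuotes(line));
        // prints: MRS Brown,4611 BEAUMONT ST,,WARRIOR RUNPA
    }
}
```

Note that the embedded comma is deleted rather than replaced, so "WARRIOR RUN,PA" becomes "WARRIOR RUNPA". If a space is wanted in its place, appending ' ' instead of skipping the character would be one way to get it.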
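Answer 3 (score: 0)
It sounds as though what you describe will end up being a CSV file anyway, but to answer your question, here is how I would do it.
First, you need to read the text file into a usable form that you can loop over, like this: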
public static List<String> GetTextListFromDiskFile(String fileName)
{
    List<String> list = new List<String>();
    try
    {
        //load the file into the streamreader
        System.IO.StreamReader sr = new System.IO.StreamReader(fileName);
        //loop through each line of the file
        while (sr.Peek() >= 0)
        {
            list.Add(sr.ReadLine());
        }
        sr.Close();
    }
    catch (Exception ex)
    {
        list.Add("Error: Could not read file from disk. Original error: " + ex.Message);
    }
    return list;
}
Then loop over the list with a simple foreach loop and run Replace on each item, like this:
foreach (String item in list)
{
    String x = item.Replace("\",\"", "\" \"");
    x = x.Replace("\"", "");
}
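Once that is done, you need to build the CSV file line by line. I would again use a StringBuilder and simply call sb.AppendLine(x) to build up the String that will become the text file, then write it to disk with something like this: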
public static void SaveFileToDisk(String filePathName, String fileText)
{
    using (StreamWriter outfile = new StreamWriter(filePathName))
    {
        outfile.Write(fileText);
    }
}
Answer 4 (score: 0)
I did not understand your question before. Now I am fairly sure I have it right:
TextFieldParser parser = new TextFieldParser(@"c:\file.csv");
parser.TextFieldType = FieldType.Delimited;
parser.SetDelimiters(",");
while (!parser.EndOfData)
{
    //Processing row
    string[] fields = parser.ReadFields();
    foreach (string field in fields)
    {
        //TODO: Do whatever you need
    }
}
parser.Close();
Answer 5 (score: 0)
var result = Regex.Replace(input,
@"[^\""]([^\""])*[^\""]",
m => m.Value.Replace(",", " ") );
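For comparison, here is a minimal sketch of a related regex approach that matches whole quoted fields and replaces commas only inside each match. The pattern "[^"]*" is my own assumption, not the pattern used above, and it assumes fields never contain escaped double quotes; the names RegexQuoteDemo and ReplaceCommasInQuotes are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexQuoteDemo
{
    // Hypothetical helper: replaces commas with spaces inside each "..." field.
    // Assumes quoted fields do not contain escaped quotes such as "".
    public static string ReplaceCommasInQuotes(string input)
    {
        // Match an opening quote, any run of non-quote characters, and a closing quote.
        return Regex.Replace(input, "\"[^\"]*\"", m => m.Value.Replace(",", " "));
    }

    public static void Main()
    {
        string line = "\"MRS Brown\",\"4611 BEAUMONT ST\",\"\",\"WARRIOR RUN,PA\"";
        Console.WriteLine(ReplaceCommasInQuotes(line));
        // prints: "MRS Brown","4611 BEAUMONT ST","","WARRIOR RUN PA"
    }
}
```

Because each match starts and ends on a quote, the delimiter commas between fields are never part of a match and are left untouched.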
Answer 6 (score: 0)
This worked for me. I hope it helps someone else.
Private Sub Command1_Click()
    Open "c:\\dir\file.csv" For Input As #1
    Open "c:\\dir\file2.csv" For Output As #2
    Do Until EOF(1)
        Line Input #1, test$
99
        ' find the next pair of adjacent double quotes and cut it out
        c = InStr(test$, """""")
        If c > 0 Then
            test$ = Left$(test$, c - 1) + Right$(test$, Len(test$) - (c + 1))
            GoTo 99
        End If
        Print #2, test$
    Loop
    ' close both files so the output is flushed to disk
    Close #1
    Close #2
End Sub
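Answer 7 (score: 0)
I would do all of this before starting to process the file line by line. Also, check out CsvHelper. It is quick and easy. Just put your result into a TextReader and pass it to the CsvReader.
Here is the stripper for commas inside double quotes, followed by the double-quote stripper.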
using (TextReader reader = File.OpenText(file))
{
    // remove commas and double quotes inside file
    var pattern = @"\""(.+?,.+)+\""";
    var results = Regex.Replace(reader.ReadToEnd(), pattern, match => match.Value.Replace(",", " "));
    results = results.Replace("\"", "");
}