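I want to implement a fairly simple CSV checker in my C#/ASP.NET application. My project automatically generates a CSV from a GridView for users, but I'd like to be able to run quickly through each line, check that they all have the same number of commas, and throw an exception if there is any difference. So far I have this, which does work, but I'll describe an issue with it shortly: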
int? CommaCount = null;
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
String Str = null;
//This loops through all the headerrow cells and writes them to the stringbuilder
for (int k = 0; k <= (grd.Columns.Count - 1); k++)
{
    sw.Write(grd.HeaderRow.Cells[k].Text + ",");
}
sw.WriteLine(",");
//This loops through all the main rows and writes them to the stringbuilder
for (int i = 0; i <= grd.Rows.Count - 1; i++)
{
    StringBuilder RowString = new StringBuilder();
    for (int j = 0; j <= grd.Columns.Count - 1; j++)
    {
        //We'll need to strip meaningless junk such as <br /> and &nbsp;
        Str = grd.Rows[i].Cells[j].Text.ToString().Replace("<br />", "");
        if (Str == "&nbsp;")
        {
            Str = "";
        }
        Str = "\"" + Str + "\"" + ",";
        RowString.Append(Str);
        sw.Write(Str);
    }
    sw.WriteLine();
    //The below code block ensures that each row contains the same number of commas, which is crucial
    int RowCommaCount = CheckChar(RowString.ToString(), ',');
    if (CommaCount == null)
    {
        CommaCount = RowCommaCount;
    }
    else
    {
        if (CommaCount != RowCommaCount)
        {
            throw new Exception("CSV generated is corrupt - line " + i + " has " + RowCommaCount + " commas when it should have " + CommaCount);
        }
    }
}
sw.Close();
My CheckChar method:
protected static int CheckChar(string Input, char CharToCheck)
{
    int Counter = 0;
    foreach (char StringChar in Input)
    {
        if (StringChar == CharToCheck)
        {
            Counter++;
        }
    }
    return Counter;
}
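Now my problem is that if a cell in the grid contains a comma, my CheckChar method still counts it as a delimiter and therefore reports an error. As you can see in the code, I wrap every value in " characters to 'escape' it. How simple would it be to ignore commas inside values in my method? I assume I'll need to rewrite the method quite a lot.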
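Answer 0 (score: 0)
You can use a regular expression that matches a single item and count the number of matches in your line. An example of such a regex is the following: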
var itemsRegex =
new Regex(@"(?<=(^|[\" + separator + @"]))((?<item>[^""\" + separator +
@"\n]*)|(?<item>""([^""]|"""")*""))(?=($|[\" + separator + @"]))");
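To show how the match count could be obtained, here is a minimal sketch that wraps the regex above in a helper. The class name CsvItemCounter and the method CountItems are hypothetical, and the sketch assumes the separator is a punctuation character such as ',' or ';' (a letter would turn the "\" + separator part into a regex escape like \t).

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvItemCounter
{
    // Hypothetical helper: counts the items in a single CSV line by counting
    // matches of the item regex from this answer.
    // Assumption: separator is punctuation (e.g. ',' or ';'), not a letter or digit.
    public static int CountItems(string line, char separator)
    {
        var itemsRegex =
            new Regex(@"(?<=(^|[\" + separator + @"]))((?<item>[^""\" + separator +
                      @"\n]*)|(?<item>""([^""]|"""")*""))(?=($|[\" + separator + @"]))");
        // Each match is one item: either an unquoted run or a whole quoted value.
        return itemsRegex.Matches(line).Count;
    }
}
```

With this helper, the per-row check in the question would compare CountItems(RowString.ToString(), ',') across rows instead of the raw comma count. Note that it counts items rather than commas, so a trailing comma counts as one extra (empty) item.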
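Answer 1 (score: 0)
Just do something like the following (assuming you don't expect " characters inside your fields; otherwise these would need some extra handling):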
protected static int CheckChar(string Input, char CharToCheck, char fieldDelimiter)
{
    int Counter = 0;
    bool inValue = false;
    foreach (char StringChar in Input)
    {
        if (StringChar == fieldDelimiter)
            inValue = !inValue;
        else if (!inValue && StringChar == CharToCheck)
            Counter++;
    }
    return Counter;
}
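This causes inValue to be true while you are inside a field. For example, pass '"' as fieldDelimiter to ignore everything between "...". Note that this does not handle escaped " characters (e.g. "" or \"). You would have to add such handling yourself.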
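One way such handling might be added is sketched below. The names CsvCommaCounter and CountOutsideQuotes are hypothetical, and the sketch assumes a backslash is only ever used to escape the delimiter. A doubled delimiter ("") needs no special case for counting purposes, because it toggles inValue twice and leaves the state unchanged.

```csharp
using System;

public static class CsvCommaCounter
{
    // Hypothetical variant of CheckChar: counts charToCheck outside quoted values
    // and treats a backslash-escaped delimiter (\") as a literal character.
    public static int CountOutsideQuotes(string input, char charToCheck, char fieldDelimiter)
    {
        int counter = 0;
        bool inValue = false;
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '\\' && i + 1 < input.Length && input[i + 1] == fieldDelimiter)
            {
                // Skip the escaped delimiter so that it does not toggle inValue.
                i++;
            }
            else if (c == fieldDelimiter)
            {
                inValue = !inValue;
            }
            else if (!inValue && c == charToCheck)
            {
                counter++;
            }
        }
        return counter;
    }
}
```

It is called the same way as the method above, for example CountOutsideQuotes(RowString.ToString(), ',', '"').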
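Answer 2 (score: 0)
You should check the fields (the ingredients) before you concatenate (mix) them, rather than checking the resulting string (the cake). That gives you the chance to do something constructive (escape or replace) and to throw an exception only as a last resort.
In general, a "," is legal inside a .csv field as long as string fields are quoted. So embedded commas should not be a problem, but embedded quotes may be.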
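As a rough illustration of checking the fields before mixing them, the sketch below escapes each field on its own and only then joins the row. The names CsvFieldWriter, EscapeField and BuildRow are hypothetical. Doubling embedded quotes is the common CSV convention, and rejecting line breaks is just an assumed example of a "last resort" check, not something the question requires.

```csharp
using System;
using System.Collections.Generic;

public static class CsvFieldWriter
{
    // Hypothetical helper: wraps one field in quotes and doubles any embedded quotes,
    // so commas and quotes inside the value cannot break the row.
    public static string EscapeField(string value)
    {
        if (value == null) value = "";
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: validates and escapes each field, then joins the row.
    public static string BuildRow(IEnumerable<string> fields)
    {
        var escaped = new List<string>();
        foreach (string field in fields)
        {
            // Assumed last-resort check: refuse values that cannot be fixed by escaping.
            if (field != null && (field.Contains("\n") || field.Contains("\r")))
                throw new ArgumentException("Field contains a line break: " + field);
            escaped.Add(EscapeField(field));
        }
        return string.Join(",", escaped.ToArray());
    }
}
```

Because every row is built from the same number of fields, the column count is correct by construction and no comma counting is needed afterwards.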
Answer 3 (score: 0)
var rx = new Regex("^ ( ( \"[^\"]*\" ) | ( (?!$)[^\",] )+ | (?<1>,) )* $", RegexOptions.ExplicitCapture | RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);
var matches = rx.Matches("Hello,World,How,Are\nYou,Today,This,Is,\"A beautiful, world\",Hi!");
for (int i = 1; i < matches.Count; i++) {
    if (matches[i].Groups[1].Captures.Count != matches[i - 1].Groups[1].Captures.Count) {
        throw new Exception();
    }
}