I'm reading an input text file in C#. The file uses "|" as the column delimiter and '\n' as the row delimiter. Here is the test data:
1001 | Name | XYZ | Department1 Roll no 1. (\r\n)
1002 | Name | ABC | Department2 Roll No 2. (\r\n)
1003 | Name | PQR | Department3 (\r\n)
Roll (\r\n)
no3. (\r\n)
1004 | Name | MNO | Department4 Roll No 4. (\r\n)
1005 | Name | DEF | Department5 Roll No 5. (\r\n)
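The first two records are formatted correctly. The third record, however, was entered incorrectly and is broken across several lines. I want to format it like my other records.
I wrote the following C# code for this: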
string text = File.ReadAllText(inputfile);
text = text.Replace(@"\r\n", " ");
File.WriteAllText(outputfile, text);
However, it doesn't work for me. Can anyone help me solve this?
Could a regular expression be used here?
Answer 0 (score: 1)
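Use File.ReadAllLines and process the lines in reverse, as Sergii mentioned. This lets you check each line to see whether it matches the expected format or was created by a misplaced line break. If the current line is the result of a misplaced line break, you simply append it to the previous line to get the resulting output.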
// Requires: using System; using System.Collections.Generic; using System.IO;
static void ProcessFile(string inputfile, string outputfile)
{
    // Read the file line by line.
    string[] lines = File.ReadAllLines(inputfile);
    // We'll process in reverse, so create a stack (LIFO) for the results.
    Stack<string> results = new Stack<string>();
    // Process each line; if it doesn't match the format, append it to the previous line.
    string resultLine = "";
    for (int i = lines.Length - 1; i >= 0; --i)
    {
        resultLine = lines[i] + resultLine;
        // Split returns an array, so Length is enough (no LINQ Count() needed).
        int lineParts = resultLine.Split('|').Length;
        if (lineParts == 4) // Well-formatted line.
        {
            results.Push(resultLine);
            resultLine = "";
        }
        else if (lineParts < 4) // An invalid linefeed from the previous entry.
        {
            // Prepend a space to replace the line break, then continue the loop,
            // where the previous line will be placed in front of this text.
            resultLine = " " + resultLine;
        }
        else // lineParts > 4... unexpected
        {
            throw new InvalidOperationException("What to do here?");
        }
    }
    // If the file starts with fragments that never complete a record, keep them rather than dropping them.
    if (resultLine.Length > 0)
    {
        results.Push(resultLine.TrimStart());
    }
    // Now that all our lines have been fixed, write them back out.
    // Stack<T>.ToArray() returns items in pop order, which restores the original file order.
    File.WriteAllLines(outputfile, results.ToArray());
}
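Note: this isn't the most efficient approach, since you have to be sure the file is small enough to fit in memory roughly three times over, but that is only one more copy than your original solution. If your file is large, you may want to modify the solution to operate on streams rather than holding everything in local variables.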
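As a rough illustration of the stream-based variant mentioned in the note, here is a minimal sketch that reads forward and holds only one record in memory at a time. It is not the code from this answer: the `RecordJoiner` class and `JoinBrokenRecords` method are hypothetical names, and it assumes that stray line breaks occur only inside the last column, so any line that contains all four fields on its own starts a new record.

```csharp
using System;
using System.IO;

static class RecordJoiner
{
    // Hypothetical helper: merges continuation lines into the record before them.
    // Assumption: a line with exactly `expectedFields` '|'-separated parts starts
    // a new record; any other line continues the previous record.
    public static void JoinBrokenRecords(TextReader reader, TextWriter writer, int expectedFields = 4)
    {
        string pending = "";
        bool hasPending = false;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            bool startsNewRecord = line.Split('|').Length == expectedFields;
            if (startsNewRecord || !hasPending)
            {
                // Flush the finished record before starting the next one.
                if (hasPending) writer.WriteLine(pending);
                pending = line;
                hasPending = true;
            }
            else
            {
                // Replace the stray line break with a single space.
                pending = pending.TrimEnd() + " " + line.Trim();
            }
        }
        if (hasPending) writer.WriteLine(pending);
    }

    // Same signature as ProcessFile in the answer, but streaming.
    public static void ProcessFile(string inputfile, string outputfile)
    {
        using (var reader = new StreamReader(inputfile))
        using (var writer = new StreamWriter(outputfile))
        {
            JoinBrokenRecords(reader, writer);
        }
    }
}
```

Because the helper takes a TextReader and a TextWriter, it can be tried on an in-memory string with StringReader and StringWriter before pointing it at real files. Unlike the reverse approach above, it cannot repair a record that is broken inside one of the first three columns.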
Answer 1 (score: 0)
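// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Text;
var text = File.ReadAllText(inputfile);
// Split on either line ending; String.Split has no overload that takes only a string[].
var rawParts = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
// The first two records are fine. The indices below are hard-coded for the sample data.
var proParts = new List<string>(rawParts.Take(2));
// Join the three fragments of the third record with spaces.
proParts.Add(rawParts[2] + " " + rawParts[3] + " " + rawParts[4]);
proParts.AddRange(rawParts.Skip(5));
var sb = new StringBuilder();
foreach (var part in proParts)
    sb.Append(part + "\n");
File.WriteAllText(outputfile, sb.ToString());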
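On the regular-expression question raised in the original post: note first that `@"\r\n"` in the question is a verbatim string, so Replace searches for the literal characters backslash, r, backslash, n rather than for a line break. Replacing every real line break would not help either, because it would merge all records into one line. A regex can instead replace only the breaks that are not followed by the start of a new record. The sketch below is one possible way to do that, not part of either answer; the `RegexJoiner` class is a hypothetical name, and it assumes every record begins with a numeric ID followed by "|" and that no continuation line begins that way.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexJoiner
{
    // Matches a line break (\r\n or \n) unless it is followed by
    // "digits, optional whitespace, |" (the start of a record) or by
    // the end of the text (so the final line break is kept).
    private static readonly Regex StrayLineBreak =
        new Regex(@"\r?\n(?!\d+\s*\||\z)");

    // Replaces each stray line break with a single space.
    public static string JoinBrokenRecords(string text)
    {
        return StrayLineBreak.Replace(text, " ");
    }
}
```

With this helper, the question's three lines would become a call such as `File.WriteAllText(outputfile, RegexJoiner.JoinBrokenRecords(File.ReadAllText(inputfile)));`. If the broken lines carry trailing spaces, the result will contain doubled spaces at the joins, which may need a separate cleanup step.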