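I have a very large comma-delimited text file. Each field is separated by a comma and surrounded by quotes (all fields are strings). The problem is that some fields contain a CR, so the field spans multiple lines. When I call ReadLine, it stops at that CR. It would be nice if I could tell it to stop only at a CRLF combination.
Does anyone have a snappy way to do this? The files can be very large.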
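The behaviour described above can be reproduced on a small in-memory sample. This is only a sketch: the sample text and the names `ReadLineRepro` and `CountLines` are made up for illustration, and `StringReader` stands in for the `StreamReader` that would be used on the real file (both treat a bare CR, a bare LF, and CRLF as line terminators in `ReadLine`).

```csharp
using System;
using System.IO;

public static class ReadLineRepro {
    // Counts how many lines ReadLine reports for the given text.
    public static int CountLines(string data) {
        int count = 0;
        using (var reader = new StringReader(data)) {
            while (reader.ReadLine() != null)
                count++;
        }
        return count;
    }

    public static void Main() {
        // Hypothetical sample: two CRLF-terminated records,
        // the first with a bare CR inside a quoted field.
        string sample = "\"a\",\"line1\rline2\"\r\n\"b\",\"c\"\r\n";
        Console.WriteLine(CountLines(sample)); // prints 3, although there are only 2 records
    }
}
```

The first record is split in two at the embedded CR, which is the problem the question asks about.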
Answer 0 (score: 2)
If you want a specific ReadLine, why not implement it?
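using System.Collections.Generic;
using System.IO;
using System.Text;

public static class MyFileReader {
  // Lazily yields lines terminated by CRLF only; a bare CR or LF stays inside the line.
  public static IEnumerable<string> ReadLineCRLF(string path) {
    StringBuilder sb = new StringBuilder();
    bool pendingCR = false; // true when the previous character was a CR not yet written to sb

    using (StreamReader reader = new StreamReader(path)) {
      int v;
      // Read one character at a time until end of file (Read returns -1)
      while ((v = reader.Read()) >= 0) {
        char current = (char) v;

        if (pendingCR && current == '\n') {
          // CRLF found: the line is complete
          yield return sb.ToString();
          sb.Clear();
          pendingCR = false;
          continue;
        }

        if (pendingCR)
          sb.Append('\r'); // the earlier CR was not part of a CRLF, so keep it

        pendingCR = (current == '\r');
        if (!pendingCR)
          sb.Append(current);
      }
    }

    // End of file: flush a trailing CR and any unterminated last line
    if (pendingCR)
      sb.Append('\r');
    if (sb.Length > 0)
      yield return sb.ToString();
  }
}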
Then use it:
var lines = MyFileReader
.ReadLineCRLF(@"C:\MyData.txt");
Answer 1 (score: 1)
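How about using
string text = File.ReadAllText("input.txt"); // Read the whole file into a single string
and then splitting it on the carriage return/line feed pair, like this:
var split = text.Split(new[] { "\r\n" }, StringSplitOptions.None); // Split on CRLF only, so a bare CR stays inside its field
and then processing it line by line in a loop:
foreach (var line in split) { ... }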
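Since the records in the question end with CRLF, one way to apply this idea is to split on the two-character sequence rather than on a single character. The following is a minimal sketch, assuming the text fits in memory (`File.ReadAllText` loads the entire file, which may not suit very large files); the names `SplitDemo` and `SplitOnCrLf` and the sample string are hypothetical.

```csharp
using System;

public static class SplitDemo {
    // Splits text into records on CRLF only; a bare CR or LF stays inside its record.
    public static string[] SplitOnCrLf(string text) {
        return text.Split(new[] { "\r\n" }, StringSplitOptions.None);
    }

    public static void Main() {
        // Hypothetical sample: two records, the first with a bare CR inside a quoted field.
        string sample = "\"a\",\"line1\rline2\"\r\n\"b\",\"c\"";
        string[] records = SplitOnCrLf(sample);
        Console.WriteLine(records.Length);            // prints 2
        Console.WriteLine(records[0].Contains("\r")); // prints True
    }
}
```

Note that with `StringSplitOptions.None`, text that ends in CRLF produces a trailing empty entry, which the processing loop may need to skip.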