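I am trying to figure out how to remove a specific string from a large text document of 500,000 lines. I need to find a line by its content and, at the same time, get that line's index in the document order (which must not be disturbed), then delete the line after or before the found line. In other words, I find the nearest line by index and delete it from a large document. Any approach I have tried with File.WriteAllLines makes the program hang at this size. There are active requests against this file, so it seems I need to find some other way. For example, the file content is: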
line 1
line 2
line 3
line 4
line 5
and the line to find and remove is:
string input = "line 3"
I want to get this result, removing the found line and the line after it (index + 1) if the found line's index is odd:
line 1
line 2
line 5
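At the same time I want to be able to remove the found line and the line before it (index - 1) if the found line's index is even. For the search string: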
string input = "line 4"
the result should be:
line 1
line 2
line 5
I also need to know whether the line exists in the text document at all.
The result has to be written to the same file.
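Answer 0 (score: 1)
If you are dealing with very large files, you should use a FileStream to avoid loading everything into memory.
To satisfy your last requirement, you can read the lines in pairs. It actually makes your code simpler.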
var inputFileName = @"D:\test-input.txt";
var outputFileName = Path.GetTempFileName();
var search = "line 4";
using (var strInp = File.Open(inputFileName, FileMode.Open))
using (var strOtp = File.Open(outputFileName, FileMode.Create))
using (var reader = new StreamReader(strInp))
using (var writer = new StreamWriter(strOtp))
{
    while (reader.Peek() >= 0)
    {
        // Read an odd-numbered line and, if present, the even-numbered line after it.
        var lineOdd = reader.ReadLine();
        var lineEven = (string)null;
        if (reader.Peek() >= 0)
            lineEven = reader.ReadLine();
        // Copy the pair only when neither of its lines matches the search string.
        if (lineOdd != search && lineEven != search)
        {
            writer.WriteLine(lineOdd);
            if (lineEven != null)
                writer.WriteLine(lineEven);
        }
    }
}
// At this point the operation was successful:
// replace the original file with the temp file.
File.Delete(inputFileName);
File.Move(outputFileName, inputFileName);
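The question also asks to know whether the line exists at all, which the snippet above does not report. Here is a minimal sketch of the same pair-reading idea wrapped in a method that returns a flag. The names PairRemover and RemovePair are hypothetical, and it assumes the search string is not null and that an exact, case-sensitive match is wanted.

```csharp
using System;
using System.IO;

static class PairRemover
{
    // Hypothetical helper (not from the answer): removes every (odd, even)
    // line pair that contains `search` and returns true if any pair was removed.
    public static bool RemovePair(string path, string search)
    {
        var tempPath = Path.GetTempFileName();
        var found = false;

        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string lineOdd;
            while ((lineOdd = reader.ReadLine()) != null)
            {
                // lineEven is null when the file has an odd number of lines.
                var lineEven = reader.ReadLine();

                if (lineOdd == search || lineEven == search)
                {
                    found = true; // skip both lines of this pair
                    continue;
                }

                writer.WriteLine(lineOdd);
                if (lineEven != null)
                    writer.WriteLine(lineEven);
            }
        }

        // Overwrite the original only when something actually changed.
        if (found)
            File.Copy(tempPath, path, overwrite: true);
        File.Delete(tempPath);
        return found;
    }
}
```

Calling PairRemover.RemovePair(path, "line 4") on the five-line sample file should leave line 1, line 2 and line 5 and return true. A search string that is not in the file leaves the file untouched and returns false.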
Answer 1 (score: 0)
Let the input file be inputFile.txt. You can then use the File.ReadAllLines() method to get all the lines in that file, then use the IndexOf() method to find the index of a specific line in that list (it returns -1 if the line is not found), and then use RemoveAt() to remove the line at that index. Consider the code:
// ToList() requires: using System.Linq;
List<string> linesInFile = File.ReadAllLines(filePath).ToList(); // gives you the list of lines
string input = "line 3";
int lineIndex = linesInFile.IndexOf(input);
if (lineIndex != -1)
{
    linesInFile.RemoveAt(lineIndex);
}
// If the line may match more than once, you can remove every occurrence instead:
linesInFile.RemoveAll(x => x == input);
If you want to write the result back to the file, pass the list to File.WriteAllLines.
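The code in this answer removes only the matching line. The question also wants the neighbouring line removed, depending on whether the match sits on an odd or even line. As a rough sketch of how that could look on the in-memory list: the name RemoveWithNeighbour is hypothetical, and it assumes line numbers are 1-based as in the question's example.

```csharp
using System;
using System.Collections.Generic;

static class ListPairRemover
{
    // Hypothetical helper (not from the answer): removes the first match and
    // its pair partner. Returns false when `input` is not in the list.
    public static bool RemoveWithNeighbour(List<string> lines, string input)
    {
        int index = lines.IndexOf(input);   // 0-based; -1 when absent
        if (index == -1)
            return false;

        int lineNumber = index + 1;         // 1-based, as in the question
        // Odd line number: the pair starts here. Even: it starts one line earlier.
        int start = (lineNumber % 2 == 1) ? index : index - 1;
        // The last pair holds a single line when the line count is odd.
        int count = Math.Min(2, lines.Count - start);
        lines.RemoveRange(start, count);
        return true;
    }
}
```

With the five sample lines, searching for either "line 3" or "line 4" leaves line 1, line 2 and line 5. This still loads the whole file into memory, so for the 500,000-line case a streaming approach may be the better fit.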
Answer 2 (score: 0)
private static void RemoveLines(string lineToRemove, bool skipPrevious, bool skipNext)
{
    // null means "nothing pending"; string.Empty as the marker would also drop blank lines.
    string previousLine = null;
    string currentLine;
    bool isNext = false;
    using (StreamWriter sw = File.CreateText(@"output.txt"))
    {
        using (StreamReader sr = File.OpenText(@"input.txt"))
        {
            while ((currentLine = sr.ReadLine()) != null)
            {
                if (isNext)
                {
                    // This line follows a match and skipNext was set: drop it.
                    currentLine = null;
                    isNext = false;
                }
                if (currentLine == lineToRemove)
                {
                    if (skipPrevious)
                    {
                        previousLine = null;
                    }
                    if (skipNext)
                    {
                        currentLine = null;
                        isNext = true;
                    }
                }
                // Writing is delayed by one line so that a later match can still cancel it.
                if (previousLine != null && previousLine != lineToRemove)
                {
                    sw.WriteLine(previousLine);
                }
                previousLine = currentLine;
            }
        }
        // Flush the last pending line.
        if (previousLine != null && previousLine != lineToRemove)
        {
            sw.WriteLine(previousLine);
        }
    }
}
I haven't tested this, but it should point you in the right direction.
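RemoveLines leaves it to the caller to choose skipPrevious and skipNext. One possible way to derive them from the question's odd/even rule is sketched below. FindLineNumber and FlagsFor are hypothetical names, and the sketch assumes 1-based line numbers and an exact match. File.ReadLines enumerates the file lazily, so the lookup does not load all lines at once.

```csharp
using System;
using System.IO;

static class RemoveLinesFlags
{
    // Hypothetical helper: 1-based line number of the first exact match, or -1.
    public static int FindLineNumber(string path, string search)
    {
        int lineNumber = 0;
        foreach (var line in File.ReadLines(path))
        {
            lineNumber++;
            if (line == search)
                return lineNumber;
        }
        return -1;
    }

    // Odd line number: also drop the next line. Even: also drop the previous one.
    public static (bool skipPrevious, bool skipNext) FlagsFor(int lineNumber)
    {
        bool isOdd = lineNumber % 2 == 1;
        return (skipPrevious: !isOdd, skipNext: isOdd);
    }
}

// Possible usage from the class that contains RemoveLines:
//   int n = RemoveLinesFlags.FindLineNumber("input.txt", "line 3");
//   if (n != -1)
//   {
//       var (prev, next) = RemoveLinesFlags.FlagsFor(n);
//       RemoveLines("line 3", prev, next);
//   }
```

A return value of -1 also answers the "does the line exist" part of the question. Note that the flags come from the first match only, while RemoveLines applies them to every matching line, so this fits best when the line is expected to occur once.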