我想一次读取并处理10行以上的GB文件,但是还没有找到解决方案直到最后吐出10行。
我的最后一次尝试是:
int n = 10;
foreach (var line in File.ReadLines("path")
.AsParallel().WithDegreeOfParallelism(n))
{
System.Console.WriteLine(line);
Thread.Sleep(1000);
}
我见过使用缓冲区大小的解决方案,但我想在整行中阅读。
答案 0 :(得分:1)
默认行为是一次性读取所有行,如果你想要读取的数量少于你需要深入了解它如何读取它们并得到一个StreamReader然后让你控制阅读过程
using (StreamReader sr = new StreamReader(path))
{
while (sr.Peek() >= 0)
{
Console.WriteLine(sr.ReadLine());
}
}
它还有一个ReadLineAsync
方法,它将返回一个任务
如果您在ConcurrentBag中包含这些任务,则可以非常轻松地一次在10行上运行处理。
var bag =new ConCurrentBag<Task>();
using (StreamReader sr = new StreamReader(path))
{
while(sr.Peek() >=0)
{
if(bag.Count < 10)
{
Task processing = sr.ReadLineAsync().ContinueWith( (read) => {
string s = read.Result;//EDIT Removed await to reflect Scots comment
//process line
});
bag.Add(processing);
}
else
{
Task.WaitAny(bag.ToArray())
//remove competed tasks from bag
}
}
}
请注意,此代码仅供参考,不得按原样使用;
如果您想要的只是最后十行,那么您可以在此处获得解决方案 How to read a text file reversely with iterator in C#
答案 1 :(得分:0)
这种方法会创建&#34;页面&#34;您文件中的行。
public static IEnumerable<string[]> ReadFileAsLinesSets(string fileName, int setLen = 10)
{
using (var reader = new StreamReader(fileName))
while (!reader.EndOfStream)
{
var set = new List<string>();
for (var i = 0; i < setLen && !reader.EndOfStream; i++)
{
set.Add(reader.ReadLine());
}
yield return set.ToArray();
}
}
...更有趣的版本......
class Example
{
static void Main(string[] args)
{
"YourFile.txt".ReadAsLines()
.AsPaged(10)
.Select(a=>a.ToArray()) //required or else you will get random data since "WrappedEnumerator" is not thread safe
.AsParallel()
.WithDegreeOfParallelism(10)
.ForAll(a =>
{
//Do your work here.
Console.WriteLine(a.Aggregate(new StringBuilder(),
(sb, v) => sb.AppendFormat("{0:000000} ", v),
sb => sb.ToString()));
});
}
}
public static class ToolsEx
{
public static IEnumerable<IEnumerable<T>> AsPaged<T>(this IEnumerable<T> items,
int pageLength = 10)
{
using (var enumerator = new WrappedEnumerator<T>(items.GetEnumerator()))
while (!enumerator.IsDone)
yield return enumerator.GetNextPage(pageLength);
}
public static IEnumerable<T> GetNextPage<T>(this IEnumerator<T> enumerator,
int pageLength = 10)
{
for (var i = 0; i < pageLength && enumerator.MoveNext(); i++)
yield return enumerator.Current;
}
public static IEnumerable<string> ReadAsLines(this string fileName)
{
using (var reader = new StreamReader(fileName))
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
internal class WrappedEnumerator<T> : IEnumerator<T>
{
public WrappedEnumerator(IEnumerator<T> enumerator)
{
this.InnerEnumerator = enumerator;
this.IsDone = false;
}
public IEnumerator<T> InnerEnumerator { get; private set; }
public bool IsDone { get; private set; }
public T Current { get { return this.InnerEnumerator.Current; } }
object System.Collections.IEnumerator.Current { get { return this.Current; } }
public void Dispose()
{
this.InnerEnumerator.Dispose();
this.IsDone = true;
}
public bool MoveNext()
{
var next = this.InnerEnumerator.MoveNext();
this.IsDone = !next;
return next;
}
public void Reset()
{
this.IsDone = false;
this.InnerEnumerator.Reset();
}
}
答案 2 :(得分:-1)
注意ReadLines将读取GB文件的所有行,而不仅仅是您自己的行 打印。 你真的需要并行吗?