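I am using a method that runs a long process and returns many results, but the correct result might be any one of them. Say it turns up after 300,000 results: the remaining 700,000 are still computed and returned whether or not they are needed. The results are checked in the following code: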
// A method that is supposed to return values on demand.
// Main might need only a few of the results, not all of them, so:
static IEnumerable<int> foo() {
    // Long recursive process; might produce over 1 million results if asked to yield them all.
    yield return ret;
}
static void Main(string[] args) {
    var a = foo();
    while (true) {
        var p = a.Take(300); // takes the first 300 on every pass of the while loop
        foreach (var c in p) {
            // does something with it
            if (bar == true) // if it is the right one:
                goto _break;
        }
    }
_break:
    Console.Read(); // pause
}
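Unfortunately, the code recomputes the same first 300 results over and over again.
How can I take just 300 results at a time without starting from the beginning again (using Skip(n) and then Take(n)), without converting the sequence to a collection, and with the IEnumerable structure remaining in the function foo?
Before I started using the yield approach, I had a linear, inefficient program that turned out to be faster than the new one. Nothing really changed except that I moved the contents of foo() into a separate method so that I could yield results one by one instead of collecting them all first and processing them afterwards.
Yet the performance is terrible: I'm talking about going from 300 ms to 700 ms.
I noticed that asking for all the results (foo().ToArray()) is even faster than using yield return and checking whether bar == true.
So what I want to do is take 300 -> sample them; if not found -> keep taking 300 at a time until found.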
static void Main(string[] args) {
    var a = loly();
    while (true) {
        var p = a.Take(3);
        foreach (var c in p) {
            Console.Write(c);
            if (c == 4)
                goto _break;
        }
    }
_break:
    Console.Read();
}
static IEnumerable<int> loly() {
    var l = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    for (int i = 0; i < 9; i++) {
        yield return l[i];
    }
}
This outputs: 123123123 and so on.
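The repeating output comes from the fact that every foreach over a.Take(3) asks the underlying sequence for a fresh enumerator, so the iterator body starts over each time. A minimal sketch that makes this visible, assuming a hypothetical counter field Starts and a hypothetical Numbers() iterator standing in for loly():

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RestartDemo
{
    // Hypothetical counter: how many times the iterator body has been started.
    public static int Starts;

    public static IEnumerable<int> Numbers()
    {
        Starts++; // runs on the first MoveNext() of every new enumerator
        for (int i = 1; i <= 9; i++)
            yield return i;
    }

    public static void Main()
    {
        var a = Numbers();
        for (int round = 0; round < 3; round++)
        {
            // Each foreach creates a new enumerator, so Take(3) always sees 1, 2, 3.
            foreach (var c in a.Take(3))
                Console.Write(c);
        }
        Console.WriteLine();
        Console.WriteLine("iterator started " + Starts + " times");
    }
}
```

Three rounds print 123123123 and the iterator body is started three times, which is why the position has to be kept in a single enumerator (or the items buffered in batches) rather than in the IEnumerable itself.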
using System;
using System.Collections.Generic;

class Program {
    static void Main(string[] args) {
        var j = 0;
        // Pull items three at a time from a single shared enumerator.
        var a = new EnumerationPartitioner<int>(loly().GetEnumerator());
        while (true) {
            foreach (var c in a.Pull(3)) {
                Console.WriteLine(c);
                Console.WriteLine("(" + (++j) + ")");
            }
            if (a.Ended)
                break;
        }
        // Same source again, this time batched seven at a time.
        foreach (var part in loly().ToInMemoryBatches(7)) {
            foreach (var c in part) {
                Console.WriteLine(c);
                Console.WriteLine("(" + (++j) + ")");
            }
        }
        Console.Read();
    }
    static IEnumerable<int> loly() {
        var l = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        for (int i = 0; i < 9; i++) {
            yield return l[i];
        }
    }
}
// Tallseth's method
public static class EnumerationPartitioner {
    public static IEnumerable<IEnumerable<T>> ToInMemoryBatches<T>(this IEnumerable<T> source, int batchSize) {
        List<T> batch = null;
        foreach (var item in source)
        {
            if (batch == null)
                batch = new List<T>();
            batch.Add(item);
            if (batch.Count != batchSize)
                continue;
            yield return batch;
            batch = null;
        }
        if (batch != null)
            yield return batch;
    }
}
// MarcinJuraszek's method
public class EnumerationPartitioner<T> : IEnumerable<T> {
    /// <summary>
    /// Has the enumeration ended?
    /// </summary>
    public bool Ended {
        get { return over; }
    }
    public IEnumerator<T> Enumerator { get; private set; }
    public EnumerationPartitioner(IEnumerator<T> _enum) {
        Enumerator = _enum;
    }
    /// <summary>
    /// Backing field for <see cref="Ended"/>
    /// </summary>
    private bool over = false;
    /// <summary>
    /// Number of items pulled from the <see cref="Enumerator"/> so far
    /// </summary>
    private int n = 0;
    /// <summary>
    /// Pulls up to <paramref name="count"/> items out of the <see cref="Enumerator"/>.
    /// </summary>
    /// <param name="count">Number of items to pull out of the <see cref="Enumerator"/></param>
    public List<T> Pull(int count) {
        var l = new List<T>();
        if (over) return l;
        for (int i = 0; i < count; i++, n++) {
            if ((Enumerator.MoveNext()) == false) {
                over = true;
                return l;
            }
            l.Add(Enumerator.Current);
        }
        return l;
    }
    /// <summary>
    /// Resets the Enumerator and clears internal counters; use this over a manual reset.
    /// Note: enumerators produced by yield-based iterator methods throw NotSupportedException on Reset().
    /// </summary>
    public void Reset() {
        n = 0;
        over = false;
        Enumerator.Reset();
    }
    public IEnumerator<T> GetEnumerator() {
        return Enumerator;
    }
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
        return Enumerator;
    }
}
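Answer 0 (score: 3)
I need to do this regularly. As Alexei alluded, what I want when dealing with a problem of this shape is a sequence of batches, each small enough to hold in memory.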
public static IEnumerable<IEnumerable<T>> ToInMemoryBatches<T>(this IEnumerable<T> source, int batchSize)
{
    List<T> batch = null;
    foreach (var item in source)
    {
        if (batch == null)
            batch = new List<T>();
        batch.Add(item);
        if (batch.Count != batchSize)
            continue;
        yield return batch;
        batch = null;
    }
    if (batch != null)
        yield return batch;
}
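For comparison, on .NET 6 or later the built-in Enumerable.Chunk extension performs the same kind of lazy batching (it yields arrays rather than lists). The sketch below is not part of the answer; it assumes .NET 6+, and the names Source, Produced and FindInBatches are hypothetical helpers used only to show that the source is not read to the end once a match is found:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkDemo
{
    // Hypothetical counter: how many items the source has produced so far.
    public static int Produced;

    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 9; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Scans the source batch by batch and returns the first matching item,
    // or null if the source runs out without a match.
    public static int? FindInBatches(IEnumerable<int> source, int batchSize, Func<int, bool> isMatch)
    {
        foreach (int[] batch in source.Chunk(batchSize)) // requires .NET 6 or later
            foreach (int item in batch)
                if (isMatch(item))
                    return item;
        return null;
    }

    public static void Main()
    {
        int? hit = FindInBatches(Source(), 3, x => x == 4);
        Console.WriteLine(hit + " found after producing " + Produced + " items");
    }
}
```

With a batch size of 3 and a match on 4, the source is read only to the end of the second batch (6 items), not all 9. The same early exit applies to ToInMemoryBatches, since both fill one batch at a time from a single enumerator.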
Answer 1 (score: 2)
You can use the enumerator directly instead of relying on a foreach loop:
static void Main(string[] args)
{
    var a = loly();
    var partitionSize = 3;
    using (var enumerator = a.GetEnumerator())
    {
        var values = new List<int>(partitionSize);
        // Three partitions of three cover the nine items of loly().
        for (int i = 0; i < 3; i++)
        {
            values.Clear();
            // Advance the same enumerator, so each partition continues where the last one stopped.
            for (int j = 0; j < partitionSize && enumerator.MoveNext(); j++)
            {
                values.Add(enumerator.Current);
            }
            foreach (var c in values)
            {
                Console.Write(c);
            }
        }
    }
    Console.Read();
}
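The loop above runs a fixed three rounds. A variation closer to what the question asks for keeps pulling partitions until either a match is found or the enumerator runs out. This is a sketch under assumptions, not the answerer's code: the Search method, its target parameter and the visited list are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorBatches
{
    public static IEnumerable<int> Loly()
    {
        for (int i = 1; i <= 9; i++)
            yield return i;
    }

    // Pulls partitionSize items at a time from one enumerator and stops at the
    // first item equal to target. Every examined item is appended to visited.
    public static bool Search(IEnumerable<int> source, int partitionSize, int target, List<int> visited)
    {
        using (var enumerator = source.GetEnumerator())
        {
            var values = new List<int>(partitionSize);
            while (true)
            {
                values.Clear();
                for (int j = 0; j < partitionSize && enumerator.MoveNext(); j++)
                    values.Add(enumerator.Current);

                if (values.Count == 0)
                    return false; // source exhausted without a match

                foreach (var c in values)
                {
                    visited.Add(c);
                    if (c == target)
                        return true;
                }
            }
        }
    }

    public static void Main()
    {
        var visited = new List<int>();
        bool found = Search(Loly(), 3, 4, visited);
        Console.WriteLine(string.Join("", visited) + " found=" + found);
    }
}
```

Run against the nine-item sequence with a partition size of 3 and a target of 4, this prints 1234 found=True: the second partition continues from 4 instead of restarting at 1.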
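Answer 2 (score: 0)
I wrote two methods for the case where the partition size is not fixed: one takes the partition sizes and the other takes the partition end indexes. If the last partition is not full, it is resized to fit.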
public static IEnumerable<T[]> PartitionBySize<T>(this IEnumerable<T> source, int[] sizes)
{
    using (var iter = source.GetEnumerator())
        foreach (var size in sizes)
            if (iter.MoveNext())
            {
                var chunk = new T[size];
                chunk[0] = iter.Current;
                int i = 1;
                for (; i < size && iter.MoveNext(); i++)
                    chunk[i] = iter.Current;
                if (i < size)
                    Array.Resize(ref chunk, i);
                yield return chunk;
            }
            else
                yield break;
}
public static IEnumerable<T[]> PartitionByIdx<T>(this IEnumerable<T> source, int[] indexes)
{
    int last = -1;
    using (var iter = source.GetEnumerator())
        foreach (var idx in indexes)
        {
            int size = idx - last;
            last = idx;
            if (iter.MoveNext())
            {
                var chunk = new T[size];
                chunk[0] = iter.Current;
                int i = 1;
                for (; i < size && iter.MoveNext(); i++)
                    chunk[i] = iter.Current;
                if (i < size)
                    Array.Resize(ref chunk, i);
                yield return chunk;
            }
            else
                yield break;
        }
}