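I am looking for the fastest way to split an array into chunks of a fixed size (the last one can be smaller, of course). I searched the site and did not find any performance comparisons, so I wrote them myself. Here are the results:
Time in microseconds, mean / error / stddev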
int[]
- 30.02 | 0.1002 | 0.0937
IEnumerable<int>
- 76.67 | 0.2146 | 0.1902
Update: the version below (in @Markus's answer) takes 139.5 | 0.6702 | 0.5597
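The approach that is most popular on SO and frequently recommended, LINQ's GroupBy keyed on index/chunkSize, is a no-go: at 267 microseconds it is far slower than any of the implementations above.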
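For reference, the GroupBy pattern mentioned above usually looks something like the sketch below. This is an illustration of the commonly suggested idiom, not the exact code that was benchmarked; the class and method names (`GroupByChunking`, `AsChunksGroupBy`) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByChunking
{
    // Hypothetical name: a sketch of the widely recommended LINQ idiom.
    public static IEnumerable<T[]> AsChunksGroupBy<T>(this IEnumerable<T> source, int chunkSize)
    {
        return source
            .Select((item, index) => new { item, index })   // pair each item with its position
            .GroupBy(x => x.index / chunkSize)              // integer division gives the chunk number
            .Select(g => g.Select(x => x.item).ToArray());  // unwrap each group into an array
    }
}
```

It allocates a wrapper object per element and buffers the whole sequence into a lookup before the first chunk is produced, which is a plausible reason for the slower timing.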
Is there any faster way to split arrays?
P.S. Here is the code for Array and IEnumerable<T>:
/// <summary>
/// Splits <paramref name="source"/> into chunks of size not greater than <paramref name="chunkMaxSize"/>
/// </summary>
/// <typeparam name="T">Type of the array elements</typeparam>
/// <param name="source">Array to be split</param>
/// <param name="chunkMaxSize">Max size of chunk</param>
/// <returns><see cref="IEnumerable{T}"/> of <see cref="Array"/> of <typeparamref name="T"/></returns>
public static IEnumerable<T[]> AsChunks<T>(this T[] source, int chunkMaxSize)
{
    var pos = 0;
    var sourceLength = source.Length;
    do
    {
        // Length of the next chunk; shorter than chunkMaxSize only at the end of the array.
        var len = Math.Min(pos + chunkMaxSize, sourceLength) - pos;
        if (len == 0)
        {
            yield break;
        }
        var arr = new T[len];
        Array.Copy(source, pos, arr, 0, len);
        pos += len;
        yield return arr;
    } while (pos < sourceLength);
}
/// <summary>
/// Splits <paramref name="source"/> into chunks of size not greater than <paramref name="chunkMaxSize"/>
/// </summary>
/// <typeparam name="T">Type of the sequence elements</typeparam>
/// <param name="source"><see cref="IEnumerable{T}"/> to be split</param>
/// <param name="chunkMaxSize">Max size of chunk</param>
/// <returns><see cref="IEnumerable{T}"/> of <see cref="Array"/> of <typeparamref name="T"/></returns>
public static IEnumerable<T[]> AsChunks<T>(this IEnumerable<T> source, int chunkMaxSize)
{
    var arr = new T[chunkMaxSize];
    var pos = 0;
    foreach (var item in source)
    {
        arr[pos++] = item;
        if (pos == chunkMaxSize)
        {
            // Chunk is full: hand it out and start a fresh buffer.
            yield return arr;
            arr = new T[chunkMaxSize];
            pos = 0;
        }
    }
    if (pos > 0)
    {
        // Trim the partially filled last chunk to its actual size.
        Array.Resize(ref arr, pos);
        yield return arr;
    }
}
P.P.S. The complete solution with BenchmarkDotNet tests is on GitHub.
Answer 0 (score: 6)
Not sure how this stacks up (ArraySegment), but give it a try. I have avoided using iterators, though I am not sure whether that is really a gain.
public static IEnumerable<T[]> AsChunks<T>(
    this T[] source, int chunkMaxSize)
{
    var chunks = source.Length / chunkMaxSize;
    var leftOver = source.Length % chunkMaxSize;
    var result = new List<T[]>(chunks + 1);
    var offset = 0;

    for (var i = 0; i < chunks; i++)
    {
        result.Add(new ArraySegment<T>(source,
                                       offset,
                                       chunkMaxSize).ToArray());
        offset += chunkMaxSize;
    }

    if (leftOver > 0)
    {
        result.Add(new ArraySegment<T>(source,
                                       offset,
                                       leftOver).ToArray());
    }

    return result;
}
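The code above still copies every chunk through ToArray(). If the caller can work with views over the original array instead of independent copies, a variant that yields the ArraySegment<T> values directly avoids those allocations. The following is only a sketch under that assumption; `SegmentChunking` and `AsSegments` are hypothetical names, and it has not been measured against the numbers in the question.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentChunking
{
    // Hypothetical helper: yields views over the source array instead of copies.
    public static IEnumerable<ArraySegment<T>> AsSegments<T>(this T[] source, int chunkMaxSize)
    {
        // Validate eagerly; the iterator body below only runs on enumeration.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkMaxSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkMaxSize));
        return Iterate(source, chunkMaxSize);
    }

    private static IEnumerable<ArraySegment<T>> Iterate<T>(T[] source, int chunkMaxSize)
    {
        var offset = 0;
        while (offset < source.Length)
        {
            // The last segment may be shorter than chunkMaxSize.
            var count = Math.Min(chunkMaxSize, source.Length - offset);
            yield return new ArraySegment<T>(source, offset, count);
            offset += count;
        }
    }
}
```

Each segment shares storage with the source, so later writes to the source array are visible through the segments. Use the copying version when the chunks must be independent.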
Answer 1 (score: 0)
I used this code:
Source: http://blogs.msdn.com/b/pfxteam/archive/2012/11/16/plinq-and-int32-maxvalue.aspx
public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int batchSize) {
    if (batchSize <= 0) {
        throw new ArgumentOutOfRangeException(nameof(batchSize));
    }
    using (var enumerator = source.GetEnumerator()) {
        while (enumerator.MoveNext()) {
            yield return YieldBatchElements(enumerator, batchSize - 1);
        }
    }
}

private static IEnumerable<T> YieldBatchElements<T>(IEnumerator<T> source, int batchSize) {
    yield return source.Current;
    for (var i = 0; i < batchSize && source.MoveNext(); i++) {
        yield return source.Current;
    }
}
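One thing to keep in mind with this Batch method is that every inner sequence reads from the same shared enumerator, so each batch has to be consumed completely before the next one is requested. Calling ToList() on the outer sequence first, for example, advances the enumerator before any batch has been read. The sketch below repeats the Batch code so it compiles on its own and adds a hypothetical helper, `ToArrayBatches`, that shows the safe consumption order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchDemo
{
    // Same logic as the answer's Batch/YieldBatchElements, repeated so this sketch is self-contained.
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }
        using (var enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                yield return YieldBatchElements(enumerator, batchSize - 1);
            }
        }
    }

    private static IEnumerable<T> YieldBatchElements<T>(IEnumerator<T> source, int batchSize)
    {
        yield return source.Current;
        for (var i = 0; i < batchSize && source.MoveNext(); i++)
        {
            yield return source.Current;
        }
    }

    // Hypothetical helper: copies each batch into an array before asking for the next one.
    public static List<T[]> ToArrayBatches<T>(IEnumerable<T> source, int batchSize)
    {
        var result = new List<T[]>();
        foreach (var batch in source.Batch(batchSize))
        {
            // ToArray() drains the current batch, which moves the shared enumerator
            // to the right position for the next iteration of the outer loop.
            result.Add(batch.ToArray());
        }
        return result;
    }
}
```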
Answer 2 (score: 0)
You can use these few lines of code:
// Requires: using System.Linq;
int chunkSize = 4;
string[] words = { "one", "two", "three", "four", "five", "six" };

// Number of chunks, rounded up so that a shorter final chunk is included.
int chunkCount = (words.Length + chunkSize - 1) / chunkSize;
string[][] chunkResult = new string[chunkCount][];

for (int index = 0; index < chunkCount; index++)
{
    // Skip the items already consumed, then take up to chunkSize items.
    chunkResult[index] = words.Skip(index * chunkSize).Take(chunkSize).ToArray();
}
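On newer runtimes the same result can be obtained without a hand-written loop. The sketch below assumes .NET 6 or later, where Enumerable.Chunk is part of System.Linq. The thread itself does not say which runtime is targeted, and `BuiltInChunkDemo` is a hypothetical wrapper added only for illustration.

```csharp
using System;
using System.Linq;

public static class BuiltInChunkDemo
{
    // Hypothetical wrapper around Enumerable.Chunk (assumes .NET 6 or later).
    public static string[][] Split(string[] words, int chunkSize)
    {
        // Chunk yields arrays of at most chunkSize items; the last one may be shorter.
        return words.Chunk(chunkSize).ToArray();
    }
}
```

No timing is claimed for it here; it would need to be benchmarked alongside the other versions.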