List that uses multiple chunks

Time: 2013-02-07 11:55:31

Tags: c# .net

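List is an array wrapper. When you add items to a list, it allocates bigger and bigger inner arrays (the previous arrays are garbage collected). But if you work with large lists, at some point you will get an OutOfMemoryException even though there is free memory, because of memory fragmentation. I'm looking for an ICollection implementation that works with a set of arrays, similar to what MemoryTributary does.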
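To make the growth behaviour described above visible, here is a minimal sketch that records every distinct Capacity value a List<int> reports while items are added. The class and method names (ListGrowthDemo, ObserveCapacities) are hypothetical and exist only for this illustration; the exact capacities depend on the runtime, so none are claimed here.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    // Returns each distinct Capacity value observed while adding itemCount items.
    public static List<int> ObserveCapacities(int itemCount)
    {
        var list = new List<int>();
        var capacities = new List<int>();
        var lastCapacity = -1;

        for (var i = 0; i < itemCount; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // A capacity change means a new, larger inner array was allocated
                // and the old contents were copied into it.
                lastCapacity = list.Capacity;
                capacities.Add(lastCapacity);
            }
        }

        return capacities;
    }
}
```

Calling ListGrowthDemo.ObserveCapacities(1000) and printing the result shows how many times the inner array was reallocated. Each reallocation needs one contiguous block, which is where fragmentation starts to matter for very large lists.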

Update

I found a BigArray implementation here:

http://blogs.msdn.com/b/joshwil/archive/2005/08/10/450202.aspx

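Although it tries to solve a different problem (creating an array larger than 2 GB), it also solves mine. But this implementation is incomplete and does not even compile. So if I don't find anything better, I will improve it and use it.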

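2 answers:

Answer 0: (score: 1)

If you have a rough estimate of the maximum size, you can initialize the List with a capacity. As far as I know, this avoids heavy garbage collection (of the old, smaller arrays), which may also avoid memory fragmentation.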
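A minimal sketch of this suggestion, assuming the expected item count is known up front. PresizedListDemo and CapacityStaysFixed are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PresizedListDemo
{
    // Fills a list created with an explicit capacity and reports whether
    // the capacity (and therefore the inner array) stayed the same.
    public static bool CapacityStaysFixed(int expectedCount)
    {
        // The List<T>(int capacity) constructor allocates the inner array once.
        var list = new List<int>(expectedCount);
        var initialCapacity = list.Capacity;

        for (var i = 0; i < expectedCount; i++)
            list.Add(i);

        // No reallocation is needed as long as Count does not exceed the capacity.
        return list.Capacity == initialCapacity;
    }
}
```

Note that the pre-sized list still needs one contiguous array of the full size, so this reduces intermediate garbage but does not remove the need for a single large allocation.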

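Answer 1: (score: 0)

I did not find any good implementation, so I wrote my own ChunkyList. Essentially, ChunkyList is a wrapper around a list of arrays. Initially the block size is 1, but each time the current block needs to be extended it is doubled (similar to the behavior of List); once it reaches MaxBlockSize, the next block is created.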
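The indexing and growth rules just described can be sketched in isolation. This is only an illustration under the assumption that every block except the last is full; the helper names (ChunkMath, BlockIndex, OffsetInBlock, NextBlockSize) are hypothetical.

```csharp
using System;

public static class ChunkMath
{
    // Which block a global index falls into, assuming all earlier blocks are full.
    public static int BlockIndex(int index, int maxBlockSize)
    {
        return index / maxBlockSize;
    }

    // Position of a global index inside its block.
    public static int OffsetInBlock(int index, int maxBlockSize)
    {
        return index % maxBlockSize;
    }

    // Size of the replacement array when the current block is full:
    // double the size, but never exceed maxBlockSize.
    public static int NextBlockSize(int currentSize, int maxBlockSize)
    {
        // The multiplication is done in long so that it cannot overflow int.
        return (int)Math.Min((long)currentSize * 2, maxBlockSize);
    }
}
```

For example, with a maxBlockSize of 4, global index 5 maps to block 1 at offset 1, and a block grows through the sizes 1, 2, 4 before a new block is started.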

Here is a generic ChunkyList:

using System;
using System.Collections;
using System.Collections.Generic;

public class ChunkyList<T> : IList<T>
{
    public ChunkyList()
    {
        MaxBlockSize = 65536;
    }

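    public ChunkyList(int maxBlockSize)
    {
        // A non-positive block size would cause a division by zero in GetBlockIndex.
        if (maxBlockSize <= 0)
            throw new ArgumentOutOfRangeException("maxBlockSize");

        MaxBlockSize = maxBlockSize;
    }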

    private List<T[]> _blocks = new List<T[]>();

    public int Count { get; private set; }

    public int MaxBlockSize { get; private set; }

    public bool IsReadOnly
    {
        // The list supports Add, so it is not read-only.
        get { return false; }
    }

    public IEnumerator<T> GetEnumerator()
    {
        var index = 0;
        foreach (var items in _blocks)
            foreach (var item in items)
            {
                yield return item;
                if (Count <= ++index)
                    break;
            }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public void Add(T item)
    {
        var indexInsideBlock = GetIndexInsideBlock(Count);
        if (indexInsideBlock == 0)
            // The previous block is full (or there is no block yet): start a new block of size 1.
            _blocks.Add(new T[1]);
        else
        {
            var lastBlockIndex = _blocks.Count - 1;
            var lastBlock = _blocks[lastBlockIndex];
            if (indexInsideBlock >= lastBlock.Length)
            {
                // The last block is full but still smaller than MaxBlockSize:
                // double it, capped at MaxBlockSize (computed in long to avoid int overflow).
                var newBlockSize = (int)Math.Min((long)lastBlock.Length * 2, MaxBlockSize);

                _blocks[lastBlockIndex] = new T[newBlockSize];
                Array.Copy(lastBlock, _blocks[lastBlockIndex], lastBlock.Length);
            }
        }

        _blocks[GetBlockIndex(Count)][indexInsideBlock] = item;

        Count++;
    }

    public void AddRange(IEnumerable<T> items)
    {
        foreach (var item in items)
            Add(item);
    }

    public void Clear()
    {
        throw new NotImplementedException();
    }

    public bool Contains(T item)
    {
        throw new NotImplementedException();
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        throw new NotImplementedException();
    }

    public bool Remove(T item)
    {
        throw new NotImplementedException();
    }

    public int IndexOf(T item)
    {
        throw new NotImplementedException();
    }

    public void Insert(int index, T item)
    {
        throw new NotImplementedException();
    }

    public void RemoveAt(int index)
    {
        throw new NotImplementedException();
    }

    public T this[int index]
    {
        get
        {
            // Reject negative indexes as well as indexes past the last item.
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException("index");

            var blockIndex = GetBlockIndex(index);
            var block = _blocks[blockIndex];

            return block[GetIndexInsideBlock(index)];
        }
        set { throw new NotImplementedException(); }
    }

    private int GetBlockIndex(int index)
    {
        return index / MaxBlockSize;
    }

    private int GetIndexInsideBlock(int index)
    {
        return index % MaxBlockSize;
    }
}

Here are tests showing that this implementation works:

    [TestClass]
    public class ChunkyListTests
    {
        [TestMethod]
        public void GetEnumerator_NoItems()
        {
            var chunkyList = new ChunkyList<float>();

            var wasInsideForeach = false;
            foreach (var item in chunkyList)
                wasInsideForeach = true;

            Assert.IsFalse(wasInsideForeach);
        }

        [TestMethod]
        public void GetEnumerator_MaxBlockSizeOfThreeWithThreeItems()
        {
            var chunkyList = new ChunkyList<float>(3) { 1, 2, 3 };

            var wasInsideForeach = false;
            var iteratedItems = new List<float>();
            foreach (var item in chunkyList)
            {
                wasInsideForeach = true;
                iteratedItems.Add(item);
            }

            Assert.IsTrue(wasInsideForeach);
            CollectionAssert.AreEqual(new List<float> { 1, 2, 3 }, iteratedItems);
        }

        [TestMethod]
        public void GetEnumerator_MaxBlockSizeOfTwoWithThreeItems()
        {
            var chunkyList = new ChunkyList<float>(2) { 1, 2, 3 };
            var wasInsideForeach = false;
            var iteratedItems = new List<float>();

            foreach (var item in chunkyList)
            {
                wasInsideForeach = true;
                iteratedItems.Add(item);
            }

            Assert.IsTrue(wasInsideForeach);
            CollectionAssert.AreEqual(new List<float>() { 1, 2, 3 }, iteratedItems);
            Assert.AreEqual(2, chunkyList.MaxBlockSize);
        }
    }

P.S. I have implemented only those IList methods that are used in my code. So you are welcome to improve this implementation.
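As one possible starting point for such improvements, here is a hedged sketch of how IndexOf and CopyTo could work over a list of blocks. It is written as standalone static helpers rather than as members of ChunkyList, so the names (ChunkedOps and its methods) and the blocks/count parameters are assumptions made for illustration, and it relies on every block except the last being full.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkedOps
{
    // Returns the global index of the first matching item, or -1 if it is absent.
    // Only the first count slots are considered; unused slots in the last block are skipped.
    public static int IndexOf<T>(IList<T[]> blocks, int count, T item)
    {
        var comparer = EqualityComparer<T>.Default;
        var index = 0;
        foreach (var block in blocks)
        {
            for (var i = 0; i < block.Length && index < count; i++, index++)
            {
                if (comparer.Equals(block[i], item))
                    return index;
            }
        }
        return -1;
    }

    // Copies the first count items into array, starting at arrayIndex, one block at a time.
    public static void CopyTo<T>(IList<T[]> blocks, int count, T[] array, int arrayIndex)
    {
        if (array == null)
            throw new ArgumentNullException("array");
        if (arrayIndex < 0)
            throw new ArgumentOutOfRangeException("arrayIndex");
        if (array.Length - arrayIndex < count)
            throw new ArgumentException("Destination array is too small.");

        var remaining = count;
        foreach (var block in blocks)
        {
            if (remaining == 0)
                break;

            // The last block may be only partially used, so copy at most what remains.
            var toCopy = Math.Min(block.Length, remaining);
            Array.Copy(block, 0, array, arrayIndex, toCopy);
            arrayIndex += toCopy;
            remaining -= toCopy;
        }
    }
}
```

Inside ChunkyList, the same loops could read the private _blocks field and the Count property directly; Contains could then be expressed as IndexOf(item) >= 0.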