We need to find the pairwise intersections of several sorted arrays of integers. Here is an example:
Input:
1,3,7,8
2,3,8,10
3,10,11,12,13,14
minSupport = 1
Output:
1 and 2: 3, 8
1 and 3: 3
2 and 3: 3, 10
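To make the expected result concrete, here is a minimal sketch that intersects each pair of arrays with a two-pointer merge. The helper name IntersectSorted is hypothetical, and the sketch assumes that minSupport means the minimum number of shared values a pair must have to be reported.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: intersects two ascending arrays of unique values
// by advancing whichever pointer sits on the smaller value.
static List<int> IntersectSorted(int[] a, int[] b)
{
    var result = new List<int>();
    int i = 0, j = 0;
    while (i < a.Length && j < b.Length)
    {
        if (a[i] < b[j]) i++;
        else if (a[i] > b[j]) j++;
        else { result.Add(a[i]); i++; j++; }
    }
    return result;
}

var arrays = new[]
{
    new[] { 1, 3, 7, 8 },
    new[] { 2, 3, 8, 10 },
    new[] { 3, 10, 11, 12, 13, 14 },
};
var minSupport = 1; // assumption: minimum number of shared values to report

for (int x = 0; x < arrays.Length; x++)
{
    for (int y = x + 1; y < arrays.Length; y++)
    {
        var shared = IntersectSorted(arrays[x], arrays[y]);
        if (shared.Count >= minSupport)
            Console.WriteLine($"{x + 1} and {y + 1}: {string.Join(", ", shared)}");
    }
}
```

This visits every pair, so it is only meant to pin down the expected output, not to be fast for thousands of arrays.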
I wrote the following algorithm, and it runs fast.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

var minSupport = 2;
var elementsCount = 10000;
var random = new Random(123);
// The numbers within each array are unique
var sortedArrays = Enumerable.Range(0, elementsCount)
    .Select(x => Enumerable.Range(0, 30).Select(t => random.Next(1000)).Distinct()
    .ToList()).ToList();
var result = new List<int[]>();
var resultIntersection = new List<List<int>>();
foreach (var array in sortedArrays)
{
    array.Sort();
}
var sw = Stopwatch.StartNew();
//****MAIN PART*****//
// The upper bound on the values an array can contain is known in advance.
// Of course, a dictionary can be used instead if maxValue is not known.
var maxValue = 1000;
// Inverted index: value -> indices of the arrays that contain it
var reverseIndexDict = new List<int>[maxValue];
for (int i = 0; i < maxValue; i++)
{
    reverseIndexDict[i] = new List<int>();
}
for (int i = 0; i < sortedArrays.Count; i++)
{
    for (int j = 0; j < sortedArrays[i].Count; j++)
    {
        reverseIndexDict[sortedArrays[i][j]].Add(i);
    }
}
// resultMatrix[i, k] holds the values shared by arrays i and k
var resultMatrix = new List<int>[sortedArrays.Count, sortedArrays.Count];
for (int i = 0; i < sortedArrays.Count; i++)
{
    for (int j = 0; j < sortedArrays[i].Count; j++)
    {
        var sortedArraysij = sortedArrays[i][j];
        for (int k = 0; k < reverseIndexDict[sortedArraysij].Count; k++)
        {
            var other = reverseIndexDict[sortedArraysij][k];
            if (resultMatrix[i, other] == null) resultMatrix[i, other] = new List<int>();
            resultMatrix[i, other].Add(sortedArraysij);
        }
    }
}
//*****************//
sw.Stop();
Console.WriteLine(sw.Elapsed);
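But when elementsCount is greater than 10000, my code crashes with an OutOfMemoryException. How can I improve the algorithm, or what else can I do to solve this problem?
Answer 0 (score: 0)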
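Use the Intersect method, like this:
...
// Seed with the first list; starting from an empty list would always give an empty result.
// Note: this yields the values common to all lists, not the pairwise intersections.
var theDistinctListOfInts = theListsOfInts.First().ToList();
foreach (var listOfInts in theListsOfInts.Skip(1))
{
    theDistinctListOfInts = theDistinctListOfInts.Intersect(listOfInts).ToList();
}
...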
Answer 1 (score: 0)
If you know the maximum value the arrays can contain, you can do the following:
var histoMatrix = new int[1000]; // values in the arrays are below 1000 here
// Count how many arrays contain each value (values are unique within an array)
for (int i = 0; i < sortedArrays.Count; i++)
{
    for (int j = 0; j < sortedArrays[i].Count; j++)
    {
        var sortedArraysij = sortedArrays[i][j];
        histoMatrix[sortedArraysij]++;
    }
}
var resultMatrix = new List<int>();
for (int i = 0; i < 1000; i++)
{
    // A value that occurs in every array belongs to the intersection
    if (histoMatrix[i] == sortedArrays.Count)
        resultMatrix.Add(i);
}
In this case you don't even need to sort the arrays.
Hope that helps.
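For the pairwise case in the question, the same counting idea can be combined with the question's inverted index so that no N x N matrix is ever allocated. With 10000 arrays that matrix alone holds 10^8 references, roughly 800 MB on a 64-bit runtime, before a single list is created. The following is only a sketch under stated assumptions: PairwiseIntersections is a hypothetical name, values are assumed to be unique within each array and to lie in [0, maxValue), and minSupport is assumed to be the minimum number of shared values a pair must have to be reported.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: lazily yields (i, j, shared values) for every pair i < j
// that shares at least minSupport values.
static IEnumerable<(int I, int J, List<int> Shared)> PairwiseIntersections(
    List<List<int>> arrays, int maxValue, int minSupport)
{
    // Inverted index: value -> indices of the arrays that contain it.
    var index = new List<int>[maxValue];
    for (int v = 0; v < maxValue; v++) index[v] = new List<int>();
    for (int i = 0; i < arrays.Count; i++)
        foreach (var v in arrays[i]) index[v].Add(i);

    for (int i = 0; i < arrays.Count; i++)
    {
        // Only one row of the result is kept in memory at a time.
        var row = new Dictionary<int, List<int>>();
        foreach (var v in arrays[i])
        {
            foreach (var j in index[v])
            {
                if (j <= i) continue; // skip the array itself and mirrored pairs
                if (!row.TryGetValue(j, out var shared))
                    row[j] = shared = new List<int>();
                shared.Add(v);
            }
        }
        foreach (var pair in row.OrderBy(p => p.Key))
            if (pair.Value.Count >= minSupport)
                yield return (i, pair.Key, pair.Value);
    }
}

var sample = new List<List<int>>
{
    new List<int> { 1, 3, 7, 8 },
    new List<int> { 2, 3, 8, 10 },
    new List<int> { 3, 10, 11, 12, 13, 14 },
};
foreach (var (a, b, common) in PairwiseIntersections(sample, maxValue: 15, minSupport: 1))
    Console.WriteLine($"{a + 1} and {b + 1}: {string.Join(", ", common)}");
```

Because the pairs are streamed, memory stays bounded by the inverted index plus one row, provided the caller processes or writes out each pair instead of collecting all of them in a list.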