I have 2 huge lists (over 2,000 entries each).
I would like to parse and compare them.
The first list looks like this:
zone "exampledomain.com" {
zone "exampledomain2.com" {
zone "exampledomain3.com" {
zone "exampledomain4.com" {
zone "exampledomain5.com" {
zone "exampledomain6.com" {
zone "exampledomain7.com" {
The other list looks like this:
zone "exampledomain.com" {
zone "exampledomain3.com" {
zone "exampledomain5.com" {
zone "exampledomain7.com" {
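Both lists use the same format, zone "____" {. I want to parse them so that I can compare the domains and get the difference, so I know what each list is missing. They should both contain the same entries.
I have come across this code: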
static void Main(string[] args)
{
    string s1 = "i have a car a car";
    string s2 = "i have a new car bmw";
    List<string> diff;
    IEnumerable<string> set1 = s1.Split(' ').Distinct();
    IEnumerable<string> set2 = s2.Split(' ').Distinct();
    if (set2.Count() > set1.Count())
    {
        diff = set2.Except(set1).ToList();
    }
    else
    {
        diff = set1.Except(set2).ToList();
    }
}
But I am wondering what the best way to do this is, given that each list has over 2,000 lines.
Answer 0: (score: 0)
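The example you provided only shows you list 1 with the items from list 2 removed. If you also want what is in list 2 but not in list 1, you will have to run two queries, one in each direction.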
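var difference1 = list1.Except(list2);
var difference2 = list2.Except(list1);
I am not sure what code runs when Except executes, but if you would like to see an implementation that generates the two lists of differences, here is one solution. I don't know how LINQ manages to do this faster, but my routine accounts for duplicate entries, such as the value "1" in the usage example at the end of this answer, where LINQ won't. So keep that in mind when choosing which to use, rather than looking only at speed.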
//Requires: using System; using System.Collections.Generic; using System.Linq;
static void Differerence(
    IEnumerable<string> source1, IEnumerable<string> source2,
    out List<string> difference1, out List<string> difference2)
{
    //Move the data from the sources into ordered queues
    var sourceValues1 = new Queue<string>(source1.OrderBy(x => x));
    var sourceValues2 = new Queue<string>(source2.OrderBy(x => x));
    difference1 = new List<string>();
    difference2 = new List<string>();
    while (sourceValues1.Count > 0 && sourceValues2.Count > 0)
    {
        string value1 = sourceValues1.Peek();
        string value2 = sourceValues2.Peek();
        //string.Compare only guarantees the sign of its result, so normalize it to -1, 0 or 1
        switch (Math.Sign(string.Compare(value1, value2)))
        {
            //If they match then don't add difference to either list
            case 0:
                sourceValues1.Dequeue();
                sourceValues2.Dequeue();
                break;
            //The left queue has the lowest value, record that and move on
            case -1:
                difference1.Add(value1);
                sourceValues1.Dequeue();
                break;
            //The right queue has the lowest value, record that and move on
            case 1:
                difference2.Add(value2);
                sourceValues2.Dequeue();
                break;
        }
    }
    //At least one of the queues is empty, so everything left in the other queue is a difference
    difference1.AddRange(sourceValues1);
    difference2.AddRange(sourceValues2);
}
You could easily combine the two results into one if you need to.
static void Main(string[] args)
{
    var list1 = new string[] { "1", "1", "3", "5", "7", "9" };
    var list2 = new string[] { "1", "2", "4", "6", "9", "10" };
    var difference1 = list1.Except(list2); //3, 5, 7 (Except drops the duplicate "1")
    var difference2 = list2.Except(list1); //2, 4, 6, 10
    List<string> differenceX1;
    List<string> differenceX2;
    //differenceX1 additionally reports the unmatched second "1"
    Differerence(list1, list2, out differenceX1, out differenceX2);
}
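One possible way to combine the two results into a single list is sketched below. The class name DifferenceReport, the method name Combine and the "list1 only" / "list2 only" labels are made up for illustration; the sketch only assumes you already have the two one-sided differences, from either Except or the routine above.

```csharp
using System.Collections.Generic;
using System.Linq;

static class DifferenceReport
{
    // Merges the two one-sided differences into one list,
    // prefixing each value with the list it was found in.
    public static List<string> Combine(
        IEnumerable<string> onlyInFirst, IEnumerable<string> onlyInSecond)
    {
        return onlyInFirst.Select(v => "list1 only: " + v)
            .Concat(onlyInSecond.Select(v => "list2 only: " + v))
            .ToList();
    }
}
```

Calling DifferenceReport.Combine(differenceX1, differenceX2) would then give one list that still tells you which side each entry is missing from.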
Answer 1: (score: -1)
HashSets are meant for collections of unique elements:
https://msdn.microsoft.com/en-us/library/bb359438(v=vs.110).aspx
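As a minimal sketch of how that could look for the zone lists in the question: the names ZoneDiff, ParseZones and MissingFrom are made up for illustration, and the regular expression assumes every entry has the form zone "name" {. Comparing domains case-insensitively is also an assumption, not something the question states.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ZoneDiff
{
    // Matches lines such as: zone "exampledomain.com" {
    static readonly Regex ZoneLine =
        new Regex("^\\s*zone\\s+\"([^\"]+)\"", RegexOptions.Compiled);

    // Collects the domain of every matching line into a case-insensitive set.
    // Lines that do not look like a zone entry are skipped.
    public static HashSet<string> ParseZones(IEnumerable<string> lines)
    {
        var zones = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            Match m = ZoneLine.Match(line);
            if (m.Success)
            {
                zones.Add(m.Groups[1].Value);
            }
        }
        return zones;
    }

    // Returns the domains present in 'have' but absent from 'other',
    // leaving both input sets unchanged.
    public static HashSet<string> MissingFrom(
        HashSet<string> have, HashSet<string> other)
    {
        var missing = new HashSet<string>(have, have.Comparer);
        missing.ExceptWith(other);
        return missing;
    }
}
```

You would feed each file to ParseZones, for example with File.ReadLines on whatever your two files are called, and then call MissingFrom once in each direction. With a few thousand entries per list, the set lookups make this effectively instant.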