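Question:
I have two arrays that may differ in length. I need to iterate over both and find which items are the same, which were added, and which were removed.
What is the fastest and most efficient way to do this in C#?
Edit: The arrays are pre-sorted and can contain anywhere between 50 and 100 items. Also, there are no constraints on speed and/or memory usage (though nobody likes a memory hog ;)
For example: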
String[] Foo_Old = {"test1", "test2", "test3"};
String[] Foo_New = {"test1", "test2", "test4", "test5"};
and
String[] Bar_Old = {"test1", "test2", "test4"};
String[] Bar_New = {"test1", "test3"};
Differences:
(with respect to the Foo_New array)
[Same]    "test1"
[Same]    "test2"
[Removed] "test3"
[Added]   "test4"
[Added]   "test5"
(with respect to the Bar_New array)
[Same]    "test1"
[Removed] "test2"
[Removed] "test4"
[Added]   "test3"
Answer 0: (score: 18)
// Requires: using System; using System.Linq;
var Foo_Old = new[] { "test1", "test2", "test3" };
var Foo_New = new[] { "test1", "test2", "test4", "test5" };

// Except and Intersect are set operations, so duplicate values are reported once.
var diff = Foo_New.Except(Foo_Old);     // in new but not in old
var inter = Foo_New.Intersect(Foo_Old); // in both
var rem = Foo_Old.Except(Foo_New);      // in old but not in new

foreach (var s in diff)
{
    Console.WriteLine("Added " + s);
}
foreach (var s in inter)
{
    Console.WriteLine("Same " + s);
}
foreach (var s in rem)
{
    Console.WriteLine("Removed " + s);
}
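Answer 1: (score: 3)
I went ahead and hand-coded a version and benchmarked it against the example in the accepted answer; the hand-coded one performs slightly better. I handle outputting my strings a little differently. Other factors to consider include whether Except makes a sorted copy of the array (since it cannot assume the input is already sorted), or whether it builds some kind of hash or does a linear search (it is really only restricted to IEnumerable, which could be a problem for very large arrays that are already sorted). You could change mine to compare IEnumerable (which is more general) instead of IComparable[].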
// Requires: using System; using System.Diagnostics; using System.Linq;
// One-pass comparison; both arrays must already be sorted.
static void ArrayCompare(IComparable[] Old, IComparable[] New)
{
    int lpOld = 0;
    int lpNew = 0;
    int OldLength = Old.Length;
    int NewLength = New.Length;
    while (lpOld < OldLength || lpNew < NewLength)
    {
        int compare;
        // When one array is exhausted, force the branch that drains the other.
        if (lpOld >= OldLength) compare = 1;
        else if (lpNew >= NewLength) compare = -1;
        else compare = Old[lpOld].CompareTo(New[lpNew]);
        if (compare < 0)
        {
            Debug.WriteLine(string.Format("[Removed] {0}", Old[lpOld].ToString()));
            lpOld++;
        }
        else if (compare > 0)
        {
            Debug.WriteLine(string.Format("[Added] {0}", New[lpNew].ToString()));
            lpNew++;
        }
        else
        {
            Debug.WriteLine(string.Format("[Same] {0}", Old[lpOld].ToString()));
            lpOld++;
            lpNew++;
        }
    }
}
static void ArrayCompare2(IComparable[] Old, IComparable[] New)
{
    var diff = New.Except(Old);
    var inter = New.Intersect(Old);
    var rem = Old.Except(New);
    foreach (var s in diff)
    {
        Debug.WriteLine("Added " + s);
    }
    foreach (var s in inter)
    {
        Debug.WriteLine("Same " + s);
    }
    foreach (var s in rem)
    {
        Debug.WriteLine("Removed " + s);
    }
}
static void Main(string[] args)
{
    String[] Foo_Old = {"test1", "test2", "test3"};
    String[] Foo_New = {"test1", "test2", "test4", "test5"};
    String[] Bar_Old = {"test1", "test2", "test4"};
    String[] Bar_New = {"test1", "test3"};

    // Note: both timed loops include the Debug.WriteLine calls,
    // so the cost of producing output is part of each measurement.
    Stopwatch w1 = new Stopwatch();
    w1.Start();
    for (int lp = 0; lp < 10000; lp++)
    {
        ArrayCompare(Foo_Old, Foo_New);
        ArrayCompare(Bar_Old, Bar_New);
    }
    w1.Stop();

    Stopwatch w2 = new Stopwatch();
    w2.Start();
    for (int lp = 0; lp < 10000; lp++)
    {
        ArrayCompare2(Foo_Old, Foo_New);
        ArrayCompare2(Bar_Old, Bar_New);
    }
    w2.Stop();

    Debug.WriteLine(w1.Elapsed.ToString());
    Debug.WriteLine(w2.Elapsed.ToString());
}
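The closing suggestion in this answer (comparing IEnumerable instead of IComparable[]) could look roughly like the sketch below. It is only an illustration, not the answerer's code: the names SortedDiff and Compare are made up here, the result is returned as tagged pairs instead of being written to Debug, and both inputs are assumed to be sorted according to the comparer that is passed in.

```csharp
using System;
using System.Collections.Generic;

static class SortedDiff
{
    // Hypothetical helper (not from the answer): one-pass diff of two
    // sequences that are ASSUMED to be sorted according to 'comparer'.
    public static IEnumerable<KeyValuePair<string, T>> Compare<T>(
        IEnumerable<T> oldItems, IEnumerable<T> newItems, IComparer<T> comparer = null)
    {
        comparer = comparer ?? Comparer<T>.Default;
        using (IEnumerator<T> o = oldItems.GetEnumerator())
        using (IEnumerator<T> n = newItems.GetEnumerator())
        {
            bool hasOld = o.MoveNext();
            bool hasNew = n.MoveNext();
            while (hasOld || hasNew)
            {
                // An exhausted side is treated as "greater" so the other side drains.
                int c = !hasOld ? 1 : !hasNew ? -1 : comparer.Compare(o.Current, n.Current);
                if (c < 0)
                {
                    yield return new KeyValuePair<string, T>("Removed", o.Current);
                    hasOld = o.MoveNext();
                }
                else if (c > 0)
                {
                    yield return new KeyValuePair<string, T>("Added", n.Current);
                    hasNew = n.MoveNext();
                }
                else
                {
                    yield return new KeyValuePair<string, T>("Same", o.Current);
                    hasOld = o.MoveNext();
                    hasNew = n.MoveNext();
                }
            }
        }
    }
}
```

It could be called as SortedDiff.Compare(Foo_Old, Foo_New, StringComparer.Ordinal). Passing the comparer explicitly is a design choice of this sketch: the comparison used for the merge has to match the order in which the inputs were sorted, otherwise items can be misreported.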
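Answer 2: (score: 1)
Since your arrays are sorted, you should be able to walk through both arrays simultaneously and determine, in a single pass, whether each element is in the other array. (This is similar to the merge step in merge sort.) Here is an example: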
string[] oldVersion = { "test1", "test2", "test3" };
string[] newVersion = { "test1", "test2", "test4", "test5" };
int oldIndex = 0, newIndex = 0;
while ((oldIndex < oldVersion.Length) && (newIndex < newVersion.Length)) {
    int comparison = oldVersion[oldIndex].CompareTo(newVersion[newIndex]);
    if (comparison < 0)
        Console.WriteLine("[Removed]\t" + oldVersion[oldIndex++]);
    else if (comparison > 0)
        Console.WriteLine("[Added]\t\t" + newVersion[newIndex++]);
    else {
        Console.WriteLine("[Same]\t\t" + oldVersion[oldIndex++]);
        newIndex++;
    }
}
while (oldIndex < oldVersion.Length)
    Console.WriteLine("[Removed]\t" + oldVersion[oldIndex++]);
while (newIndex < newVersion.Length)
    Console.WriteLine("[Added]\t\t" + newVersion[newIndex++]);
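Otherwise, without the sorted order to rely on, you would need to walk through one array and, for each of its elements, make a pass over the other array looking for a match.
Edit: JP made a good suggestion on how to do this with the framework. Still, assuming the arrays are sorted, the benefit of my approach is that a single pass finds all the results; you don't have to make three passes.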
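For contrast, the fallback described above for input that is not sorted could be sketched as follows. This is an illustration only: UnsortedDiff and its Compare method are hypothetical names, and the results are collected into a list instead of being printed. Every element triggers a linear scan of the other array, so the work grows with the product of the two lengths.

```csharp
using System;
using System.Collections.Generic;

static class UnsortedDiff
{
    // Hypothetical helper: quadratic comparison that makes no
    // assumption about the ordering of either array.
    public static List<string> Compare(string[] oldItems, string[] newItems)
    {
        var lines = new List<string>();
        foreach (string o in oldItems)
        {
            // Array.IndexOf performs a linear scan of the other array.
            bool found = Array.IndexOf(newItems, o) >= 0;
            lines.Add((found ? "[Same] " : "[Removed] ") + o);
        }
        foreach (string n in newItems)
        {
            // Anything in the new array with no match in the old one was added.
            if (Array.IndexOf(oldItems, n) < 0)
                lines.Add("[Added] " + n);
        }
        return lines;
    }
}
```

With the 50 to 100 items mentioned in the question, this amounts to at most a few tens of thousands of string comparisons, so the difference from the single-pass version mainly matters for much larger inputs.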
Answer 3: (score: 1)
I wrote this a while back:
Usage:
foreach (var diff in Foo_Old.Diff(Foo_New)) {
    Console.WriteLine("{0} action performed on {1}", diff.DiffAction, diff.Value);
}
Implementation:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LinqExtensions {

    enum DiffAction {
        Added,
        Removed,
        Same
    }

    class DiffPair<T> {
        public T Value { get; set; }
        public DiffAction DiffAction { get; set; }
    }

    static class DiffExtension {
        public static IEnumerable<DiffPair<T>> Diff<T>
                 (
                     this IEnumerable<T> original,
                     IEnumerable<T> target
                 ) {

            Dictionary<T, DiffAction> results = new Dictionary<T, DiffAction>();

            // Start by assuming every original item was removed.
            foreach (var item in original) {
                results[item] = DiffAction.Removed;
            }

            // Items also present in target become Same; unseen items are Added.
            foreach (var item in target) {
                if (results.ContainsKey(item)) {
                    results[item] = DiffAction.Same;
                } else {
                    results[item] = DiffAction.Added;
                }
            }

            return results.Select(
                pair => new DiffPair<T> {
                    Value = pair.Key,
                    DiffAction = pair.Value
                });
        }
    }
}