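I have managed to remove most of the duplicate values from my list, but duplicates that differ only in letter case remain, and I also want to remove empty string values from the list.
CategoriesList yields about 1,000 records; noDuplicateCategories
reduces that number to 20 by removing most of the duplicates: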
var CSVCategories = from line in File.ReadAllLines(path).Skip(1)
                    let columns = line.Split(',')
                    select new Category
                    {
                        Name = columns[9]
                    };
var CategoriesList = CSVCategories.ToList();
var noDuplicateCategories = CategoriesList.Distinct(new CategoryComparer()).ToList();
Here is my class that implements the IEqualityComparer interface:
class CategoryComparer : IEqualityComparer<Category>
{
    // Products are equal if their names and product numbers are equal.
    public bool Equals(Category x, Category y)
    {
        //Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;
        //Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        //Check whether the products' properties are equal.
        return string.Compare(x.Name, y.Name, true) == 0;
    }
    // If Equals() returns true for a pair of objects
    // then GetHashCode() must return the same value for these objects.
    public int GetHashCode(Category category)
    {
        //Check whether the object is null
        if (Object.ReferenceEquals(category, null)) return 0;
        //Get hash code for the Name field if it is not null.
        int hashCategoryName = category.Name == null ? 0 : category.Name.GetHashCode();
        //Get hash code for the Code field.
        int hashCategoryCode = category.Name.GetHashCode();
        //Calculate the hash code for the product.
        return hashCategoryName;
    }
}
What do I need to change here to remove the empty string values and ignore case?
Answer 0 (score: 3)
If all you need is unique names, why deal with Category
objects at all? You can prepare the names before converting them into categories:
var categories = File.ReadLines(path).Skip(1)
    // A plain Split keeps empty fields, so index 9 always refers to the same column.
    .Select(l => l.Split(','))
    .Where(parts => parts.Length >= 10)
    .Select(parts => parts[9].Trim())
    // Drop names that are empty once trimmed.
    .Where(name => name.Length > 0)
    .Distinct(StringComparer.InvariantCultureIgnoreCase)
    .Select(s => new Category { Name = s });
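Of course, if you are quite sure the data in the file is reliable (no empty lines, at least 10 parts on every line, and no stray whitespace in any part), you can simplify the query: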
var categories = File.ReadLines(path).Skip(1)
    .Select(l => l.Split(',')[9])
    .Distinct(StringComparer.InvariantCultureIgnoreCase)
    .Select(s => new Category { Name = s });
Note: use ReadLines instead of ReadAllLines to avoid loading the entire file contents into an in-memory array.
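If you would rather keep the Category objects and a custom comparer, the likely reason the case-only duplicates survive is that Distinct consults GetHashCode before Equals: the comparer in the question compares names case-insensitively but hashes them case-sensitively, so "Books" and "books" usually get different hash codes and are never compared. Below is a minimal sketch of one way to make the two methods agree. The names IgnoreCaseCategoryComparer and CategoryCleanup.Deduplicate are made up for illustration, and the choice of StringComparer.OrdinalIgnoreCase is an assumption; swap in a culture-aware comparer if your data needs it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Category
{
    public string Name { get; set; }
}

// Hypothetical comparer: Equals and GetHashCode both delegate to the same
// case-insensitive StringComparer, so equal names always hash alike.
class IgnoreCaseCategoryComparer : IEqualityComparer<Category>
{
    private static readonly StringComparer NameComparer = StringComparer.OrdinalIgnoreCase;

    public bool Equals(Category x, Category y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return NameComparer.Equals(x.Name, y.Name);
    }

    public int GetHashCode(Category category)
    {
        // A null category or a null name hashes to 0.
        if (category?.Name == null) return 0;
        return NameComparer.GetHashCode(category.Name);
    }
}

static class CategoryCleanup
{
    // Hypothetical helper: drops null, empty, and whitespace-only names,
    // then removes duplicates that differ only in case.
    public static List<Category> Deduplicate(IEnumerable<Category> categories)
    {
        return categories
            .Where(c => c != null && !string.IsNullOrWhiteSpace(c.Name))
            .Distinct(new IgnoreCaseCategoryComparer())
            .ToList();
    }
}

static class Demo
{
    static void Main()
    {
        var input = new[] { "Books", "books", "", "  ", "BOOKS", "Toys", null }
            .Select(n => new Category { Name = n });

        foreach (var category in CategoryCleanup.Deduplicate(input))
            Console.WriteLine(category.Name);
    }
}
```

With the sample input above, two categories should remain: one for the name "Books" (in whichever casing appears first) and one for "Toys". In the question's code, this would correspond to calling CategoryCleanup.Deduplicate(CategoriesList) in place of the Distinct call.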