I have the following list, read from a CSV file:
one,bob,black
two,steve,smith
three,bill,brown
one,jill,brown
one,sue,smith
Each line is a string in the list.
I want to remove duplicates based on the first value of each line, resulting in:
one,bob,black
two,steve,smith
three,bill,brown
I think the code would look something like:
distinctlist = Select listOfRecords.split(',')[0].distinct
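This is obviously wrong, but I want to avoid looping through the list and doing it by hand. I figure LINQ would be simpler.
All the posts I have found here either look overly complicated or do not address the specifics of my problem. Any help would be appreciated.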
Answer 0 (score: 1)
Simply use GroupBy:
var distinctByFirstColumn = listOfRecords
.GroupBy(x => x.Split(',')[0])
.Select(x => x.First());
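To make the GroupBy approach easier to try out, here is a minimal self-contained sketch. The class name DistinctDemo and the method name DistinctByFirstColumn are hypothetical, chosen only for illustration; the query itself is the one shown above with a ToList() added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Hypothetical helper: keeps the first line seen for each distinct first column.
    public static List<string> DistinctByFirstColumn(IEnumerable<string> lines)
    {
        return lines
            .GroupBy(line => line.Split(',')[0])   // key = text before the first comma
            .Select(group => group.First())        // first line of each group
            .ToList();
    }
}
```

Called with the five sample lines from the question, this should return the one,bob,black, two,steve,smith and three,bill,brown lines in that order, because GroupBy in LINQ to Objects yields groups in the order their keys first appear.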
Answer 1 (score: 1)
I would rather use a HashSet<String> and a simple foreach loop here instead of LINQ (which, IMHO, is overkill):
var distinctList = new List<String>();
HashSet<String> taken = new HashSet<String>();

foreach (var line in listOfRecords)
    // no need to split the whole line; only the first item is required
    // (assumes every line contains at least one comma)
    if (taken.Add(line.Substring(0, line.IndexOf(','))))
        distinctList.Add(line);
Edit: if it is a real CSV file:
// requires System.Collections.Generic, System.IO and System.Linq
private static IEnumerable<String> CsvDistinctLines(String fileName) {
    HashSet<String> taken = new HashSet<String>();

    // File.ReadLines reads the file lazily, one line at a time
    foreach (var line in File.ReadLines(fileName))
        if (taken.Add(line.Substring(0, line.IndexOf(','))))
            yield return line;
}

...

var distinctList = CsvDistinctLines(@"C:\MyFile.csv").ToList();
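One caveat with taking the substring up to IndexOf(','): IndexOf returns -1 for a line without a comma (a blank line, for example), and Substring then throws. The sketch below is one possible way to guard against that, assuming such a line should be keyed by its full text. The names CsvDistinct, FirstField and DistinctLines are hypothetical, and quoted CSV fields that themselves contain commas are not handled.

```csharp
using System;
using System.Collections.Generic;

public static class CsvDistinct
{
    // Hypothetical helper: text before the first comma,
    // or the whole line when it contains no comma (assumed behavior).
    public static string FirstField(string line)
    {
        int comma = line.IndexOf(',');
        return comma < 0 ? line : line.Substring(0, comma);
    }

    // Lazily yields the first line seen for each distinct first field.
    public static IEnumerable<string> DistinctLines(IEnumerable<string> lines)
    {
        var taken = new HashSet<string>();
        foreach (var line in lines)
            if (taken.Add(FirstField(line)))   // Add returns false for a key already seen
                yield return line;
    }
}
```

Passing File.ReadLines(fileName) as the argument would give the same streaming behavior as CsvDistinctLines above.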