I have a list of strings, for example:
{ abc001, abc002, abc003, cdef001, cdef002, cdef004, ghi002, ghi001 }
I want to get all the distinct common prefixes; for example, for the list above:
{ abc, cdef, ghi }
How can I do this?
Answer 0 (score: 2)
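using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
var list = new List<String> {
"abc001", "abc002", "abc003", "cdef001",
"cdef002", "cdef004", "ghi002", "ghi001"
};
// Take the leading run of non-digit characters from each string, then drop duplicates
var prefixes = list.Select(x => Regex.Match(x, @"^[^\d]+").Value).Distinct();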
Answer 1 (score: 0)
You can use a regular expression to select the text part, then add that part to a HashSet<string> so that duplicates are not added:
using System.Collections.Generic;
using System.Text.RegularExpressions;
// Simulate your real list
List<string> myList = new List<string> { "abc001", "abc002", "cdef001" };
// \D* captures any run of non-digit characters; \d+ requires at least one digit after it.
// Note: to also capture strings such as "abc" that have no trailing digits,
// use the pattern @"^(\D*)\d*$" instead.
string pattern = @"^(\D*)\d+$";
Regex regex = new Regex(pattern);
HashSet<string> matchesStrings = new HashSet<string>();
foreach (string item in myList)
{
    var match = regex.Match(item);
    // Groups.Count is fixed by the pattern, so test Success to skip non-matching items
    if (match.Success)
    {
        matchesStrings.Add(match.Groups[1].Value);
    }
}
Result:
abc, cdef
Answer 2 (score: 0)
It may be a good idea to write a helper class to represent the data. For example:
public class PrefixedNumber
{
private static Regex parser = new Regex(@"^(\p{L}+)(\d+)$");
public PrefixedNumber(string source) // you may want a static Parse method.
{
Match parsed = parser.Match(source); // think about an error here when it doesn't match
Prefix = parsed.Groups[1].Value;
Index = parsed.Groups[2].Value;
}
public string Prefix { get; set; }
public string Index { get; set; }
}
You will want to come up with a better name, and of course better access modifiers.
Now the task is quite simple:
List<string> data = new List<string> { "abc001", "abc002", "abc003", "cdef001",
"cdef002", "cdef004", "ghi002", "ghi001" };
var groups = data.Select(str => new PrefixedNumber(str))
.GroupBy(prefixed => prefixed.Prefix);
The result is all of the data, parsed and grouped by prefix.
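To get from the grouped result back to the list of prefixes the question asks for, reading the group keys should be enough. The following is only a sketch: to stay self-contained it uses a hypothetical local function `Parse` that returns a tuple in place of the `PrefixedNumber` constructor, and it throws on input that does not match, which is one possible answer to the error-handling question left open in the class above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var parser = new Regex(@"^(\p{L}+)(\d+)$");

// Hypothetical stand-in for the PrefixedNumber constructor: returns (Prefix, Index).
(string Prefix, string Index) Parse(string source)
{
    Match m = parser.Match(source);
    if (!m.Success)
        throw new FormatException($"Unexpected format: '{source}'");
    return (m.Groups[1].Value, m.Groups[2].Value);
}

var data = new List<string> { "abc001", "abc002", "abc003", "cdef001",
                              "cdef002", "cdef004", "ghi002", "ghi001" };

var groups = data.Select(s => Parse(s))
                 .GroupBy(p => p.Prefix)
                 .ToList();

// The distinct prefixes are the group keys, in order of first appearance.
List<string> prefixes = groups.Select(g => g.Key).ToList();

// Each group still carries the numeric parts that belong to its prefix.
foreach (var g in groups)
    Console.WriteLine($"{g.Key}: {string.Join(", ", g.Select(p => p.Index))}");
```

If only the prefixes are needed, the grouping is more than the task requires; it pays off when the numeric parts are used later as well.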
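Answer 3 (score: 0)
Assuming your prefix consists entirely of letters and is terminated by the first non-letter character, you can use the following LINQ expression: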
List<string> listOfStrings = new List<String>()
{ "abc001d", "abc002", "abc003", "cdef001", "cdef002", "cdef004", "ghi002", "ghi001" };
var prefixes = (from s in listOfStrings
select new string(s.TakeWhile(c => char.IsLetter(c)).ToArray())).Distinct();
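For readers who prefer method syntax, the same idea can be sketched as below. The shortened sample list and the use of `string.Concat` to rebuild the prefix are choices made for this illustration, not part of the answer above. Because the prefix ends at the first non-letter character, an entry such as "abc001d" with trailing letters still yields "abc".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listOfStrings = new List<string> { "abc001d", "abc002", "cdef001", "ghi002" };

// Keep the leading letters of each string, then remove duplicate prefixes.
List<string> prefixes = listOfStrings
    .Select(s => string.Concat(s.TakeWhile(c => char.IsLetter(c))))
    .Distinct()
    .ToList();

Console.WriteLine(string.Join(", ", prefixes)); // abc, cdef, ghi
```

Note that `char.IsLetter` accepts any Unicode letter, not just a-z, so this behaves like the `\p{L}` class used in the previous answer.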