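Given product names like the one below, my task is to extract all colors and sizes.
Example: Nike Relay Women's Running Capris - **Black**, **L/XS**
Color = Black
Size = [XS,L]
What is the best way to do this? My idea is to keep a dictionary of all colors and sizes and then simply match against it.
But there must be a better, more maintainable way. The biggest problem I see is that there are many different combinations.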
Answer 0 (score: 3)
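This is lengthy, but it serves the purpose. The whole idea is that you need a List / Collection of the available colors and sizes, and then you iterate over them, checking each one in turn.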
enum ColorBase
{
    [Description("Black")] // requires using System.ComponentModel;
    Black,
    [Description("Blue")]
    Blue,
    [Description("White")]
    White,
    [Description("Grey")]
    Grey,
    [Description("Magenta")]
    Magenta,
    [Description("Pale")]
    Pale,
    [Description("Maritime Navy")]
    MaritimeNavy,
    [Description("Navy")]
    Navy,
    [Description("Bluestone")]
    Bluestone,
}
enum SizeBase
{
    [Description("XL")]
    XL,
    [Description("XXL")]
    XXL,
    [Description("L")]
    L,
    [Description("M")]
    M,
    [Description("S")]
    S,
    [Description("XS")]
    XS,
    [Description("30X30")]
    S30X30,
    [Description("36X30")]
    S36X30,
    [Description("33X32")]
    S33X32
}
A helper method that uses System.Reflection to return the Description declared on each enum value above:
public static string GetEnumDescription(Enum value)
{
    // Look up the enum member by name and read its [Description] attribute.
    FieldInfo fi = value.GetType().GetField(value.ToString());
    DescriptionAttribute[] attributes =
        (DescriptionAttribute[])fi.GetCustomAttributes(
            typeof(DescriptionAttribute),
            false);
    if (attributes != null && attributes.Length > 0)
        return attributes[0].Description;
    else
        return value.ToString(); // fall back to the member name
}
Here is how they are used:
// requires using System; using System.Collections.Generic; using System.Linq;
static void Main(string[] args)
{
    List<string> capries = new List<string>
    {
        "Nautica S Blue Bone Woven Pajama Pants",
        "Nike Relay Women's Running Capris - Black, XS",
        "Nautica Mens J-Class Pajama Pants-Small, NAVY",
        "Nautica J-Class Woven Pajama Pant L, Maritime Navy",
        "Nike Legend Tank - Womens - Black/Black",
        "Nike 3PK DF Cushion No Show Tab Socks - Womens - Black/White/Black",
        "Stance Casual Socks - Men's Mahalo, L/XL",
        "Nautica Wrinkle Resistant Dress Pant 30x30, Grey",
        "Nautica Wrinkle Resistant Dress Pant 36x30, Black",
        "Nautica Wrinkle Resistant Dress Pant 33x32, Black",
        "RVCA VA Flipped Box Slim T-Shirt - Short-Sleeve - Men's Bluestone, L",
        "RVCA VA Flipped Box Slim T-Shirt - Short-Sleeve - Men's Bluestone, M",
        "RVCA VA Flipped Box Slim T-Shirt - Short-Sleeve - Men's Bluestone, S",
    };

    Console.WriteLine("Available Parameters");
    foreach (var caprie in capries)
    {
        // Split into words for WORD level precision, so "S" does not match inside "Socks"
        string[] words = caprie.Split(new[] { ' ', ',', '/', '-' }, StringSplitOptions.RemoveEmptyEntries);
        List<string> colors = new List<string>();
        List<string> sizes = new List<string>();

        foreach (ColorBase colorBase in Enum.GetValues(typeof(ColorBase)))
        {
            string item = Program.GetEnumDescription(colorBase);
            // Case-insensitive substring check, so multi-word colors such as "Maritime Navy" are found;
            // note that "Blue" therefore also matches inside "Bluestone"
            if (caprie.IndexOf(item, StringComparison.OrdinalIgnoreCase) >= 0)
                colors.Add(item);
        }
        foreach (SizeBase sizeBase in Enum.GetValues(typeof(SizeBase)))
        {
            string item = Program.GetEnumDescription(sizeBase);
            // Case-insensitive whole-word check, so "30x30" matches the description "30X30"
            if (words.Contains(item, StringComparer.OrdinalIgnoreCase))
                sizes.Add(item);
        }

        // One line per product: its colors and sizes, each joined with '/'
        Console.WriteLine("Color : {0} ,Size : {1}", string.Join("/", colors), string.Join("/", sizes));
    }
}
Answer 1 (score: 2)
I would build the regular expressions in layers. I have built a system like this that works well, although it was for log parsing.
// basic definitions:
String colorsRegex = "(?:black|red|blue|orange|navy|cyan|white)";
String sizesRegex = "(?:small|large|medium)";
String sizesShortRegex = "(?:s|m|l|xl|xxl|xxxl)";
// separator between values; '-' goes last so that it is not read as a range
String separator = "[/ -]+";
// some more complex definitions
// always start the array with the most complex regex, so that as much is captured as possible ("blue-green" instead of just "blue")
String[] colorFinders = {colorsRegex + "(?:" + separator + colorsRegex + ")+", colorsRegex};
String[] sizesFinders = {sizesRegex + "(?:" + separator + sizesRegex + ")+", sizesShortRegex + "(?:" + separator + sizesShortRegex + ")+", sizesRegex};
// match the string for each complex definition
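The matching step that the last comment leaves open might look roughly like the following. This is only a sketch, assuming C# as in the first answer; the `LayeredMatcher` class, its method names, and the lookarounds that keep a size from matching inside a longer word (or after an apostrophe, as in "Women's") are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LayeredMatcher
{
    // Wraps a list of alternatives so that it only matches as a whole word.
    // The apostrophe is excluded too, so the "s" in "Women's" is not a size.
    static string Word(string alternatives) =>
        @"(?<![\w'])(?:" + alternatives + @")(?![\w'])";

    // Hypothetical vocabularies; extend them to cover the real catalogue.
    static readonly string Color = Word("black|red|blue|orange|navy|cyan|white");
    static readonly string SizeShort = Word("xxxl|xxl|xl|xs|s|m|l");
    static readonly string Sep = @"\s*[/-]\s*";

    // Most specific pattern first: a separated run of values, then a single value.
    static readonly string[] ColorFinders = { Color + "(?:" + Sep + Color + ")+", Color };
    static readonly string[] SizeFinders = { SizeShort + "(?:" + Sep + SizeShort + ")+", SizeShort };

    // Returns the text matched by the first finder that succeeds, or null.
    public static string FirstMatch(string input, string[] finders)
    {
        foreach (string pattern in finders)
        {
            Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
            if (m.Success)
                return m.Value;
        }
        return null;
    }

    public static string FindColors(string name) => FirstMatch(name, ColorFinders);
    public static string FindSizes(string name) => FirstMatch(name, SizeFinders);
}
```

Calling `LayeredMatcher.FindSizes` on "Stance Casual Socks - Men's Mahalo, L/XL" should take the run finder and yield the whole "L/XL" group, which can then be split on the separator to get the individual sizes.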
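For every line that this system fails to match (or matches incorrectly), build a dedicated "finder". Repeat until all the data is matched.
Watch out for invalid cross-matches. Log unmatched lines in both test and production environments. Remember to look out for partial matches, and exclude any part of the string that could confuse your algorithm (imagine a company called "Blue Moon", which would always match).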
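One way to act on both pieces of advice, sketched in C# under stated assumptions: `MatchAudit`, `IgnoredPhrases` and the "Blue Moon" entry are hypothetical names chosen for illustration, and the unmatched names are collected in a plain list where a real system would probably write to its logger.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchAudit
{
    // Hypothetical brand names that contain color words and must not be matched.
    static readonly string[] IgnoredPhrases = { "Blue Moon" };

    static readonly Regex Color = new Regex(
        @"\b(?:black|red|blue|orange|navy|cyan|white)\b", RegexOptions.IgnoreCase);

    // Blanks out the ignored phrases so they cannot produce false matches.
    public static string Sanitize(string name)
    {
        foreach (string phrase in IgnoredPhrases)
            name = Regex.Replace(name, Regex.Escape(phrase), " ", RegexOptions.IgnoreCase);
        return name;
    }

    // Returns the first matched color, or null after recording the name for review.
    public static string FindColor(string name, List<string> unmatched)
    {
        Match m = Color.Match(Sanitize(name));
        if (m.Success)
            return m.Value;
        unmatched.Add(name);
        return null;
    }
}
```

Reviewing the collected `unmatched` list after each run shows which product names still need a dedicated finder.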