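Can someone tell me how to extract the "model name" from the product names below? For example, I need to extract "SGS45A08GB" from "Bosch SGS45A08GB Silver Dishwasher". It looks like I need a regex that identifies the word in a given string that is made up of alphanumeric characters. Could someone give me a C# example that does this?
Some sample strings with model names: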
Bosch SGS45A08GB Silver Dishwasher
Bosch Avantixx SGS45A02GB Dishwasher, White
Bosch SMS53E12GB White Dishwasher
Bosch SGS45A08GB Dishwashers
BOSCH SGI45E15E Full-size Semi-Integrated Dishwasher
Bosch SKS60E02GB Compact Dishwasher, White
BOSCH SRV43M03GB Slimline Integrated Dishwasher
BOSCH Classixx SGS45C12GB Full-size Dishwasher - White
BOSCH SGS45A02GB Dishwashers
Bosch 18V Cordless Drill Driver
Bosch PSB 18V Li-Ion Hammer Drill
Bosch SGS45A08GB Dishwasher
Bosch SGS45A08 12Place Full Size Dishwasher in Silver
Edit: added more product names
Hitachi DH24DVC 4kg Cordless SDS Plus Hammer Drill 24V
DeWalt DW965K 12V Angled Drill Driver
Grove Modern Bathroom Suite with Acrylic Bath
Bosch GBH24V 3.2kg SDS Plus Drill 24V
Makita LS0714/1 190mm Sliding Compound Mitre Saw 110V
Grove Modern Bathroom Suite with Steel Bath
Swann All-in-One Monitoring & Recording Kit with LCD
Makita BHR202RFE LXT 3.2kg SDS+ Rotary Hammer Drill 18V
DeWalt DW625EK-GB 2000W Router 240V
Trade Triple-Extension Ladder ELT340
Makita 6391DWPE3 18V Drill Driver
Erbauer ERF298MSW 165mm Sliding Compound Mitre Saw 24V
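Answer 0: (score: 3)
If you define "alphanumeric" as a string made up of ASCII uppercase letters and digits, and assume a minimum length for model names (say, 8 characters), then you can match all the model names in your examples with: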
Regex regexObj = new Regex(
    @"\b                # word boundary
      (?=[A-Z]*[0-9])   # assert presence of at least one ASCII digit
      (?=[0-9]*[A-Z])   # assert presence of at least one uppercase ASCII letter
      [0-9A-Z]{8,}      # match at least 8 characters
      \b                # until a word boundary",
    RegexOptions.IgnorePatternWhitespace);
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
    // matched text: matchResults.Value
    // match start: matchResults.Index
    // match length: matchResults.Length
    matchResults = matchResults.NextMatch();
}
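I think uppercase ASCII letters and digits are a reasonable assumption for model names, but if that is not correct, you will need to show us more examples.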
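To try the pattern above against a few of the sample names from the question, here is a minimal sketch. The class name `StrictModelMatcher` and the helper `FindModel` are hypothetical names introduced only for this illustration; the regex itself is the one shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictModelMatcher
{
    // Same pattern as above: a token of 8+ uppercase ASCII letters/digits
    // that contains at least one digit and at least one letter.
    private static readonly Regex ModelRegex = new Regex(
        @"\b
          (?=[A-Z]*[0-9])   # at least one digit
          (?=[0-9]*[A-Z])   # at least one uppercase letter
          [0-9A-Z]{8,}      # 8 or more characters
          \b",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns the first model-like token,
    // or null when the product name contains none.
    public static string FindModel(string productName)
    {
        Match m = ModelRegex.Match(productName);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string[] samples = {
            "Bosch SGS45A08GB Silver Dishwasher",
            "Bosch Avantixx SGS45A02GB Dishwasher, White",
            "Bosch 18V Cordless Drill Driver",
            "Bosch SGS45A08 12Place Full Size Dishwasher in Silver"
        };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> {1}", s, FindModel(s) ?? "(no model)");
        }
    }
}
```

Names such as "Bosch 18V Cordless Drill Driver" yield no match here, because "18V" is shorter than the assumed 8-character minimum, so the caller has to handle the null case.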
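Edit: With your new examples, the following regex works, but the constraints keep getting looser, and you may never find a regex that reliably matches every possible model name.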
Regex regexObj = new Regex(
    @"\b                # word boundary
      (?=\S*[0-9])      # assert presence of at least one ASCII digit
      (?=\S*[A-Z])      # assert presence of at least one uppercase ASCII letter
      [0-9A-Z/-]{6,}    # match at least 6 characters (now also allowing / and -)
      \b                # until a word boundary",
    RegexOptions.IgnorePatternWhitespace);
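As a rough check of the relaxed pattern, the sketch below runs it over some of the newer product names, including ones that contain "/" or "-" and ones with no model at all. `RelaxedModelMatcher` and `FindModel` are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RelaxedModelMatcher
{
    // Relaxed pattern from above: 6+ characters drawn from uppercase
    // letters, digits, '/' and '-', where the surrounding token contains
    // at least one digit and at least one uppercase letter.
    private static readonly Regex ModelRegex = new Regex(
        @"\b
          (?=\S*[0-9])      # at least one digit
          (?=\S*[A-Z])      # at least one uppercase letter
          [0-9A-Z/-]{6,}    # 6 or more allowed characters
          \b",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: first model-like token, or null if there is none.
    public static string FindModel(string productName)
    {
        Match m = ModelRegex.Match(productName);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string[] samples = {
            "Makita LS0714/1 190mm Sliding Compound Mitre Saw 110V",
            "DeWalt DW625EK-GB 2000W Router 240V",
            "Trade Triple-Extension Ladder ELT340",
            "Bosch GBH24V 3.2kg SDS Plus Drill 24V",
            "Grove Modern Bathroom Suite with Acrylic Bath"
        };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> {1}", s, FindModel(s) ?? "(no model)");
        }
    }
}
```

Lowering the minimum length further would start to pick up tokens such as "2000W" or "190mm" style specifications once they contain an uppercase letter, which is one reason the looser constraints become less reliable.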
Answer 1: (score: 0)
Dude, this is the best I can do. Note that some of the items don't have a model number at all:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication3 {
    class Program {
        static void Main(string[] args) {
            // Sample product names, one per line.
            string _data = @"Bosch SGS45A08GB Silver Dishwasher
Bosch Avantixx SGS45A02GB Dishwasher, White
Bosch SMS53E12GB White Dishwasher
Bosch SGS45A08GB Dishwashers
BOSCH SGI45E15E Full-size Semi-Integrated Dishwasher
Bosch SKS60E02GB Compact Dishwasher, White
BOSCH SRV43M03GB Slimline Integrated Dishwasher
BOSCH Classixx SGS45C12GB Full-size Dishwasher - White
BOSCH SGS45A02GB Dishwashers
Bosch 18V Cordless Drill Driver
Bosch PSB 18V Li-Ion Hammer Drill
Bosch SGS45A08GB Dishwasher
Bosch SGS45A08 12Place Full Size Dishwasher in Silver";
            // Three uppercase letters, then digits, then more word characters,
            // followed by whitespace.
            Regex _expression = new Regex(@"\p{Lu}{3}\d+\w+\s+");
            foreach (Match _match in _expression.Matches(_data)) {
                // The pattern ends in \s+, so trim the trailing whitespace.
                Console.WriteLine(_match.Value.Trim());
            }
            Console.ReadKey();
        }
    }
}