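Hello, I am working with a List<T> that contains strings.
The entries look like the following (I have simplified the wording, but the scheme is the same):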
01:01 A car consists of : wheels, engine, seats, 2 screws, a cotton lamp
01:02 A bike consists of : wheels
01:03 A car consists of : wheels, engine, seats, speakers, 5 screws, an indicator light
01:04 A small truck consists of : wheels, engine, seats, bed
So the pseudo matcher, and the pieces I want out of each line, would be:
00-99:0-99(space)A|An(space){get the car/bike or any other as object}(space)consists(space)of(space):{get the elements in here exploding the commas as attributes}
Right now I use a foreach loop that goes through my list and writes each line to a text box.
foreach (Message _msg in _objects.Messages) {
    richTextBox1.AppendText(_msg.Text);
}
The pseudo display, instead of adding the whole sentence to my text box, would be:
foreach (Message _msg in _objects.Messages) {
    richTextBox1.AppendText(parsefunction(_msg.Text));
}
parse function
{
    count(the exploded elements, and list them)
    remove the unwanted parts of text
}
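After extracting the objects and attributes, I want to sum them up, taking into account any count they carry, and strip the a/an from them. This is the part where I am stuck.
The desired output, with duplicates merged and the stated quantities added up, is: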
2x Car
4x Wheels
3x Engine
3x Seats
7x Screws
1x Cotton Lamp
1x Bike
1x Speakers
1x Indicator Light
1x Small Truck
1x Bed
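Could you at least point me to the Regex? Maybe I can work out the rest myself and share it when I am done. I think it has to be a function that is called inside the loop.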
Answer 0: (score: 1)
Here is what I came up with (I am sure it can be improved):
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static List<KeyValuePair<string, string[]>> ParseData(List<string> data)
{
    // Group 1 captures the object name, group 2 the comma-separated attribute list.
    Regex regex = new Regex(@"^[\d]{2}:[\d]{2} A[n]? ([a-zA-Z\s]+) consists of : ([a-zA-Z,\s0-9]+)$");
    var elementMap = new List<KeyValuePair<string, string[]>>();
    for (int i = 0; i < data.Count; i++)
    {
        var match = regex.Match(data[i]);
        // Lines that do not match the pattern are skipped.
        if (!match.Success)
            continue;
        var attributes = match.Groups[2].Value.Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries);
        elementMap.Add(new KeyValuePair<string, string[]>(match.Groups[1].Value, attributes));
    }
    return elementMap;
}

public static Dictionary<string, int> GetIndexedData(List<KeyValuePair<string, string[]>> data)
{
    Dictionary<string, int> displayObjects = new Dictionary<string, int>();
    foreach (KeyValuePair<string, string[]> item in data)
    {
        // Each line counts its object once.
        if (displayObjects.ContainsKey(item.Key))
            displayObjects[item.Key]++;
        else
            displayObjects.Add(item.Key, 1);

        foreach (string key2 in item.Value)
        {
            // Split into at most two parts so a multi-word name after a quantity stays whole.
            string[] attributeValues = key2.Split(new[] { ' ' }, 2);
            int add = 1;
            string addValue = key2;
            int c;
            // A leading integer is a quantity, e.g. "2 screws"; otherwise the attribute counts once.
            if (attributeValues.Length > 1 && int.TryParse(attributeValues[0], out c))
            {
                add = c;
                addValue = attributeValues[1];
            }
            // StartsWith is safe for short strings, whereas Substring(0, n) throws when the string is shorter than n.
            if (addValue.StartsWith("a ", StringComparison.Ordinal))
                addValue = addValue.Substring(2);
            else if (addValue.StartsWith("an ", StringComparison.Ordinal))
                addValue = addValue.Substring(3);

            if (displayObjects.ContainsKey(addValue))
                displayObjects[addValue] += add;
            else
                displayObjects.Add(addValue, add);
        }
    }
    return displayObjects;
}
Usage:
List<string> data = new List<string>();
data.Add("01:01 A car consists of : wheels, engine, seats, 2 screws, a cotton lamp");
data.Add("01:02 A bike consists of : wheels");
data.Add("01:03 A car consists of : wheels, engine, seats, speakers, 5 screws, an indicator light");
data.Add("01:04 A small truck consists of : wheels, engine, seats, bed");
var elementMap = ParseData(data);
var displayObjects = GetIndexedData(elementMap);
foreach (string key in displayObjects.Keys)
{
    Console.WriteLine(key + ": " + displayObjects[key]);
}
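The loop above prints each entry as key: count, while the question asks for lines such as 2x Car. A minimal sketch of that formatting step follows. CountFormatter and FormatCounts are hypothetical names that are not part of the answer's code, and the title-casing via TextInfo.ToTitleCase is an assumption about how the capitalised names in the desired output should be produced.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class CountFormatter
{
    // Hypothetical helper (not part of the answer's code): renders each entry as "<count>x <Name>".
    public static List<string> FormatCounts(Dictionary<string, int> counts)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        var lines = new List<string>();
        foreach (KeyValuePair<string, int> entry in counts)
        {
            // ToTitleCase upper-cases the first letter of each word, e.g. "cotton lamp" becomes "Cotton Lamp".
            lines.Add(entry.Value + "x " + textInfo.ToTitleCase(entry.Key));
        }
        return lines;
    }
}
```

Passing the dictionary returned by GetIndexedData to FormatCounts and appending each resulting line to the text box should give the requested layout. Note that Dictionary does not guarantee any enumeration order, so sort the lines if the order matters.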
Basically, this Regex pattern (^[\d]{2}:[\d]{2} A[n]? ([a-zA-Z\s]+) consists of : ([a-zA-Z,\s0-9]+)$) will match any line built exactly the way you described. All you have to do is:
var match = regex.Match(data[i]);
// 'match.Groups[1].Value' is the name of the item
// 'match.Groups[2].Value' is the comma-separated list
// The following line will split all the attributes on ', ' therefore leaving them as just the words. (`wheels`, `engine`, `seats`)
var attributes = match.Groups[2].Value.Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries);
Do whatever you want with all of that information.
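This makes the following assumptions:
- The line starts with two digits ([\d]{2}), a colon (:), two more digits ([\d]{2}), a space ( ), an a (A) with an optional n ([n]?) (for A or An), and another space ( ); all of this sits at the very beginning of the line (^).
- The name of the object (([a-zA-Z\s]+)) can contain:
  - letters (a-z, A-Z)
  - whitespace (\s)
- The name is followed by a space ( ), consists of, a space ( ), a colon (:), and one more space.
- The words of the attributes (([a-zA-Z,\s0-9]+)) can contain:
  - letters (a-z, A-Z)
  - commas (,)
  - whitespace (\s)
  - digits (0-9)
- The attributes run to the end of the line ($).
Finally, it assumes that attributes is not null or nothing, that is, there is at least one character in attributes.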
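To make these assumptions concrete, here is a small sketch that checks individual lines against the same pattern. PatternCheck and Matches are hypothetical names introduced only for illustration.

```csharp
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // Same pattern as in the answer; no RegexOptions are set, so matching is case-sensitive.
    private static readonly Regex LinePattern = new Regex(
        @"^[\d]{2}:[\d]{2} A[n]? ([a-zA-Z\s]+) consists of : ([a-zA-Z,\s0-9]+)$");

    // Returns true only when the whole line fits the assumed layout.
    public static bool Matches(string line)
    {
        return LinePattern.IsMatch(line);
    }
}
```

With this, "01:03 An indicator consists of : bulb" matches, while "01:01 a car consists of : wheels" (lowercase article), "1:01 A car consists of : wheels" (single-digit hour) and "01:01 A pick-up consists of : wheels" (hyphen in the name) do not.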
Also, there is no error checking here. You should add it as needed.
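One possible shape for that error checking is a Try-style method that reports failure instead of silently dropping a line. This is only a sketch under assumptions: SafeParser and TryParseLine are hypothetical names, and what to do with a rejected line (log it, show it to the user, abort) is left to the caller.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeParser
{
    // Same pattern as in the answer.
    private static readonly Regex LinePattern = new Regex(
        @"^[\d]{2}:[\d]{2} A[n]? ([a-zA-Z\s]+) consists of : ([a-zA-Z,\s0-9]+)$");

    // Hypothetical helper: returns false for null, empty, or malformed lines.
    public static bool TryParseLine(string line, out string name, out string[] attributes)
    {
        name = string.Empty;
        attributes = new string[0];

        if (string.IsNullOrWhiteSpace(line))
            return false;

        Match match = LinePattern.Match(line);
        if (!match.Success)
            return false;

        name = match.Groups[1].Value.Trim();
        attributes = match.Groups[2].Value.Split(
            new[] { ", " }, StringSplitOptions.RemoveEmptyEntries);

        // A line without any attribute is treated as malformed.
        return attributes.Length > 0;
    }
}
```

A caller can then loop over the list, call TryParseLine for each entry, and collect the lines for which it returns false so they can be reported rather than lost.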