So I created this regular expression to parse strings like the following (I need the values of Bytes and Time):
1463735418 Bytes: 0 Time: 4.297
Here is the code below (using this):
string writePath = @"C:\final.txt";
string[] lines = File.ReadAllLines(@"C:\union.dat");
foreach (string txt in lines)
{
    string re1 = ".*?"; // Non-greedy match on filler
    string re2 = "\\d+"; // Uninteresting: int
    string re3 = ".*?"; // Non-greedy match on filler
    string re4 = "(\\d+)"; // Integer Number 1
    string re5 = ".*?"; // Non-greedy match on filler
    string re6 = "([+-]?\\d*\\.\\d+)(?![-+0-9\\.])"; // Float 1
    Regex r = new Regex(re1 + re2 + re3 + re4 + re5 + re6, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    Match m = r.Match(txt);
    if (m.Success)
    {
        String int1 = m.Groups[1].ToString();
        String float1 = m.Groups[2].ToString();
        Debug.Write("(" + int1.ToString() + ")" + "(" + float1.ToString() + ")" + "\n");
        File.AppendAllText(writePath, int1.ToString() + ", " + float1.ToString() + Environment.NewLine);
    }
}
This works well when the string is on a single line, but not when I try to read a file like this:
1463735418
Bytes: 0
Time: 4.297
1463735424
Time: 2.205
1466413696
Time: 2.225
1466413699
1466413702
1466413705
1466413708
1466413711
1466413714
1466413717
1466413720
Bytes: 7037
Time: 59.320
... (arbitrary repetition)
I get garbage data.
Expected Output:
0, 4.297
7037, 59.320
(Match only where a Bytes/Time pair exists.)
Edit: I am trying something like this, but I still don't get the desired result.
foreach (string txt in lines)
{
    if (txt.StartsWith("Byte"))
    {
        string re1 = ".*?"; // Non-greedy match on filler
        string re2 = "(\\d+)"; // Integer Number 1
        Regex r = new Regex(re1 + re2, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        Match m = r.Match(txt);
        if (m.Success)
        {
            String int1 = m.Groups[1].ToString();
            //Console.Write("(" + int1.ToString() + ")" + "\n");
            httpTable += int1.ToString() + ",";
        }
    }
    if (txt.StartsWith("Time"))
    {
        string re3 = ".*?"; // Non-greedy match on filler
        string re4 = "([+-]?\\d*\\.\\d+)(?![-+0-9\\.])"; // Float 1
        Regex r1 = new Regex(re3 + re4, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        Match m1 = r1.Match(txt);
        if (m1.Success)
        {
            String float1 = m1.Groups[1].ToString();
            //Console.Write("(" + float1.ToString() + ")" + "\n");
            httpTable += float1.ToString() + Environment.NewLine;
        }
    }
}
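How can I fix this? Thanks.
Answer 0 (score: 2)
I suggest using lookbehinds to qualify Time and Bytes and, if neither is found, falling back to the integer category. Then use regex named captures to determine what each match is.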
string data = "1463735418 Bytes: 0 Time: 4.297 1463735424 Time: 2.205 1466413696 Time: 2.225 1466413699 1466413702 1466413705 1466413708 1466413711 1466413714 1466413717 1466413720 Bytes: 7037 Time: 59.320";
string pattern = @"
(?<=Bytes:\s)(?<Bytes>\d+)   # Lookbehind for the bytes
|                            # Or
(?<=Time:\s)(?<Time>[\d.]+)  # Lookbehind for time
|                            # Or
(?<Integer>\d+)              # most likely it's just an integer.
";
// Requires System.Linq and System.Text.RegularExpressions.
var matches = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
    .OfType<Match>()
    .Select(mt => new
    {
        IsInteger = mt.Groups["Integer"].Success,
        IsTime = mt.Groups["Time"].Success,
        IsByte = mt.Groups["Bytes"].Success,
        strMatch = mt.Groups[0].Value,
        AsInt = mt.Groups["Integer"].Success ? int.Parse(mt.Groups["Integer"].Value) : -1,
        AsByte = mt.Groups["Bytes"].Success ? int.Parse(mt.Groups["Bytes"].Value) : -1,
        AsTime = mt.Groups["Time"].Success ? double.Parse(mt.Groups["Time"].Value) : -1.0,
    });
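The result is an IEnumerable with one element per match. Each element is a dynamic entity with three Is* flags and the corresponding As* converted values, where a conversion applies.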
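The matches classified this way still have to be paired up to get the expected output. The following is a minimal sketch of one way to do that, not part of the original answer: the class name BytesTimePairing and the method Pair are hypothetical, and it assumes that a Bytes value only counts when the very next token is a Time value (a plain integer in between discards it).

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class BytesTimePairing
{
    // The same three alternatives as the pattern above, written on one line.
    private static readonly Regex Token = new Regex(
        @"(?<=Bytes:\s)(?<Bytes>\d+)|(?<=Time:\s)(?<Time>[\d.]+)|(?<Integer>\d+)");

    // Returns one "bytes, time" string for every Bytes value that is
    // directly followed by a Time value.
    public static List<string> Pair(string data)
    {
        var result = new List<string>();
        string pendingBytes = null; // last Bytes value not yet paired

        foreach (Match m in Token.Matches(data))
        {
            if (m.Groups["Bytes"].Success)
            {
                pendingBytes = m.Groups["Bytes"].Value;
            }
            else if (m.Groups["Time"].Success)
            {
                // Emit only when a Bytes value is waiting (assumption).
                if (pendingBytes != null)
                {
                    result.Add(pendingBytes + ", " + m.Groups["Time"].Value);
                }
                pendingBytes = null;
            }
            else
            {
                // A plain integer (timestamp) starts a new record.
                pendingBytes = null;
            }
        }
        return result;
    }
}
```

Calling BytesTimePairing.Pair(data) on the sample string should give the two entries 0, 4.297 and 7037, 59.320. The values are kept as strings on purpose, so 59.320 keeps its trailing zero; since \s also matches line breaks between tokens, the same call should work on the multi-line file contents read with File.ReadAllText.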
Answer 1 (score: 0)
Since you only need the values of Time: ... and Bytes: ..., use the exact strings rather than fillers:
Bytes
Bytes: (\d+)
Time
Time: ([-+]?\d*\.\d+)
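As a rough sketch of how exact-string patterns like these could be applied line by line to the file from the question: the class name LinePairing and the method Extract are hypothetical, the sign in the Time pattern is treated as optional (an assumption, since the sample values carry no sign), and a Bytes line is only paired with a Time line that immediately follows it.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class LinePairing
{
    private static readonly Regex BytesLine = new Regex(@"^Bytes: (\d+)");
    // Assumption: the sign is optional.
    private static readonly Regex TimeLine = new Regex(@"^Time: ([-+]?\d*\.\d+)");

    // Returns "bytes, time" for each Bytes line immediately followed by a Time line.
    public static List<string> Extract(IEnumerable<string> lines)
    {
        var result = new List<string>();
        string pendingBytes = null; // Bytes value waiting for its Time line

        foreach (string line in lines)
        {
            Match b = BytesLine.Match(line);
            if (b.Success)
            {
                pendingBytes = b.Groups[1].Value;
                continue;
            }

            Match t = TimeLine.Match(line);
            if (t.Success && pendingBytes != null)
            {
                result.Add(pendingBytes + ", " + t.Groups[1].Value);
            }
            // Any line other than Bytes (a Time or a timestamp) ends the pending record.
            pendingBytes = null;
        }
        return result;
    }
}
```

With the question's setup this could be called as LinePairing.Extract(File.ReadAllLines(@"C:\union.dat")), and the returned strings written out with File.WriteAllLines(writePath, ...). Time lines without a preceding Bytes line are skipped, which matches the expected output in the question.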