Question

I am trying to use a regex in Python to extract the CPU model and the CPU frequency from the CPU information below.

Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
Genuine Intel(R) CPU T2400 @ 1.83GHz

This is what I have so far, but I am still struggling to filter out the second line.

(?(?=.*\sCPU\s@)([a-zA-Z]\d+-\d+[a-zA-Z]+)|\d+.\d+GHz)

I am looking for output like this:

i5-2520M  2.50GHz
Genuine T2400  1.83GHz

Thanks in advance.

Answer 1

You can play with and customize the regex at this link: https://regex101.com/r/sr3zjR/1

(?x) # Free-spacing mode, to allow comments and improve readability

# Matching the first line `i5-2520M`
([^ ]+\s*)(?=CPU\s*@)

# Matching the first line `@ 2.50GHz`
|(?<=CPU)(\s*@\s*\d+\.\d+GHz)

# Matching the second line `CPU T2400`
|(CPU\s*[^ ]+\s*)(?=@)

# Matching the second line `1.83GHz`
|\s*(?<=@)(\s*\d+\.\d+GHz)

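Because of how regular expressions work, a single capture group cannot skip over part of the sequence it matches. That is why we need several alternatives joined with the | operator, one for each capture group. See this other question for more information: Regular expression to skip character in capture group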
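To try the alternation from Python, here is a minimal sketch. It assumes each CPU line is handled as a separate string, compiles the pattern with `re.VERBOSE` rather than the inline `(?x)`, and escapes the dot in the frequency. The names `CPU_PATTERN` and `extract_parts` are made up for this illustration.

```python
import re

# The four alternatives from this answer, one capture group each.
# re.VERBOSE plays the role of the inline (?x) flag.
CPU_PATTERN = re.compile(
    r"""
    ([^ ]+\s*)(?=CPU\s*@)            # model on the first line, e.g. i5-2520M
    |(?<=CPU)(\s*@\s*\d+\.\d+GHz)    # frequency on the first line, including the @
    |(CPU\s*[^ ]+\s*)(?=@)           # CPU plus model on the second line
    |\s*(?<=@)(\s*\d+\.\d+GHz)       # frequency on the second line
    """,
    re.VERBOSE,
)


def extract_parts(line):
    """Return the captured pieces of `line`, stripped of surrounding whitespace."""
    parts = []
    for match in CPU_PATTERN.finditer(line):
        # Each alternative has exactly one group, so exactly one is not None.
        captured = next(g for g in match.groups() if g is not None)
        parts.append(captured.strip())
    return parts


print(extract_parts("Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz"))
# ['i5-2520M', '@ 2.50GHz']
print(extract_parts("Genuine Intel(R) CPU T2400 @ 1.83GHz"))
# ['CPU T2400', '1.83GHz']
```

`finditer` is used instead of `search` because every alternative produces its own match, so one line yields several matches.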


Answer 2

This answer differs from the first one I posted. Here I try to match exactly what the question asks for.

The new live link for this answer is here.

(?x) # Free-spacing mode, to allow comments and improve readability

# Matching the first line `i5-2520M`                (capture group 1)
([^ ]+\s*)(?=CPU\s*@)

# Matching the first line `@ 2.50GHz`               (capture group 2)
|(?<=CPU)(\s*@\s*\d+\.\d+GHz)

# Matching the `first word` on the second line.     (capture group 3)
# The `\s*$` is used to not match empty lines.
|(^[^ ]+)(?!(?:.*CPU\s*@)|\s*$)

# Matching the second line `T2400`                  (capture group 4)
|(?<=CPU)(\s*[^ ]+\s*)(?=@)

# Matching the second line `1.83GHz`                (capture group 5)
|\s*(?<=@)(\s*\d+\.\d+GHz)

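In this second answer, each capture group holds exactly one of the desired elements, so you can handle each of them individually by referring to its capture group index.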
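As a rough illustration of working with the group indexes in Python, the sketch below labels every match with the index of the group that captured it, using `match.lastindex`. It assumes one CPU line per string, uses `re.VERBOSE` in place of the inline `(?x)`, and escapes the dot in the frequency. `CPU_PATTERN` and `labelled_parts` are hypothetical names.

```python
import re

# The five alternatives from this answer; the group numbers follow the comments.
CPU_PATTERN = re.compile(
    r"""
    ([^ ]+\s*)(?=CPU\s*@)               # group 1: model on the first line
    |(?<=CPU)(\s*@\s*\d+\.\d+GHz)       # group 2: frequency on the first line, with @
    |(^[^ ]+)(?!(?:.*CPU\s*@)|\s*$)     # group 3: first word on the second line
    |(?<=CPU)(\s*[^ ]+\s*)(?=@)         # group 4: model on the second line
    |\s*(?<=@)(\s*\d+\.\d+GHz)          # group 5: frequency on the second line
    """,
    re.VERBOSE,
)


def labelled_parts(line):
    """Return (group index, stripped text) pairs for every match in `line`."""
    return [
        # Only one group takes part in each match, so lastindex identifies it.
        (m.lastindex, m.group(m.lastindex).strip())
        for m in CPU_PATTERN.finditer(line)
    ]


print(labelled_parts("Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz"))
# [(1, 'i5-2520M'), (2, '@ 2.50GHz')]
print(labelled_parts("Genuine Intel(R) CPU T2400 @ 1.83GHz"))
# [(3, 'Genuine'), (4, 'T2400'), (5, '1.83GHz')]
```

With the index attached, each piece can be post-processed separately, for example dropping the leading `@` from group 2 before printing.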

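In group 2 there is a trick: I match the @ inside the group, so that any amount of white space is allowed between it and the word before it, because the positive look-behind (?<=) does not allow the * operator. If you would rather not match the @, you can change the second group's expression to the one below: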

https://regex101.com/r/sr3zjR/3

# Matching the first line `2.50GHz`                 (capture group 2)
|(?<=CPU\s@)(\s*\d+\.\d+GHz)

This is the new live link for this change:

As everywhere else in this answer, we are in free-spacing mode here. A literal white space therefore has to be escaped with \ or replaced by \s.
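Both points can be checked quickly with Python's `re` module. The snippet below is only a sketch: the variable names are invented for the illustration, and the sample line is the first one from the question.

```python
import re

line = "Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz"

# A look-behind containing * has no fixed width, so re refuses to compile it.
try:
    re.compile(r"(?<=CPU\s*@)\s*\d+\.\d+GHz")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# The fixed-width variant (exactly one white-space character before @) compiles.
fixed = re.compile(r"(?<=CPU\s@)(\s*\d+\.\d+GHz)", re.VERBOSE)
frequency = fixed.search(line).group(1).strip()

# In free-spacing mode an unescaped space in the pattern is ignored.
literal_space = re.compile(r"CPU @", re.VERBOSE)    # behaves like CPU@
escaped_space = re.compile(r"CPU\ @", re.VERBOSE)   # backslash keeps the space
class_space = re.compile(r"CPU\s@", re.VERBOSE)     # \s matches the space

print(variable_width_ok)                        # False
print(frequency)                                # 2.50GHz
print(literal_space.search("CPU @") is None)    # True
print(escaped_space.search("CPU @") is not None)  # True
```

The trade-off of the fixed-width look-behind is that it only accepts a single white-space character between CPU and @, whereas matching the @ inside the group tolerates any amount.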

Regex to filter out CPU information (Python)
