Question

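I'm trying to parse some text files into a database, and I have a string that contains two pieces of information. The string can take several forms. It can be a single word, like Word, or a first word followed by a dash and any number of further words, like Word - Second. The catch is that if the string ends with a number, like Word - Second 4, or with two numbers separated by a slash, like Word - Second 2/3, those numbers need to go into separate variables.

I don't know regular expressions well enough to do this. Help? (With an explanation?)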

Answer 1

I think you may be looking for something like this:

^([a-zA-Z]+(?: *- *[a-zA-Z]+(?: +[a-zA-Z]+)*)?)(?: +(\d+(?:\/\d+)?))?$

Explanation:

^               Start of line
(               First capturing group (for the words)
  [a-zA-Z]+     A word
  (?:...)?      (Omitted for clarity)
)               Close first group
(?:             Start non-capturing group
  \s+           Some whitespace
  (             Second capturing group (for the numbers)
    \d+         A number
    (?:\/\d+)?  Optionally a slash followed by another number
  )             Close capturing group
)?              Close optional non-capturing group
$               End of line

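I left one part out of the explanation above: (?: *- *[a-zA-Z]+(?: +[a-zA-Z]+)*)?. It matches a dash (with optional spaces on either side) followed by one or more space-separated words. I also wrote \s in the explanation rather than the literal space used in the actual pattern, because a space is invisible. But \s matches any whitespace character, including newlines. You may prefer to match only spaces.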
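
The question doesn't say which language is in use, so here is a rough sketch in Python of how the pattern might be applied. The helper name parse_entry and the sample strings are assumptions made for illustration. Note that the pattern captures 2/3 as a single group, so the sketch splits that group on the slash afterwards to get the two numbers into separate variables.

```python
import re

# The pattern from the answer, unchanged (literal spaces, not \s).
PATTERN = re.compile(
    r"^([a-zA-Z]+(?: *- *[a-zA-Z]+(?: +[a-zA-Z]+)*)?)(?: +(\d+(?:\/\d+)?))?$"
)


def parse_entry(text):
    """Hypothetical helper: return (words, first, second) or None if no match."""
    match = PATTERN.match(text)
    if match is None:
        return None

    # Group 1 holds the words, group 2 holds "4" or "2/3" (or None).
    words, numbers = match.groups()

    first = second = None
    if numbers is not None:
        parts = numbers.split("/")
        first = int(parts[0])
        if len(parts) == 2:
            second = int(parts[1])
    return words, first, second


for sample in ["Word", "Word - Second", "Word - Second 4", "Word - Second 2/3"]:
    print(sample, "->", parse_entry(sample))
```

A string that doesn't fit the expected shape, such as two words with no dash between them, makes parse_entry return None, so the caller can decide how to handle it before inserting into the database.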

Rubular
