Question

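I've been at this all day and can't figure it out. I have some Ruby code in the string below, and I only want to match the lines that contain code, along with the first comment on each of those lines if one exists.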

# Some ignored comment.
1 + 1 # Simple math (this comment would be collected) # ignored 
# ignored

user = User.new
user.name = "Ryan" # Setting an attribute # Another ignored comment

This would capture:

1. "1 + 1"
2. "Simple math"
1. "user = User.new"
2. nil
1. "user.name = \"Ryan\""
2. "Setting an attribute"

I'm using /^\x20*(.+)\x20*(#\x20*.+\x20*){1}$/ to match each line, but it doesn't seem to work for all of the code.
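For reference, here is a minimal check of how that pattern behaves on two of the lines above. The constant name ORIGINAL and the trimmed test lines are only illustrative. It suggests two symptoms: a line without a comment does not match at all, and the greedy (.+) runs up to the last # rather than the first.

```ruby
# The pattern from the question, unchanged. ORIGINAL is an illustrative name.
ORIGINAL = /^\x20*(.+)\x20*(#\x20*.+\x20*){1}$/

lines = [
  'user = User.new',                # code with no comment
  '1 + 1 # Simple math # ignored',  # code followed by two comments
]

# captures is nil when the line does not match at all.
results = lines.map { |line| line.match(ORIGINAL)&.captures }

p results[0]  # nil: the pattern requires a '#', so this line is skipped
p results[1]  # ["1 + 1 # Simple math ", "# ignored"]: greedy (.+) ate the first comment
```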

Answer 1

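Kobi's answer partly works, but it doesn't match lines of code that have no comment at the end.

It also fails when it runs into string interpolation, for example: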

str = "My name is #{first_name} #{last_name}" # first comment

...would be incorrectly matched as: str = "My name is #{first_name}
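Both problems can be reproduced with a short sketch. The constant name KOBI is just a label chosen here for the pattern from Kobi's answer, and the single-quoted Ruby strings keep #{...} literal so the regex sees the raw source text.

```ruby
# Pattern from Kobi's answer; KOBI is an illustrative name.
KOBI = /^[\t ]*[^\s#][^#\n\r]*#([^#\n\r]*)/

# Single quotes: no interpolation happens, the text is taken literally.
line = 'str = "My name is #{first_name} #{last_name}" # first comment'

m = line.match(KOBI)
puts m[0]  # str = "My name is #{first_name}   (the match stops at the second '#')
puts m[1]  # {first_name}                      (the "comment" is really part of the string)

# A code line without any comment is not matched at all.
no_comment = 'user = User.new'.match(KOBI)
p no_comment  # nil
```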

You need a more comprehensive regex. Here's one idea:

/^[\t ]*([^#"'\r\n]("(\\"|[^"])*"|'(\\'|[^'])*'|[^#\n\r])*)(#([^#\r\n]*))?/

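^[\t ]* - leading whitespace.
([^#"'\r\n]("(\\"|[^"])*"|'(\\'|[^'])*'|[^#\n\r])*) - matches a line of code.
Breakdown:
- [^#"'\r\n] - the first character of the code line, followed by any number of...
- "(\\"|[^"])*" - a double-quoted string, or...
- '(\\'|[^'])*' - a single-quoted string, or...
- [^#\n\r] - any other character outside a quoted string that is not # or a line ending.
(#([^#\r\n]*))? - matches the first comment at the end of the code line, if there is one.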

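Because the logic is more complex, each match captures 6 subpatterns. Subpattern 1 is the code and subpattern 6 is the comment; you can ignore the others.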
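If the four helper subpatterns get in the way, one possible variation (a sketch, not part of the original pattern) is to turn them into non-capturing groups with (?:...), which leaves only two captures. The constant name TWO_GROUPS and the sample lines are assumptions for illustration. The raw captures still carry the spaces around the #, so they are stripped here.

```ruby
# Same logic as the pattern above, but the helper groups are non-capturing,
# so capture 1 is the code and capture 2 is the first comment.
# TWO_GROUPS is an illustrative name.
TWO_GROUPS = /^[\t ]*([^#"'\r\n](?:"(?:\\"|[^"])*"|'(?:\\'|[^'])*'|[^#\n\r])*)(?:#([^#\r\n]*))?/

with_comment    = 'user.name = "Ryan #{last_name}" # Setting an attribute # Another ignored comment'
without_comment = 'user = User.new'

# strip removes the spaces left around the '#'; &. keeps a missing comment as nil.
code, comment = with_comment.match(TWO_GROUPS).captures.map { |s| s&.strip }
puts code     # user.name = "Ryan #{last_name}"
puts comment  # Setting an attribute

bare = without_comment.match(TWO_GROUPS).captures
p bare        # ["user = User.new", nil]
```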

Given the following block of code:

# Some ignored comment.
1 + 1 # Simple math (this comment would be collected) # ignored 
# ignored

user = User.new
user.name = "Ryan #{last_name}" # Setting an attribute # Another ignored comment

The regex above produces the following results (for brevity, I've left out subpatterns 2, 3, 4, and 5):

1. 1 + 1
6. Simple math (this comment would be collected)
1. user = User.new
6.
1. user.name = "Ryan #{last_name}"
6. Setting an attribute

Demo: http://rubular.com/r/yKxEazjNPC
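Outside Rubular, the same result can be reproduced with String#scan. This is a minimal sketch: the names CODE_AND_COMMENT, source, and pairs are made up for the example, and the strip calls are an addition, since the raw captures keep the spaces next to the #.

```ruby
# The six-group pattern from this answer; the constant name is illustrative.
CODE_AND_COMMENT = /^[\t ]*([^#"'\r\n]("(\\"|[^"])*"|'(\\'|[^'])*'|[^#\n\r])*)(#([^#\r\n]*))?/

# Quoted heredoc tag: the sample is kept literal, so #{last_name} is not interpolated.
source = <<~'RUBY'
  # Some ignored comment.
  1 + 1 # Simple math (this comment would be collected) # ignored
  # ignored

  user = User.new
  user.name = "Ryan #{last_name}" # Setting an attribute # Another ignored comment
RUBY

# scan returns one array of six captures per match.
# Index 0 is subpattern 1 (code), index 5 is subpattern 6 (comment, nil if absent).
pairs = source.scan(CODE_AND_COMMENT).map do |groups|
  [groups[0].strip, groups[5]&.strip]
end

pairs.each { |code, comment| puts "#{code.inspect} => #{comment.inspect}" }
```

Comment-only lines and blank lines produce no match, so pairs holds exactly three entries, one per line of code.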

Answer 2

Although the underlying problem is quite hard, you can find what you need with the following pattern:

^[\t ]*[^\s#][^#\n\r]*#([^#\n\r]*)

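It consists of:

[\t ]* - leading whitespace.
[^\s#] - one real character. This should match the start of the code.
[^#\n\r]* - the characters up to the # sign: anything other than a hash or a line break.
#([^#\n\r]*) - the "first" comment, captured in group 1.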

Working example: http://rubular.com/r/wNJTMDV9Bw
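As a rough sketch of how this pattern could be used from Ruby, the snippet below runs it over the sample from the question with String#scan. The names FIRST_COMMENT, source, and comments are illustrative, and the strip call is an addition to trim the spaces around each captured comment.

```ruby
# The pattern from this answer; FIRST_COMMENT is an illustrative name.
FIRST_COMMENT = /^[\t ]*[^\s#][^#\n\r]*#([^#\n\r]*)/

# Sample text from the question, kept literal by the quoted heredoc tag.
source = <<~'RUBY'
  # Some ignored comment.
  1 + 1 # Simple math (this comment would be collected) # ignored
  # ignored

  user = User.new
  user.name = "Ryan" # Setting an attribute # Another ignored comment
RUBY

# With one capture group, scan yields single-element arrays; flatten them.
comments = source.scan(FIRST_COMMENT).flatten.map(&:strip)

p comments
# ["Simple math (this comment would be collected)", "Setting an attribute"]
```

Only lines that have both code and a comment are matched here, so user = User.new does not appear in the result.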

Ruby regex for finding comments?
