Get only the time from a string in Ruby

Date: 2016-06-17 00:39:20

Tags: ruby regex

I am scraping data from a website. These are the strings I get back when parsing the HTML with Nokogiri:

"0:10\r\n              (+1)\r\n            "
"03:10\r\n              (+1)\r\n            "

How can I get just "0:10" and "03:10"?

Update

What is the difference between match and gsub?

Thanks!

3 answers:

Answer 0: (score: 1)

Your regular expression should match only strings that have the desired pattern.

r = /
    \A                    # match beginning of string
    (                     # begin capture group 1
      \d+                 # match one or more digits
      :                   # match a colon
      \d{2}               # match two digits
    )                     # end capture group 1
    \r\n\s+\(\+1\)\r\n\s+ # match substring
    \z                    # match end of string
    /x                    # free spacing regex definition mode

"0:10\r\n              (+1)\r\n            "[r,1]
  #=> "0:10" 
"03:10\r\n              (+1)\r\n            "[r,1]
  #=> "03:10" 
"0:101\r\n              (+1)\r\n            "[r,1]
  #=> nil 
":10\r\n              (+1)\r\n            "[r,1]
  #=> nil 
"0:10 \r\n              (+1)\r\n            "[r,1]
  #=> nil 
"0:10\r\n              (+2)\r\n            "[r,1]
  #=> nil 
"0:10\r\n              (+1)\r\n         cat"[r,1]
  #=> nil 

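Depending on how the strings can vary, the pattern may need some changes. For example, if the "+1" in parentheses could be "+" followed by any positive integer, you would replace \(\+1\) with \(\+\d+\).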
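
A minimal sketch of that relaxed variant, assuming the only thing that varies is the number after the plus sign. The name r_any_offset and the extra sample strings are made up for illustration; they are not from the scraped site.

```ruby
# Same anchored pattern as above, but any run of digits is accepted after "+".
r_any_offset = /
  \A                        # match beginning of string
  (\d+:\d{2})               # capture group 1: the time
  \r\n\s+\(\+\d+\)\r\n\s+   # plus sign and one or more digits in parentheses
  \z                        # match end of string
/x

# Hypothetical inputs: three positive offsets and one negative offset.
samples = [
  "0:10\r\n              (+1)\r\n            ",
  "03:10\r\n              (+2)\r\n            ",
  "0:10\r\n              (+12)\r\n            ",
  "0:10\r\n              (-1)\r\n            "
]

times = samples.map { |s| s[r_any_offset, 1] }
p times
  #=> ["0:10", "03:10", "0:10", nil]
```

The last sample returns nil because the pattern still insists on a literal plus sign; if negative offsets can occur, [+-] in place of \+ would be one way to allow them.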

Answer 1: (score: 0)

You should use the regex /\d{0,2}:\d{0,2}/ that @engineer14 posted. It works, and here is the proof:

console.log("0:10\r\n              (+1)\r\n            ".match(/\d{0,2}:\d{0,2}/)[0])
console.log("03:10\r\n              (+1)\r\n            ".match(/\d{0,2}:\d{0,2}/)[0])

Explanation:

/ <-- open regex
\d <-- look for a digit
{0,2} <-- between zero and two of them
: <-- look for a colon
\d <-- look for another digit
{0,2} <-- between zero and two of them
/ <-- close regex
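
The snippet above is JavaScript. A rough Ruby equivalent is sketched below, and it also touches on the match versus gsub question: match (like String#[]) extracts the part you want, while gsub returns a new string with every match replaced. The tighter pattern /\d{1,2}:\d{2}/ and the gsub pattern are assumptions for illustration, not something the original answer proposed.

```ruby
s = "03:10\r\n              (+1)\r\n            "

# String#match returns a MatchData object, or nil; [0] is the matched text.
m = s.match(/\d{0,2}:\d{0,2}/)
time_via_match = m && m[0]

# String#[] with a regex returns the matched substring directly.
time_via_index = s[/\d{1,2}:\d{2}/]

# String#gsub replaces every match; here it deletes the "(+1)" part
# and all remaining whitespace, leaving only the time.
time_via_gsub = s.gsub(/\s*\(\+\d+\)|\s+/, "")

p time_via_match  #=> "03:10"
p time_via_index  #=> "03:10"
p time_via_gsub   #=> "03:10"

# Because {0,2} allows zero digits, the loose pattern accepts a bare colon.
loose_on_colon  = ":"[/\d{0,2}:\d{0,2}/]
strict_on_colon = ":"[/\d{1,2}:\d{2}/]
p loose_on_colon   #=> ":"
p strict_on_colon  #=> nil
```

Extracting with match or [] is usually the safer choice here, since gsub has to enumerate everything that should be removed.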

Answer 2: (score: 0)

Which site are you scraping? If this is a time zone, the +1 may matter.
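
If the +1 does turn out to matter, one possible sketch is to capture it alongside the time. What the offset actually means (time zone, next day, or something else) is not stated in the question, so the group names time and offset are only illustrative.

```ruby
s = "03:10\r\n              (+1)\r\n            "

# Named groups: the time, then a signed integer inside parentheses.
m = s.match(/\A(?<time>\d+:\d{2})\s+\((?<offset>[+-]\d+)\)\s*\z/)

time   = m && m[:time]
offset = m && m[:offset].to_i

p time    #=> "03:10"
p offset  #=> 1
```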