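Remove paired quotes but not single quotes

Time: 2019-02-19 18:21:02

Tags: regex ruby

How can I remove paired quotes, but not lone single quotes, from a string in Ruby? For example, going from That's 'large' to That's large.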

4 answers:

Answer 0: (score: 4)

Try this regex:

\B'((?:(?!'\B)[\s\S])*)'

Replace each match with \1

Code:

re = /\B'((?:(?!'\B)[\s\S])*)'/m
str = 'That\'s \'large\'
The 69\'ers\' drummer won\'t like this.
He said, \'it\'s clear this does not work\'. It does not fit the \'contractual obligations\''
subst = '\\1'

result = str.gsub(re, subst)

# Print the result of the substitution
puts result

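Explanation:

  • \B - matches a non-word boundary
  • ' - matches the opening '
  • ((?:(?!'\B)[\s\S])*) - matches 0+ occurrences of any character ([\s\S]), stopping at a ' that is followed by a non-word boundary. This is captured in group 1.
  • ' - matches the closing '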
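
As a reading aid, the same pattern can be written in Ruby's extended (/x) mode with one comment per part. This is only a sketch: the constant PAIRED_QUOTES and the helper strip_paired_quotes are names made up here, not part of the answer. The /m flag is left out because [\s\S] already matches newlines.

```ruby
# The pattern from this answer in extended (/x) mode, so each part can be annotated.
# PAIRED_QUOTES and strip_paired_quotes are hypothetical names used for illustration.
PAIRED_QUOTES = /
  \B'              # an opening quote that is not directly preceded by a word character
  (                # group 1: the quoted text
    (?:
      (?!'\B)      # stop at a quote that is not directly followed by a word character
      [\s\S]       # otherwise consume any character, including newlines
    )*
  )
  '                # the closing quote
/x

# Replace every quoted span with its contents (group 1).
def strip_paired_quotes(text)
  text.gsub(PAIRED_QUOTES, '\1')
end

puts strip_paired_quotes("That's 'large'")
puts strip_paired_quotes("The 69'ers' drummer won't like this.")
```

The first call should drop the quotes around large, while the second line contains only apostrophes attached to words and should come back unchanged.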

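Answer 1: (score: 2)

This is one of those things, like parsing XML or HTML with regexes, that can't really be done, but you can pretend it kind of works. You can tweak it forever and never get it right.

You can look for balanced quotes, that is, quotes in pairs, but that doesn't help. Should That's 'large' be stripped to Thats large' or to That's large?

Instead, you need to know English grammar, and when ' is an apostrophe and when it is a quotation mark. Simple stuff, knowing the basics of contractions and possessives. Contractions: don't, won't, I'll. Possessives: Joe's, s'. Maybe you can tune your regex to skip those.

But it gets complicated quickly. KO'd. Or what if you want to indicate a particular pronunciation: fo'c's'le. Or someone's name: O'Doole.

You can probably remove a pair of quotes that sit at the start and end of words. In It's clear he said, 'this isn't a contraction'. it is probably safe to match the quote before this with the quote after contraction.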

# Use negative lookbehind and lookahead to find quotes which are
# not directly after or before a word character.
# Use a non-greedy match to catch multiple pairs of quotes.
re = /(?<!\w)'(.*?)'(?!\w)/
sentence = "That's 'large'"
puts sentence.gsub(re, '\1')  # prints: That's large

This works in a lot of cases.

That's 'large' -> That's large
Eat at Joe's -> Eat at Joe's
I'll be Jane's -> I'll be Jane's
Jones' three cats' toys. -> Jones' three cats' toys.
It's clear he said, 'this isn't a contraction'. -> It's clear he said, this isn't a contraction.
'scare quotes' -> scare quotes
The 69'ers' drummer -> The 69'ers' drummer
Was She's success greater, or King Solomon's Mines's? -> Was She's success greater, or King Solomon's Mines's?
The 69'er's drummer and their 'contractual obligations'. -> The 69'er's drummer and their contractual obligations.
He said, 'it's clear this doesn't work'. -> He said, it's clear this doesn't work.

But not always.

His 'n' Hers's first track is called 'Joyriders'. -> His n Hers's first track is called Joyriders.
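
The listed results can be reproduced with a small table-driven script. This is a sketch using a handful of the pairs above, including the one that goes wrong; QUOTE_PAIR_RE and LISTED_CASES are names chosen here for illustration.

```ruby
# Regex from this answer: a pair of quotes that do not touch a word character on the outside.
QUOTE_PAIR_RE = /(?<!\w)'(.*?)'(?!\w)/

# Input => output pairs copied from the lists above (hypothetical constant name).
LISTED_CASES = {
  "That's 'large'" => "That's large",
  "Jones' three cats' toys." => "Jones' three cats' toys.",
  "He said, 'it's clear this doesn't work'." => "He said, it's clear this doesn't work.",
  "His 'n' Hers's first track is called 'Joyriders'." => "His n Hers's first track is called Joyriders."
}

# Apply the substitution to each input and compare with the listed output.
LISTED_CASES.each do |input, listed|
  actual = input.gsub(QUOTE_PAIR_RE, '\1')
  status = actual == listed ? 'ok' : 'MISMATCH'
  puts "#{status}: #{input} -> #{actual}"
end
```

Note that the last case reports ok only because it is compared against what the regex produces, not against what a human editor would want: the quotes around n are apostrophes and arguably should have stayed.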

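Like I said, this is one of those problems that looks simple but is deceptively complex, and you will never get it quite right. It will eat a lot of time. If possible, I would recommend dropping the requirement.

Answer 2: (score: 0)

A slight variation: if the single quotes only ever surround word characters, i.e. a-z, A-Z, 0-9, or the _ (underscore) character, you can use this: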

phrase = "That's 'large' and not 'small', but it's still 'amazing'."
phrase.gsub(/'(\w*)'/, '\1')
=> "That's large and not small, but it's still amazing."

But as Schwern says, if you try to do anything beyond simple text manipulation, you will quickly find yourself swamped by edge cases.
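
Two such edge cases, as a minimal sketch. The variable name word_quotes and both sample strings are made up here for illustration.

```ruby
# Pattern from this answer: a quote, zero or more word characters, a quote.
word_quotes = /'(\w*)'/

# A quoted phrase containing a space is left untouched, because \w does not match spaces.
puts "their 'contractual obligations'".gsub(word_quotes, '\1')  # prints: their 'contractual obligations'

# Apostrophes on both sides of a word fragment are treated as a pair and removed.
puts "rock'n'roll".gsub(word_quotes, '\1')                      # prints: rocknroll
```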

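Answer 3: (score: -1)

It is generally hard to tell when a single quote is part of a pair and when it stands alone (for example, as an apostrophe). However, if we assume that unpaired single quotes only ever appear as part of 's (e.g. that's) and 't (e.g. don't), the problem becomes simpler.

To remove every ' character except those in 's and 't, you can use String#gsub! like this: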

"That's 'large'".gsub!(/\'(?![st])/i,"")  #=> "That's large"

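This uses a regex with a negative lookahead assertion to exclude 's and 't from the matches.

As I mentioned, this is not a solution to the general problem of removing paired single quotes, because I think that would be unnecessarily complicated. It is a solution to a simplified problem, which you can customize as needed.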
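
To make the limits of that simplification concrete, here is a short sketch. The variable name lone_quote and the extra sample strings are assumptions added here; only the regex and the first example come from the answer.

```ruby
# Regex from this answer: remove any ' that is not followed by s or t (case-insensitive).
lone_quote = /'(?![st])/i

# Works for the example in the question.
puts "That's 'large'".gsub(lone_quote, "")      # prints: That's large

# A quoted word starting with s or t keeps its opening quote.
puts "not 'small'".gsub(lone_quote, "")         # prints: not 'small

# Contractions other than 's and 't lose their apostrophe.
puts "I'll go".gsub(lone_quote, "")             # prints: Ill go

# gsub! returns nil when nothing was substituted, so gsub is safer when the result is used directly.
p "no quotes here".dup.gsub!(lone_quote, "")    # prints: nil
```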