Question

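I want to remove all name prefixes (e.g. Prof., Dr., Mr., etc.). There can be more than one, in any order. So I want to write a regular expression that slices off all of these prefixes. I want to do this in Ruby.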

Below are the inputs and outputs I want to achieve.

"Prof. Dr. John Doe" => "John Doe"
"Dr. Prin. Gloria Smith" => "Gloria Smith"
"Dr. William" => "William"
"Sean Paul" => "Sean Paul"

I also want to store the removed prefixes in another string.

"Prof. Dr. John Doe" => "Prof. Dr."
"Dr. Prin. Gloria Smith" => "Dr. Prin."
"Dr. William" => "Dr."
"Sean Paul" => ""

Answer 1

Case 1: a list of titles

Suppose

titles = ["Dr.", "Prof.", "Mr.", "Mrs.", "Ms.", "Her Worship", "The Grand Poobah"]

R = /
    (?:   # begin non-capture group
      #{Regexp.union(titles)}
          # "or" all the titles
      \s* # match >= 0 spaces
    )*    # end non-capture group and perform >= 0 times
    /x    # free-spacing regex definition mode
  #=> /
  #   (?:   # begin non-capture group
  #     (?-mix:Dr\.|Prof\.|Mr\.|Mrs\.|Ms\.|Her\ Worship|The\ Grand\ Poobah)
  #         # "or" all the titles
  #     \s* # match >= 0 spaces
  #   )*    # end non-capture group and perform >= 0 times
  #  /x 

def extract_titles(str)
  t = str[R] || ''
  [str[t.size..-1], t.rstrip] 
end

["Prof. Dr. John J. Doe, Jr.", "Dr. Prin. Gloria Smith", "The Grand Poobah Dr. No",
  "Gloria Smith", "Cher, Ph.D."].each { |s| p extract_titles s }
  # ["John J. Doe, Jr.", "Prof. Dr."]
  # ["Prin. Gloria Smith", "Dr."]
  # ["No", "The Grand Poobah Dr."]
  # ["Gloria Smith", ""]
  # ["Cher, Ph.D.", ""]

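If there is no title, as in the last two examples, the group repeats zero times and R matches the empty string at the start of the string, so str[R] #=> "" and t.rstrip #=> "". The || '' guard only matters for a regex that can fail to match, in which case str[R] #=> nil.

See the doc for the class method Regexp::union to learn how it is used.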
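
As a small illustration of what Regexp.union does with a list of strings (the variable names and the sample list below are only for illustration): each string is escaped, and the results are joined with |.

```ruby
# Regexp.union escapes each string and joins the results with "|".
sample_titles = ["Dr.", "Her Worship"]   # illustrative list
union = Regexp.union(sample_titles)

p union.source              # the pattern text, with "." and " " escaped
p "Dr. No".match?(union)    # true: "Dr." is matched literally
p "Drew".match?(union)      # false: the escaped "." does not match "e"
```

Because the strings are escaped, the titles can be written the way they appear in names, without hand-escaping the periods.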

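Case 2: no list of titles

The following assumes that every title is a single word that begins with a capital letter, followed by one or more lower-case letters, followed by a period. If that is not correct, the regular expression below can be changed accordingly.

The only difference between this case and the previous one is the regular expression.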

R = /
    \A       # match beginning of string
    (?:      # start a non-capture group
      [A-Z]  # match a capital letter
      [a-z]+ # match > 0 lower-case letters
      \.\s*  # match a period followed by >= 0 spaces
    )*       # end non-capture group and execute >= 0 times
    /x       # free-spacing regex definition mode

["Prof. Dr. John J. Doe, Jr.", "Dr.Prin.Gloria Smith",
 "Gloria Smith", "Cher, Ph.D."].each { |s| p extract_titles(s) }
  # ["John J. Doe, Jr.", "Prof. Dr."]
  # ["Gloria Smith", "Dr.Prin."]
  # ["Gloria Smith", ""]
  # ["Cher, Ph.D.", ""]
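
To illustrate the assumption stated for Case 2, here is a small check of which leading words the pattern treats as titles. The constant name TITLE_SHAPE and the sample inputs are made up for illustration; the pattern itself is the Case 2 regex written on one line.

```ruby
# Same pattern as in Case 2, under a hypothetical name to avoid redefining R.
TITLE_SHAPE = /\A(?:[A-Z][a-z]+\.\s*)*/

p "Dr. J. Doe"[TITLE_SHAPE]        # "Dr. " ("J." has no lower-case letters)
p "Her Worship Jane"[TITLE_SHAPE]  # ""     (no period after "Her")
p "PhD. Jane"[TITLE_SHAPE]         # ""     ("PhD" has a capital after the first letter)
```

Titles that do not fit this shape need either a different pattern or the explicit list from Case 1.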

Answer 2

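Assuming the prefixes are only Prof., Dr., Mr., Mrs., Prin., and Ms., you can try:

s = "Prof. Dr. John Doe"
s.gsub(/Prof\.|Dr\.|Mr\.|Mrs\.|Prin\.|Ms\./, "").strip  #=> "John Doe"

For the second question (storing the removed prefixes in another string), the matches of the same pattern can be collected instead of deleted.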
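
A minimal sketch of collecting the prefixes, assuming String#scan with the same alternation. The constant PREFIX and the variable names are illustrative and not taken from the answer.

```ruby
# Hypothetical sketch: PREFIX is an illustrative pattern with the dots escaped.
PREFIX = /Prof\.|Dr\.|Mr\.|Mrs\.|Prin\.|Ms\./

s = "Prof. Dr. John Doe"
prefixes = s.scan(PREFIX).join(" ")   # removed prefixes, joined into one string
name     = s.gsub(PREFIX, "").strip   # the name without the prefixes

p prefixes
p name
```

For a string without any prefix, scan returns an empty array, so prefixes is the empty string.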

Answer 3

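Since you asked for a regular expression: use a pattern in which the period after each title is optional and which ends with the i flag.

Such a pattern matches a title with or without the period (Dr or Dr.). In addition, the 'i' at the end makes it match lower-case 'dr' and 'prof' as well.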
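
A minimal sketch of a pattern with these two properties, assuming only the titles Prof and Dr. The constant TITLE and the exact alternation are assumptions for illustration, not necessarily the answer's original code.

```ruby
# Hypothetical pattern: optional period after each title, case-insensitive.
TITLE = /\b(?:prof|dr)\b\.?\s*/i

["Prof. Dr. John Doe", "dr John Doe", "Sean Paul"].each do |s|
  p s.gsub(TITLE, "")
end
```

The \b anchors are an added precaution so that "dr" inside a longer word such as "Andrew" is left alone.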

Answer 4

Use this code:

"Dr. Prin. Gloria Smith".split(". ").last
"Prof. Dr. John Doe".split(". ").last

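To see what this does: split(". ") cuts the string at every period followed by a space, and last keeps the final piece. A short check follows, including one input (a name with a middle initial, chosen here for illustration) where that approach also cuts into the name.

```ruby
p "Dr. Prin. Gloria Smith".split(". ")        # all the pieces
p "Dr. Prin. Gloria Smith".split(". ").last   # the final piece is the name
p "Sean Paul".split(". ").last                # no prefix: the string is unchanged
p "Dr. John J. Doe".split(". ").last          # the middle initial is cut off too
```
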
Answer 5

If a prefix is always followed by a dot (.), you can use the logic below:

s = "Prof. Dr. John Doe"
match = s.match(/([\w\s\.]+\.)?\s*([\w\s]+)/)
prefix = match[1]  # "Prof. Dr." (nil when there is no prefix)
name = match[2]    # "John Doe"

OR

If you have a dictionary of all the prefixes:

s = "Prof. Dr. John Doe"
dictionary = ['Prof\.', 'Dr\.', 'Mr\.', 'Mrs\.', 'Prin\.'].join('|')
# \s* sits inside the repeated group so spaces are allowed after every prefix
match = s.match(/((?:(?:#{dictionary})\s*)*)([\w\s\.]+)/)
prefix = match[1].strip
name = match[2]

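As you can see in the array (dictionary) above, the dots in the prefixes are escaped, because an unescaped dot (.) has a different meaning in a regular expression: it is a metacharacter that matches any character (except a newline, by default). See http://www.regular-expressions.info/dot.html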
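
A short check of the difference between the two forms (the variable names and sample strings are illustrative):

```ruby
unescaped = /Dr./    # "." matches any character except a newline
escaped   = /Dr\./   # "\." matches only a literal period

p "Drew Smith".match?(unescaped)   # true: "." matches the "e"
p "Drew Smith".match?(escaped)     # false: there is no "Dr." in the string
p "Dr. Smith".match?(escaped)      # true
```

Without the escape, a name that merely starts with the same letters as a prefix would be treated as one.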
