Modifying data.table contents with multiple regular expressions

Time: 2019-05-27 20:12:25

Tags: r regex data.table

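I have a data.table, dt, with a character column (Description).

I need to perform several regex operations on that column, which I currently write as: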

  dt[, Description := sapply(Description, tolower)][
      , Description := sapply(Description, gsub, pattern = " $", replacement = "")][
        , Description := sapply(Description, gsub, pattern = "  ", replacement = " ")][
          , Description := sapply(Description, gsub, pattern = "ões\\>", replacement = "ão")][
            , Description := sapply(Description, gsub, pattern = "eis\\>", replacement = "el")][
              , Description := sapply(Description, gsub, pattern = "as\\>", replacement = "a")][
                , Description := sapply(Description, gsub, pattern = "ais\\>", replacement = "al")][
                  , Description := sapply(Description, gsub, pattern = "es\\>", replacement = "e")][
                    , Description := sapply(Description, gsub, pattern = "ns\\>", replacement = "m")][
                      , Description := sapply(Description, gsub, pattern = "s\\>", replacement = "")]

Basically, these are all the ways of turning a plural into a singular in Portuguese.

Is there a more efficient, more elegant way to do this?
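
One property of the chain above that any rewrite would need to keep is that the rules are order-dependent: the specific endings have to run before the generic "s\\>" rule. A tiny illustration, assuming the made-up input "homens" and two of the patterns from the code above:

```r
# Illustrative input only; not taken from the real Description column.
x <- "homens"

# Specific rule ("ns" -> "m") first, then the generic trailing-"s" rule.
specific_first <- gsub("s\\>", "", gsub("ns\\>", "m", x))

# Generic rule first: the trailing "s" is gone before "ns\\>" can match.
generic_first <- gsub("ns\\>", "m", gsub("s\\>", "", x))

print(specific_first)
print(generic_first)
```

With the generic rule first, the word is left as "homen" rather than "homem", so the order used in the original chain matters.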
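
One possible direction, offered only as a sketch: tolower() and gsub() are already vectorised over character vectors, so the sapply() calls may not be needed, and the pattern/replacement pairs could be kept in a single ordered vector and applied in a loop. The names rules and singularize below are hypothetical helpers, not part of base R or data.table, and the sample words are made up.

```r
# Ordered pattern = replacement pairs, copied from the chain in the question.
rules <- c(
  " $"     = "",
  "  "     = " ",
  "ões\\>" = "ão",
  "eis\\>" = "el",
  "as\\>"  = "a",
  "ais\\>" = "al",
  "es\\>"  = "e",
  "ns\\>"  = "m",
  "s\\>"   = ""
)

# Hypothetical helper: lower-case x, then apply every rule in sequence.
# gsub() works on the whole vector at once, so there is no per-element loop.
singularize <- function(x, rules) {
  x <- tolower(x)
  for (i in seq_along(rules)) {
    x <- gsub(names(rules)[i], rules[[i]], x)
  }
  x
}

# Made-up sample values standing in for the Description column.
words <- c("Casas ", "Animais", "Homens", "Classes", "Livros")
print(singularize(words, rules))
```

Assuming this behaves the same as the original chain on the real data, the column could then be updated in a single step with dt[, Description := singularize(Description, rules)]. Keeping the rules in one vector also makes it easier to add or reorder endings later.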

0 answers:

No answers yet