Spelling out directions in street names

Time: 2018-10-01 19:58:40

Tags: r regex

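I am trying to use the stringr package to clean up the street names in a data frame, spelling out "S." as "South" or "E." as "East", and "St." as "Street". Sample data is below.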

df = data.frame(street = c('333 S. HOPE STREET', '21 South Hope Street', '54 Hope PKWY', '60C/O St.'))

Here is my code.

  df2 <- df %>% mutate(street2 = str_replace(street, 'S', "South"),
                     street2 = str_replace_all(street2, 'PKWY', "PARKWAY"),
                     street2 = str_replace_all(street2, 'st.', "Street"))

It returns the following result.

street              street2

333 S. HOPE STREET     333 South. HOPE STREET
21 South Hope Street   21 Southouth Hope Street
54 Hope PKWY           54 Hope PARKWAY
60C/O St.              60C/O Southt.

This is the result I want. I am not sure where I went wrong.

street              street2

333 S. HOPE STREET     333 South HOPE STREET
21 South Hope Street   21 South Hope Street
54 Hope PKWY           54 Hope PARKWAY
60C/O St.              60C/O Street

1 Answer:

Answer 0: (score: 2)

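Don't forget to escape the dots! In a regex pattern, . matches (almost) any character. If you mean a literal dot, you must escape it with \ (and inside an R string that backslash must itself be escaped with another \).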
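As a minimal sketch of the difference, the snippet below checks two of the sample addresses against an unescaped and an escaped pattern. The variable names (x, unescaped, escaped, literal) are only illustrative; fixed() is stringr's modifier for matching a pattern as literal text.

```r
library(stringr)

x <- c("333 S. HOPE STREET", "21 South Hope Street")

# Unescaped: "S." means "S followed by any character",
# so the "So" in "South" matches as well.
unescaped <- str_detect(x, "S.")

# Escaped: "S\\." in R source is the regex S\. ,
# i.e. "S followed by a literal dot".
escaped <- str_detect(x, "S\\.")

# fixed() treats the whole pattern as literal text,
# so no escaping is needed.
literal <- str_detect(x, fixed("S."))

print(unescaped)  # TRUE TRUE
print(escaped)    # TRUE FALSE
print(literal)    # TRUE FALSE
```

Only the escaped and fixed() forms leave "South" alone, which is what the replacement needs.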

So:

df %>% mutate(street2 = str_replace(street, 'S\\.', "South"),
              street2 = str_replace_all(street2, 'PKWY', "PARKWAY"),
              street2 = str_replace_all(street2, 'St\\.', "Street"))

This results in

#                 street               street2
# 1   333 S. HOPE STREET 333 South HOPE STREET
# 2 21 South Hope Street  21 South Hope Street
# 3         54 Hope PKWY       54 Hope PARKWAY
# 4            60C/O St.          60C/O Street
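The chained calls above could also be collapsed into a single str_replace_all() call with a named vector, where each name is a pattern and each value its replacement. This is only a sketch: the abbreviations table and the \\b word-boundary anchors are assumptions added here, not part of the original answer, and the "E." entry is included because the question mentions it even though the sample data contains no such case.

```r
library(stringr)

street <- c("333 S. HOPE STREET", "21 South Hope Street",
            "54 Hope PKWY", "60C/O St.")

# Hypothetical lookup table: regex pattern -> replacement.
# \\b anchors each abbreviation at a word boundary, and \\.
# requires a literal dot, so "South" and "STREET" are untouched.
abbreviations <- c(
  "\\bS\\."    = "South",
  "\\bE\\."    = "East",
  "\\bSt\\."   = "Street",
  "\\bPKWY\\b" = "PARKWAY"
)

# With a named vector, str_replace_all() applies every
# pattern/replacement pair to each string.
street2 <- str_replace_all(street, abbreviations)
print(street2)
```

Keeping the abbreviations in one table makes it easier to add further entries (for example "N." or "W.") without lengthening the mutate() call.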

For better readability, you can also use stringr::str_to_title:

df %>% mutate(street2 = str_replace(street, 'S\\.', "South"),
              street2 = str_replace_all(street2, 'PKWY', "PARKWAY"),
              street2 = str_replace_all(street2, 'St\\.', "Street") ) %>%
  mutate_all( ., str_to_title )

#                 street               street2
# 1   333 S. Hope Street 333 South Hope Street
# 2 21 South Hope Street  21 South Hope Street
# 3         54 Hope Pkwy       54 Hope Parkway
# 4            60c/O St.          60c/O Street