I'm trying to use the stringr package to correct street names in a data frame, expanding "S." to "South", "E." to "East", and "St." to "Street". Sample data is below.
df = data.frame(street = c('333 S. HOPE STREET', '21 South Hope Street', '54 Hope PKWY', '60C/O St.'))
Here is my code.
df2 <- df %>% mutate(street2 = str_replace(street, 'S', "South"),
street2 = str_replace_all(street2, 'PKWY', "PARKWAY"),
street2 = str_replace_all(street2, 'st.', "Street"))
It returns the following result.
street street2
333 S. HOPE STREET 333 South. HOPE STREET
21 South Hope Street 21 Southouth Hope Street
54 Hope PKWY 54 Hope PARKWAY
60C/O St. 60C/O Southt.
This is the result I want. I'm not sure where I went wrong.
street street2
333 S. HOPE STREET 333 South HOPE STREET
21 South Hope Street 21 South Hope Street
54 Hope PKWY 54 Hope PARKWAY
60C/O St. 60C/O Street
Answer 0 (score: 2)
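Don't forget to escape the dots! In a regex pattern, a dot (.) matches (almost) any character. If you mean a literal dot, you have to escape it with a backslash (\), and inside an R string that backslash must itself be escaped with another backslash, which gives \\. in the pattern. Your first pattern, 'S', also has no dot at all, so it matches the first capital S anywhere in the string, which is why "South" turned into "Southouth".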
So:
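library(dplyr)    # mutate(), %>%
library(stringr)  # str_replace(), str_replace_all()
# '\\.' is a literal dot, so only the abbreviations "S." and "St." match
df %>% mutate(street2 = str_replace(street, 'S\\.', "South"),
street2 = str_replace_all(street2, 'PKWY', "PARKWAY"),
street2 = str_replace_all(street2, 'St\\.', "Street"))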
which gives
# street street2
# 1 333 S. HOPE STREET 333 South HOPE STREET
# 2 21 South Hope Street 21 South Hope Street
# 3 54 Hope PKWY 54 Hope PARKWAY
# 4 60C/O St. 60C/O Street
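To see why the escape matters, here is a minimal sketch that compares an unescaped dot, an escaped dot, and stringr's fixed() helper on two of the sample strings. The variable names (wildcard, escaped, literal) are only illustrative and not part of the original question.

```r
library(stringr)

# Unescaped: "." is a wildcard, so "St." also matches the "Str" in "Street"
wildcard <- str_replace_all("21 South Hope Street", "St.", "Street")
# "21 South Hope Streeteet"

# Escaped: "\\." only matches a literal dot, so "Street" is left alone
escaped <- str_replace_all("21 South Hope Street", "St\\.", "Street")
# "21 South Hope Street"

# fixed() treats the whole pattern as plain text, so no escaping is needed
literal <- str_replace_all("60C/O St.", fixed("St."), "Street")
# "60C/O Street"
```

fixed() is convenient when the pattern contains no regex features at all. The escaped form is the one to use if you later want to combine the dot with anchors or word boundaries.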
For better readability, you can also apply stringr::str_to_title to every column:
df %>% mutate(street2 = str_replace(street, 'S\\.', "South"),
street2 = str_replace_all(street2, 'PKWY', "PARKWAY"),
street2 = str_replace_all(street2, 'St\\.', "Street") ) %>%
mutate_all( ., str_to_title )
# street street2
# 1 333 S. Hope Street 333 South Hope Street
# 2 21 South Hope Street 21 South Hope Street
# 3 54 Hope Pkwy 54 Hope Parkway
# 4 60c/O St. 60c/O Street
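If more abbreviations need expanding (the question also mentions "E." to "East"), one possible approach is to pass a named vector to str_replace_all, where each name is a pattern and each value is its replacement. The sketch below is an assumption about how you might structure it, not the only way: the names abbreviations, streets and expanded are hypothetical, the "12 E. Main St." row is an extra made-up example, and the \\b word boundaries are added so that a pattern only matches at the start of a word.

```r
library(stringr)

# Pattern = replacement; "\\b" is a word boundary, "\\." is a literal dot
abbreviations <- c(
  "\\bS\\."    = "South",
  "\\bE\\."    = "East",
  "\\bSt\\."   = "Street",
  "\\bPKWY\\b" = "PARKWAY"
)

streets <- c("333 S. HOPE STREET", "21 South Hope Street",
             "54 Hope PKWY", "60C/O St.", "12 E. Main St.")

# Each pattern is applied in turn to every element of streets
expanded <- str_replace_all(streets, abbreviations)
expanded
# "333 South HOPE STREET" "21 South Hope Street" "54 Hope PARKWAY"
# "60C/O Street" "12 East Main Street"
```

Keeping the mapping in one vector means a new abbreviation is a one-line change instead of another str_replace_all call.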