I have a data frame like this:
a b c
--------------------------------
1 2011 mal ID9
2 2012 yesterday ID10
3 2010 misch ID10
4 1995 ship ID9
5 2008 se ID9
6 1998 falling ID10
7 2011 friend ID9
8 2011 use to be ID10
...
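I want to remove the ID9 and ID10 suffixes. The part of the string before ID9 or ID10 can be of any length, so I don't know it in advance.
For a reproducible example, here is my data frame: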
z <- data.frame(a = c(1,2,3,4,5,6,7,8),
b = c(2011,2012,2010,1995,2008,1998,2011,2011),
c = c("mal ID9", "yesterday ID10", "misch ID10", "mal ID10", "se ID9", "falling ID10", "friend ID9", "use to be ID10"))
This is the result I want:
zz <- data.frame(a = c(1,2,3,4,5,6,7,8),
b = c(2011,2012,2010,1995,2008,1998,2011,2011),
c = c("mal", "yesterday", "misch", "mal", "se", "falling", "friend", "use to be"))
How can I do this?
Answer 0 (score: 5)
This should work:
z$c <- gsub(" ID.*", "", z$c)
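The pattern " ID.*" deletes everything from the first " ID" to the end of the string. If only a trailing " ID" followed by digits should go, an anchored pattern is a slightly stricter option. The following is a minimal sketch on the question's sample data, not the answerer's code; `stringsAsFactors = FALSE` and the `expected` vector are assumptions added for illustration.

```r
# Sample data from the question. stringsAsFactors = FALSE is an assumption
# so that column c is a character vector regardless of the R version.
z <- data.frame(a = 1:8,
                b = c(2011, 2012, 2010, 1995, 2008, 1998, 2011, 2011),
                c = c("mal ID9", "yesterday ID10", "misch ID10", "mal ID10",
                      "se ID9", "falling ID10", "friend ID9", "use to be ID10"),
                stringsAsFactors = FALSE)

# Remove only a trailing " ID<digits>". sub() is enough here because an
# end-anchored pattern can match at most once per string.
z$c <- sub(" ID[0-9]+$", "", z$c)

# Values of column c in the desired result from the question.
expected <- c("mal", "yesterday", "misch", "mal",
              "se", "falling", "friend", "use to be")
stopifnot(identical(z$c, expected))
```

Strings such as "use to be ID10" keep their internal spaces, because only the final space before the ID is part of the match.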
Answer 1 (score: 1)
You can try something like this:
library(dplyr)
z %>% mutate(c = gsub("\\sID\\d+$", "", c))
a b c
1 1 2011 mal
2 2 2012 yesterday
3 3 2010 misch
4 4 1995 mal
5 5 2008 se
6 6 1998 falling
7 7 2011 friend
8 8 2011 use to be
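The two patterns in this thread give the same result on the sample data, but they can differ when " ID" also appears earlier in a string. The sketch below uses hypothetical inputs (not from the question) to show the difference.

```r
# Hypothetical inputs, not from the question, chosen to show where the
# two patterns behave differently.
x <- c("mal ID9", "big IDEA ID10", "use to be ID10")

# " ID.*" removes everything from the first " ID" onward.
greedy <- gsub(" ID.*", "", x)

# "\\sID\\d+$" removes only a trailing whitespace + "ID" + digits.
anchored <- gsub("\\sID\\d+$", "", x)

print(data.frame(input = x, greedy = greedy, anchored = anchored))
```

For "big IDEA ID10", the first pattern returns "big" while the anchored one returns "big IDEA". Which behaviour is right depends on what the real data can contain.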