I have a very messy dataset that contains email addresses, and I want to extract them and put them in a new column:
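library(dplyr)  # provides the %>% pipe used below
adis_sep <- as.data.frame(matrix(c(1:3, "asdf@com.com", 5:7, "sdfg@com.com", 9, "qer@f.com", 11, 12), ncol = 3, byrow = TRUE))
# logical matrix: TRUE wherever a cell contains "@"
adis_wo <- adis_sep %>% apply(2, function(x) grepl(".*@.*", x))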
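I managed to get a logical df marking the elements I want in the new column, but now I am stuck. I know I must be missing something obvious, so please help me. Thanks a lot!
Answer 0 (score: 0)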
library(dplyr)
# per row, keep the first value that contains "@" (NA if there is none)
adis_sep %>% rowwise() %>%
  mutate(new = c(V1, V2, V3)[grep("@", c(V1, V2, V3))[1]])
Source: local data frame [4 x 4]
Groups: <by row>
# A tibble: 4 x 4
  V1           V2           V3    new
  <chr>        <chr>        <chr> <chr>
1 1            2            3     NA
2 asdf@com.com 5            6     asdf@com.com
3 7            sdfg@com.com 9     sdfg@com.com
4 qer@f.com    11           12    qer@f.com
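If more than one column contains @, the first one is chosen. Also, make sure to pass stringsAsFactors = FALSE to as.data.frame(), otherwise this code will not work (since R 4.0.0 this is already the default).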
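For readers who prefer not to depend on dplyr, the same first-match selection can be sketched in base R. This is only an illustration under assumptions: first_mail() is a hypothetical helper name that does not appear in the question, and the data frame is rebuilt here with stringsAsFactors = FALSE so the columns stay character.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps the columns as character
adis_sep <- as.data.frame(
  matrix(c(1:3, "asdf@com.com", 5:7, "sdfg@com.com", 9, "qer@f.com", 11, 12),
         ncol = 3, byrow = TRUE),
  stringsAsFactors = FALSE
)

# first_mail() is a hypothetical helper (not from the original post):
# it returns the first element of x that contains "@", or NA if none does
first_mail <- function(x) {
  x <- as.character(x)
  hits <- grep("@", x, fixed = TRUE)
  if (length(hits) == 0) NA_character_ else x[hits[1]]
}

# Apply the helper to every row of the three columns
adis_sep$new <- unname(apply(adis_sep[, c("V1", "V2", "V3")], 1, first_mail))
print(adis_sep)
```

Unlike the indexing trick in the answer, the helper handles the no-match case explicitly instead of relying on indexing with NA.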
Edit
For the second case:
adis_sep %>% rowwise() %>%
  mutate(new = c(Organisation, Kontaktperson, Mail, sonst1, sonst2)[
    grep("@", c(Organisation, Kontaktperson, Mail, sonst1, sonst2))[1]])
Source: local data frame [2 x 6]
Groups: <by row>
# A tibble: 2 x 6
  Organisation                     Kontaktperson Mail         sonst1          sonst2 new
  <chr>                            <chr>         <chr>        <chr>           <chr>  <chr>
1 10 Jahre xx Familienferienwochen " x y"        " adf@xx.ch" NA              NA     " adf@xx.ch"
2 50plus talk                      " adf adf"    " führerin " " info@asdf.ch" NA     " info@asdf.ch"
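The extracted values above keep their leading whitespace. If that padding is unwanted, one possible follow-up is to trim the result with trimws() and to scan every column instead of naming each one. The sketch below is an assumption-laden illustration: the input rows are rebuilt from the printed output, and adis_sep2 is a hypothetical name used so the earlier example is not overwritten.

```r
# Input rebuilt from the printed output above (an assumption about the real data)
adis_sep2 <- data.frame(
  Organisation  = c("10 Jahre xx Familienferienwochen", "50plus talk"),
  Kontaktperson = c(" x y", " adf adf"),
  Mail          = c(" adf@xx.ch", " führerin "),
  sonst1        = c(NA, " info@asdf.ch"),
  sonst2        = c(NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

# For each row: take the first cell containing "@" across all columns,
# then strip leading and trailing whitespace (NA stays NA)
adis_sep2$new <- unname(apply(adis_sep2, 1, function(x) {
  trimws(x[grep("@", x, fixed = TRUE)[1]])
}))
print(adis_sep2$new)
```

Scanning all columns avoids repeating the column names, at the cost of also matching an @ in columns that were never meant to hold addresses.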