verbdata_bkp1[1:5,2:4]
V2 V3 V4
1.content Document Not Received~2 Document not received~2 <NA>
2.content Payment Ease~1 QR~1 <NA>
3.content Payment Receipt~2 Payment Receipt~2 Payment Ease~1
4.content Surrender~1 Product Returns~1 <NA>
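5.content <NA> <NA> <NA>
So row 1 contains "Document Not Received~2" twice (the two copies differ only in letter case), and row 3 contains "Payment Receipt~2" twice. Each value should appear only once per row.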
Answer 0 (score: 0)
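One option is to loop over the rows, convert the elements to upper (or lower) case so the comparison is case-insensitive, find the repeats with duplicated, and change the duplicated values to NA.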
df1[-1] <- t(apply(df1[-1], 1, function(x)
x[NA^duplicated(toupper(x))*seq_along(x)]))
df1
# V1 V2 V3 V4
#1 1.content Document Not Received~2 <NA> <NA>
#2 2.content Payment Ease~1 QR~1 <NA>
#3 3.content Payment Receipt~2 <NA> Payment Ease~1
#4 4.content Surrender~1 Product Returns~1 <NA>
#5 5.content <NA> <NA> <NA>
Note: I did not use the first column, since it appears to be an identifier column.
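The indexing expression above is compact: NA^TRUE is NA and NA^FALSE is 1, so multiplying by seq_along(x) keeps the position of each first occurrence and turns every later duplicate into an NA index. If that is hard to read, the same idea can be sketched with a small named helper. This is only an illustration, not part of the original answer: dedupe_row and demo are hypothetical names, and demo holds just rows 1 and 3 from the question, with real NA values rather than the string "<NA>".

```r
# Hypothetical helper (not from the original answer): within a single row,
# keep the first occurrence of each value and set later case-insensitive
# duplicates to NA.
dedupe_row <- function(x) {
  x[duplicated(toupper(x))] <- NA
  x
}

# Small sample taken from rows 1 and 3 of the question (assumption: real NAs)
demo <- data.frame(
  V1 = c("1.content", "3.content"),
  V2 = c("Document Not Received~2", "Payment Receipt~2"),
  V3 = c("Document not received~2", "Payment Receipt~2"),
  V4 = c(NA, "Payment Ease~1"),
  stringsAsFactors = FALSE
)

# Skip the identifier column V1; apply() returns rows as columns, so transpose
demo[-1] <- t(apply(demo[-1], 1, dedupe_row))
demo
```

Swapping toupper for tolower gives the same result; either one only serves to make the comparison case-insensitive, and the surviving value keeps its original capitalisation.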
df1 <- structure(list(V1 = c("1.content", "2.content", "3.content",
"4.content", "5.content"), V2 = c("Document Not Received~2",
"Payment Ease~1", "Payment Receipt~2", "Surrender~1", "<NA>"),
V3 = c("Document not received~2", "QR~1", "Payment Receipt~2",
"Product Returns~1", "<NA>"), V4 = c("<NA>", "<NA>", "Payment Ease~1",
"<NA>", "<NA>")), .Names = c("V1", "V2", "V3", "V4"),
class = "data.frame", row.names = c(NA, -5L))