How to subtract a whole character vector with repeated elements from another vector in R

Asked: 2015-03-11 09:36:44

Tags: r vector subset subtraction

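I want to subtract y from x, which means removing one "A", three "B"s and one "E" from x, so that xNew will be c("A", "C", "A", "B", "D"). This also means that

length(xNew) = length(x) - length(y)

The two vectors are:

x <- c("A","A","C","A","B","B","B","B","D","E")
y <- c("A","B","B","B","E")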

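setdiff does not work, because it treats both vectors as sets and drops every value that appears in y, regardless of how many times it occurs: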

xNew <- setdiff(x,y)
xNew 
[1] "C" "D"

match does not work either:

xNew <- x[-match(y,x)]
xNew
[1] "A" "C" "A" "B" "B" "B" "D"

It removes the "B" at the fifth position three times, so three "B"s are still left.
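
Printing the indices makes the cause visible. This is just a quick check using the same x and y as in the question; the variable name idx is only for illustration:

```r
x <- c("A","A","C","A","B","B","B","B","D","E")
y <- c("A","B","B","B","E")

# match() returns the position of the FIRST occurrence in x for each element of y
idx <- match(y, x)
idx
# [1]  1  5  5  5 10

# Negative indexing only sees three distinct positions, so only three elements go
unique(idx)
# [1]  1  5 10
```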

Does anyone know how to do this? Is there a function available in R, or should we write our own function? Many thanks.

1 Answer:

Answer 0: (score: 4)

You can use the pmatch function:

x[-pmatch(y,x)]
#[1] "A" "C" "A" "B" "D"
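
The reason this works is that pmatch, with its default duplicates.ok = FALSE, uses each element of x at most once, so repeated values in y are matched to successive positions. Two edge cases are worth guarding against: pmatch returns NA for an element of y that has no match, and an empty index combined with negative subscripting would drop every element of x. The sketch below is one possible guard, not part of the original answer; the helper name subtract_once is hypothetical.

```r
x <- c("A","A","C","A","B","B","B","B","D","E")
y <- c("A","B","B","B","E")

# Each position of x is consumed by at most one element of y
pmatch(y, x)
# [1]  1  5  6  7 10

# Hypothetical helper: same idea as x[-pmatch(y, x)], with guards
subtract_once <- function(x, y) {
  idx <- pmatch(y, x)
  idx <- idx[!is.na(idx)]             # ignore elements of y that are not in x
  if (length(idx) == 0) return(x)     # x[-integer(0)] would return an empty vector
  x[-idx]
}

subtract_once(x, y)
# [1] "A" "C" "A" "B" "D"

subtract_once(x, "Z")                 # nothing to remove, x comes back unchanged
```

Note that this still relies on pmatch, so it inherits partial matching when no exact match exists.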

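Edit

If your data can contain strings longer than one character (pmatch does partial matching, so "A" could end up matching "AB"), here is one option to get what you want: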

xNew <- unlist(sapply(x[!duplicated(x)], 
                      function(item, tab1, tab2) {
                          rep(item,
                              tab1[item] - ifelse(item %in% names(tab2), tab2[item], 0))
                       }, tab1=table(x), tab2=table(y)))

Example

x <- c("AB","BA","C","CA","B","B","B","B","D","E")
y <- c("A","B","B","B","E")
xNew
#  AB   BA    C   CA    B    D 
#"AB" "BA"  "C" "CA"  "B"  "D"
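
The table-based version groups the result by value and returns a named vector, so the original order of x is not always preserved. If order matters, one alternative is a small loop that removes the first remaining exact match for each element of y. This is only a sketch and not from the original answer; the function name multiset_diff is hypothetical.

```r
# Hypothetical helper: remove one exact occurrence from x per element of y,
# keeping the remaining elements of x in their original order
multiset_diff <- function(x, y) {
  keep <- rep(TRUE, length(x))
  for (item in y) {
    hit <- which(keep & x == item)    # positions still available for this value
    if (length(hit) > 0) keep[hit[1]] <- FALSE
  }
  x[keep]
}

x <- c("AB","BA","C","CA","B","B","B","B","D","E")
y <- c("A","B","B","B","E")
multiset_diff(x, y)
# [1] "AB" "BA" "C"  "CA" "B"  "D"

# The single-character data from the question keeps its order as well
multiset_diff(c("A","A","C","A","B","B","B","B","D","E"), y)
# [1] "A" "C" "A" "B" "D"
```

Elements of y that do not occur in x (here "A") are silently ignored, which may or may not be what you want.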