Question

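I have the following data, with this expected result (for example, if any of the 3 columns contains A and none contains B, the result should be 1):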

A   B   Result
0   0   0
A   0   1
0   B   2
A   B   3

The desired output for my dataset is:

V1  V2  V3  Result
A   0   0   1
0   A   0   1
0   A   0   1
0   A   B   3
0   0   A   1
B   B   A   3
B   B   0   2
B   0   A   3

Can anyone help me with how to achieve this in R?

Answer 1

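I'm assuming your raw data are character, so you can convert them to a factor and take advantage of the fact that R maps factor levels to integers internally. Those integer codes start at 1, so you have to subtract 1 from them before summing, but here is an example of how to do it: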

# specify the order so that "0"=1L, "A"=2L, "B"=3L
levels <- c("0", "A", "B")

# sample data
df <- expand.grid(levels, levels, levels, stringsAsFactors = FALSE)

# substitute df with your data frame
columns_list <- lapply(df, function(column) {
  unclass(factor(column, levels = levels)) - 1L
})

foo <- function(...) {
  sum(unique(c(...)))
}

df$Result <- unlist(do.call(Map, c(list(f = foo), columns_list)))

head(df)

  Var1 Var2 Var3 Result
1    0    0    0      0
2    A    0    0      1
3    B    0    0      2
4    0    A    0      1
5    A    A    0      1
6    B    A    0      3
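
The mapping this answer relies on can be checked directly, and the same steps can be run on the data from the question. This is a minimal sketch, assuming the question's columns are stored as character vectors; the names `lvls`, `codes`, `dat`, and `cols` are made up for illustration and do not come from the answer.

```r
# The level order fixes the integer codes: "0" -> 1, "A" -> 2, "B" -> 3,
# so subtracting 1 gives the values 0, 1, 2 used in the question.
lvls <- c("0", "A", "B")
codes <- as.integer(factor(c("0", "A", "B", "A"), levels = lvls)) - 1L
codes

# The question's data, typed in by hand (assumed to be character columns)
dat <- data.frame(
  V1 = c("A", "0", "0", "0", "0", "B", "B", "B"),
  V2 = c("0", "A", "A", "A", "0", "B", "B", "0"),
  V3 = c("0", "0", "0", "B", "A", "A", "0", "A"),
  stringsAsFactors = FALSE
)

# Same steps as in the answer: recode each column, then sum the unique codes per row
cols <- lapply(dat, function(column) as.integer(factor(column, levels = lvls)) - 1L)
dat$Result <- unlist(do.call(Map, c(list(f = function(...) sum(unique(c(...)))), cols)))
dat$Result
```

Taking `unique()` before `sum()` is what keeps a row such as `B B 0` at 2 rather than 4: each distinct value contributes only once.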

Answer 2

One option is to relabel c("0","A","B") as c("0","1","2") and then use apply to sum the unique values in each row.

df$Result <- apply(df, 1, function(x){
   sum(as.numeric(as.character(factor(unique(x),  levels = c("0","A","B"),
                               labels = c("0", "1", "2")))))
})

#Result

df
#   V1 V2 V3 Result
# 1  0  0  0      0
# 2  A  0  0      1
# 3  0  A  0      1
# 4  0  A  0      1
# 5  0  A  B      3
# 6  0  0  A      1
# 7  B  B  A      3
# 8  B  B  0      2
# 9  B  0  A      3

Data:

df <- read.table(text = "V1 V2 V3
0 0 0
A 0 0
0 A 0
0 A 0
0 A B
0 0 A
B B A
B B 0
B 0 A", header = TRUE, stringsAsFactors = FALSE)
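
For comparison, the rule in the question can also be read as "add 1 if any column is A, add 2 if any column is B", which allows a version that does not loop over rows. This is only a sketch under the assumption that the columns contain nothing but "0", "A", and "B"; the names `vals`, `has_A`, and `has_B` are illustrative, and `colClasses = "character"` is added here so that an all-zero column would not be read as numeric.

```r
# Sample data from this answer, one row per line
df <- read.table(text = "V1 V2 V3
0 0 0
A 0 0
0 A 0
0 A 0
0 A B
0 0 A
B B A
B B 0
B 0 A", header = TRUE, colClasses = "character")

# Compare the whole character matrix at once, then count matches per row
vals <- as.matrix(df[c("V1", "V2", "V3")])
has_A <- rowSums(vals == "A") > 0   # TRUE if the row contains at least one "A"
has_B <- rowSums(vals == "B") > 0   # TRUE if the row contains at least one "B"

# A contributes 1 and B contributes 2, matching the lookup table in the question
df$Result <- has_A * 1L + has_B * 2L
df$Result
```

This should give the same Result column as the apply version on this data, but it would need to be extended by hand if more distinct values were added.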
