Question

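I'd like to know whether there is a clever way to count the occurrences of semicolon-separated values within a column and across all rows.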

Sample data and expected output

Answer 1

Here is a tidyverse approach:

library(tidyverse)

# example data
df1 <- data.frame(var1 = c(2, 4, 3, 5),
                  var2 = c("3;5;2;0;1", "2;3;8;5", "9;6;2", "8;5;4;7;0;1"),
                  stringsAsFactors = FALSE)

df1 %>%
  separate_rows(var2, sep = ";") %>%  # split each value onto its own row
  filter(var2 %in% df1$var1) %>%      # keep only values that appear in var1
  count(var2)                         # count each remaining value

# # A tibble: 4 x 2
#   var2      n
#   <chr> <int>
# 1 2         3
# 2 3         2
# 3 4         1
# 4 5         3

And a base R approach:

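# split every string on ";" and flatten the pieces into one character vector
v <- unlist(strsplit(df1$var2, ";", fixed = TRUE))
# keep only the values that appear in var1, then tabulate them
data.frame(table(v[v %in% df1$var1]))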

#   Var1 Freq
# 1    2    3
# 2    3    2
# 3    4    1
# 4    5    3
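
Both approaches above drop any var1 value that never occurs in var2. If a zero count should be reported in that case, one possible variation is to tabulate against a factor whose levels are the var1 values. This is only a sketch: `count_matches` is a hypothetical helper name, not part of base R or the tidyverse, and the output column names `value` and `n` are my own choice.

```r
# example data, same as in the answer above
df1 <- data.frame(var1 = c(2, 4, 3, 5),
                  var2 = c("3;5;2;0;1", "2;3;8;5", "9;6;2", "8;5;4;7;0;1"),
                  stringsAsFactors = FALSE)

# count_matches() is a hypothetical helper written for this illustration.
# keys:      the values to look for (here var1)
# delimited: character vector of delimiter-separated values (here var2)
count_matches <- function(keys, delimited, sep = ";") {
  # split every string and flatten the pieces into one character vector
  tokens <- unlist(strsplit(delimited, sep, fixed = TRUE))
  # tokens that are not among the levels become NA and are dropped by table();
  # levels with no matching token are kept with a count of 0
  tab <- table(factor(tokens, levels = sort(unique(keys))))
  data.frame(value = names(tab), n = as.integer(tab), stringsAsFactors = FALSE)
}

result <- count_matches(df1$var1, df1$var2)
print(result)
```

On the sample data this gives the same counts as the two approaches above, since every var1 value occurs at least once. The difference only shows up for a key that is absent from var2, for example `count_matches(c(2, 10), df1$var2)`, where the key 10 is listed with a count of 0.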
