Question

I have a CSV file in which two columns contain one or more integers per cell.

df <- data.frame(x=c("a","b","a","b"), 
y=c("datatype 1","datatype 1","datatype 2", "datatype 2"), 
z=c("2,3", "1,2","1,2,3,4,5", "3"))

names(df) <- c("hypothesis", "type", "mass") 

> df
  hypothesis       type      mass
1          a datatype 1       2,3
2          b datatype 1       1,2
3          a datatype 2 1,2,3,4,5
4          b datatype 2         3

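I want to extract those integers from the .csv as vectors and assign them in my code to the variables x (datatype 1, hypothesis a) and y (datatype 2, hypothesis a).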

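Right now, I am using subset to filter the table by "type" (column 2) and which to filter by "hypothesis" (column 1), to get the corresponding "mass" values I need. In the next step, I want to use intersect to find out which elements are shared by the x and y variables.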

My question is: how do I pass the contents of a .csv cell (such as "1,2,3") to the intersect function as a vector?

When I access the cell, typeof returns integer, and applying intersect returns character(0). When I assign x <- c(1,2,3,4,5); y <- c(2,3) manually, the result is, as it should be, 2 3.

Answer 1

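We can split 'mass' by 'type', split the strings with strsplit, unlist the result, convert it to numeric, keep the unique elements, and apply intersect to find the elements common to the list elements.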

lst <- setNames(lapply(split(df$mass, df$type), function(x) 
       sort(unique(as.numeric(unlist(strsplit(as.character(x), ",")))))), c("x", "y"))

Reduce(intersect, lst)
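
Note that the code above pools both hypotheses within each type, so for the example data it returns 1 2 3. The question asked for hypothesis a only; a minimal sketch of that variant follows. The helper name parse_mass is hypothetical and not part of the original answer, and the as.character() call assumes the column may have been read in as a factor, which would explain the typeof of integer reported in the question.

```r
# Rebuild the example data from the question.
df <- data.frame(
  hypothesis = c("a", "b", "a", "b"),
  type       = c("datatype 1", "datatype 1", "datatype 2", "datatype 2"),
  mass       = c("2,3", "1,2", "1,2,3,4,5", "3")
)

# parse_mass is a hypothetical helper: it turns cells such as "1,2,3"
# into one numeric vector. as.character() guards against factor columns.
parse_mass <- function(cells) {
  as.numeric(unlist(strsplit(as.character(cells), ",")))
}

# Select the cells for hypothesis a within each data type.
x <- parse_mass(df$mass[df$type == "datatype 1" & df$hypothesis == "a"])
y <- parse_mass(df$mass[df$type == "datatype 2" & df$hypothesis == "a"])

intersect(x, y)
# [1] 2 3
```

Filtering with a logical index keeps the selection of rows and the parsing of cells as separate steps, so the same helper can be reused for hypothesis b or for other columns.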

Create vector from LookUp-Table / CSV-File in R
