Question

I have a data frame with two columns, for example:

> df
Col1     Col2
1,2,3    1
2,3      2
3        3
4        3
4,5      123
12       0
32       0

> class(df$Col1)
[1] "list"
> class(df$Col2)
[1] "integer"

> dput(df)
structure(list(Col1 = list(c(1,2,3), c(2,3), 3, 4, c(4,5), 12, 32), 
               Col2 = c(1L, 2L, 3L, 3L, 123L, 0L, 0L),..., class=data.frame)

How can I create a subset of df that keeps only the rows where Col1 contains 2 or 3? I.e.:

> dfss
Col1    Col2
1,2,3   1
2,3     2
3       3

Thanks.

Answer 1

> df[ sapply(df$Col1,function(x) any(2:3 %in% x)) , ]
     Col1 Col2
1 1, 2, 3    1
2    2, 3    2
3       3    3
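
For anyone who wants to reproduce this, here is a minimal sketch. The dput output in the question is abridged, so the way df is built below is an assumption on my part (assigning a list with `$<-` to get a plain list column). It also swaps sapply for vapply so the filter is guaranteed to be a logical vector; that swap is my choice, not part of the answer above.

```r
# Assumption: the question's abridged dput does not show how df was built,
# so the list column is attached after creating the data frame.
df <- data.frame(Col2 = c(1L, 2L, 3L, 3L, 123L, 0L, 0L))
df$Col1 <- list(c(1, 2, 3), c(2, 3), 3, 4, c(4, 5), 12, 32)
df <- df[c("Col1", "Col2")]  # restore the column order from the question

# One TRUE/FALSE per row: does the Col1 element contain 2 or 3?
keep <- vapply(df$Col1, function(x) any(c(2, 3) %in% x), logical(1))

dfss <- df[keep, ]
```

Because %in% compares whole values, the rows holding 12 and 32 are not picked up.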

Answer 2

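# \\b word boundaries keep 12 and 32 from matching; this relies on grepl
# coercing the list column to strings with as.character
dfss <- df[ grepl("\\b[23]\\b" , df[ ,1]) , ]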

Answer 3

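In short: if you are working with strings, take a look at subset and grepl.

Here is how I read the data, assuming the values are strings; otherwise you can drop stringsAsFactors and coerce the column to whatever type you want.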

> df <- read.table(text="
+ 1,2,3    1
+ 2,3      2
+ 3        3
+ 4        3
+ 4,5      123", col.names = c("Col1", "Col2"), stringsAsFactors=FALSE)
> # This is the key
> result <- subset(df, grepl("2", Col1) | grepl("3", Col1)) 
> df
   Col1 Col2
1 1,2,3    1
2   2,3    2
3     3    3
4     4    3
5   4,5  123
> result
   Col1 Col2
1 1,2,3    1
2   2,3    2
3     3    3
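
One caveat with the grepl approach, as far as I can tell: grepl("2", ...) is a substring match, so it would also flag values such as 12 or 32 from the original question. A minimal sketch of an alternative, assuming Col1 holds comma-separated strings as in the read.table call above, splits each string and tests for exact membership. The has_any helper name is mine, purely for illustration.

```r
# Same shape as the data above, plus the 12 and 32 rows from the question
df <- data.frame(
  Col1 = c("1,2,3", "2,3", "3", "4", "4,5", "12", "32"),
  Col2 = c(1L, 2L, 3L, 3L, 123L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Substring matching: also TRUE for "12" and "32"
naive <- grepl("2", df$Col1) | grepl("3", df$Col1)

# has_any() is a hypothetical helper: split on commas, then compare whole values
has_any <- function(s, wanted) {
  parts <- strsplit(s, ",", fixed = TRUE)
  vapply(parts, function(p) any(p %in% wanted), logical(1))
}

exact <- has_any(df$Col1, c("2", "3"))
result <- df[exact, ]
```

Here naive selects five rows while exact selects only the first three, which is what the question asks for.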

Finding rows in a data frame via a truth test on a list-like column in R
