I have a data frame (df) made up of strings and lists of strings (character). Consider the following example:
Name Nationality
'Alice' "USA"
'Bob' "MEX"
'Eve' c("USA", "MEX")
That is:
> dput(df)
structure(list(Name = c("Alice", "Bob", "Eve"), Nationality = list( "USA", "MEX", c("USA", "MEX"))), .Names = c("Name", "Nationality"), row.names = c(1L, 2L, 3L), class = "data.frame")
How can I extract all rows that have at least "MEX" as a nationality?
Expected output:
Name Nationality
'Bob' "MEX"
'Eve' c("USA", "MEX")
Edit: I tried:
df[df$Nationality == "MEX", ]
but it only returns Bob.
df[df$Nationality %in% "MEX",]
also returns only Bob (same with ... %in% c("MEX"),]).
df["MEX" %in% df$Nationality,]
returns all rows, just like
df[is.element("MEX", df$Nationality),]
Only
df[grep("MEX", df$Nationality), ]
is working...
Answer 0 (score: 1)
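The "Nationality" column is a list of length 3. So we can loop over the column and check whether "MEX" is %in% any of the values of each list element. This gives a logical vector, which we then use to subset the rows: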
df[sapply(lapply(df$Nationality, `%in%`, "MEX"), any),]
# Name Nationality
#2 Bob MEX
#3 Eve USA, MEX
It can also be simplified to
df[sapply(df$Nationality, function(x) "MEX" %in% x),]
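For reference, here is a self-contained sketch of the same idea that rebuilds df from the dput() output in the question. It swaps sapply for vapply so that the result is always a logical vector; this is a substitution made here, not part of the original answer, and has_mex and res are names introduced only for illustration.

```r
# Rebuild the example data frame from the question's dput() output
df <- structure(
  list(Name = c("Alice", "Bob", "Eve"),
       Nationality = list("USA", "MEX", c("USA", "MEX"))),
  row.names = c(1L, 2L, 3L), class = "data.frame")

# has_mex (hypothetical name): one logical per row, TRUE where that
# row's nationality vector contains "MEX"
has_mex <- vapply(df$Nationality, function(x) "MEX" %in% x, logical(1))

# Keep only the matching rows
res <- df[has_mex, ]
res$Name
```

Unlike the grep approach from the question, this compares whole strings, so it should not match a nationality code that merely contains "MEX" as a substring.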
Answer 1 (score: 0)
df["MEX" %in% df$Nationality,]
or
df[is.element("MEX", df$Nationality),]
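Note that, as already reported in the question, both of these return every row. A short check suggests why, assuming df is the data frame from the question: the test treats the whole list column as the lookup table, so it yields a single TRUE, which is then recycled across all rows. The name single is introduced here only for illustration.

```r
# Rebuild the example data frame from the question's dput() output
df <- structure(
  list(Name = c("Alice", "Bob", "Eve"),
       Nationality = list("USA", "MEX", c("USA", "MEX"))),
  row.names = c(1L, 2L, 3L), class = "data.frame")

# "MEX" is looked up in the column as a whole, so the result has
# length 1 rather than one value per row
single <- "MEX" %in% df$Nationality
length(single)

# A length-1 TRUE index is recycled, so every row is selected
nrow(df[single, ])
```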