Question

I have several data frames:

df1: 
c1  (a,b,c) c2 (1,2,5) 

df2:

c1 (d,e,f)  c2 (4,7,10)

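Another data frame, df3, has c1: (1,3,7,9,11) (this will always be sorted).

I need a new column in df1 and in df2 (the names df1 and df2 will be stored in a loop variable). For each row, it should be the smallest element of df3 that is greater than or equal to the corresponding c2 value of df1 or df2.

For example, for df1, c3 would be (1,3,7).

How do I add a new column when the data frame name is held in a variable?
And is there a vectorized version of min(which(df3$c1 >= df1$c2))?

I think R cannot vectorize the second expression correctly, because it involves two arrays of different lengths.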

Answer 1

Not sure whether this helps:

df1 <- data.frame(c1=letters[1:3], c2=c(1,2,5), stringsAsFactors=FALSE)
df2 <- data.frame(c1=letters[4:6], c2=c(4,7,10), stringsAsFactors=FALSE)
df3 <- data.frame(c1=c(1,3,7,9,11))


df1$newCol <- apply(Vectorize(function(x) x>=df1$c2)(df3$c1),1, function(i) min(df3$c1[i]))

 df1
 #   c1 c2 newCol
 # 1  a  1      1
 # 2  b  2      3
 # 3  c  5      7
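
To see what the `Vectorize(...)` call builds, the same comparison can be sketched with `outer()`. This is an equivalent illustration rather than the code used above; the names `m` and `res` are made up for the example. Rows correspond to the rows of df1 and columns to the values of df3$c1.

```r
df1 <- data.frame(c1 = letters[1:3], c2 = c(1, 2, 5), stringsAsFactors = FALSE)
df3 <- data.frame(c1 = c(1, 3, 7, 9, 11))

# m[i, j] is TRUE when df3$c1[j] >= df1$c2[i], giving a 3 x 5 logical matrix
m <- outer(df1$c2, df3$c1, function(a, b) b >= a)

# For each row, keep the df3 values flagged TRUE and take the smallest one
res <- apply(m, 1, function(i) min(df3$c1[i]))
res
# [1] 1 3 7
```

Note that if some c2 value is larger than every element of df3$c1, `min()` is called on an empty vector and returns `Inf` with a warning.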


df2$newCol <- apply(Vectorize(function(x) x>=df2$c2)(df3$c1),1, function(i) min(df3$c1[i]))
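
Since df3$c1 is stated to be always sorted, a fully vectorized lookup might also be possible with `findInterval()`. The following is only a sketch under that assumption; `ceil_lookup` is a hypothetical helper name, and the `left.open` argument requires R 3.3.0 or later.

```r
df1 <- data.frame(c1 = letters[1:3], c2 = c(1, 2, 5), stringsAsFactors = FALSE)
df2 <- data.frame(c1 = letters[4:6], c2 = c(4, 7, 10), stringsAsFactors = FALSE)
df3 <- data.frame(c1 = c(1, 3, 7, 9, 11))

# Hypothetical helper: smallest element of sorted_vals that is >= each x.
# Assumes sorted_vals is sorted in increasing order.
ceil_lookup <- function(x, sorted_vals) {
  # With left.open = TRUE, findInterval() counts the values strictly below x,
  # so adding 1 gives the position of the first value >= x.
  idx <- findInterval(x, sorted_vals, left.open = TRUE) + 1L
  # An index past the end yields NA, meaning no element is large enough
  sorted_vals[idx]
}

df1$newCol <- ceil_lookup(df1$c2, df3$c1)  # 1 3 7
df2$newCol <- ceil_lookup(df2$c2, df3$c1)  # 7 7 11
```

Unlike the `min()` approach, this returns `NA` rather than `Inf` when no element of df3$c1 is large enough, and it avoids building the full comparison matrix.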

If the name df1 is stored in a variable:

x <- "df1"

apply(Vectorize(function(y) y>= get(x)$c2)(df3$c1), 1, function(i) min(df3$c1[i]))
#[1] 1 3 7

Update

assign(x, `[[<-`(get(x), 'c3', value=apply(Vectorize(function(y) y>= get(x)$c2)(df3$c1), 1, function(i) min(df3$c1[i]))))
get(x)
#  c1 c2 c3
# 1  a  1  1
# 2  b  2  3
# 3  c  5  7
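
The `get()`/`assign()` pattern above can be extended to a loop over several names. The sketch below assumes the names sit in a character vector called `df_names` (a made-up name) and uses a simpler per-value lookup with `which()` in place of the `Vectorize()` call.

```r
df1 <- data.frame(c1 = letters[1:3], c2 = c(1, 2, 5), stringsAsFactors = FALSE)
df2 <- data.frame(c1 = letters[4:6], c2 = c(4, 7, 10), stringsAsFactors = FALSE)
df3 <- data.frame(c1 = c(1, 3, 7, 9, 11))

df_names <- c("df1", "df2")  # hypothetical loop variable holding the names

for (nm in df_names) {
  d <- get(nm)  # fetch the data frame by name
  # For each c2 value, take the first (smallest, since df3$c1 is sorted)
  # element of df3$c1 that is >= that value; NA if there is none
  d$c3 <- sapply(d$c2, function(v) df3$c1[which(df3$c1 >= v)[1]])
  assign(nm, d)  # write the modified copy back under the same name
}

df1$c3  # 1 3 7
df2$c3  # 7 7 11
```

If the data frames can be kept in a list instead (for example via `mget(df_names)`), `lapply()` over that list would avoid `get()` and `assign()` altogether.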
