I want to divide each column of my data frame "data" by each column of another data frame named "benchmark". However, I get different results when using lapply than when dividing manually. Where is the error in my code?
The code I used is:
{{1}}
For the first two columns, this gives me the following result...
   div.A.B1  div.A.B2
1  0.7200000 0.8000000
2  0.7422680 0.8163265
3  0.7346939 0.8080808
4  0.7422680 0.8333333
5  0.7578947 0.8510638
6  0.7741935 0.8695652
7  0.7826087 0.8510638
8  0.7826087 0.8602151
9  0.7912088 0.8791209
10 0.8181818 0.8791209
...whereas dividing the first column of "data" by the first two columns of "benchmark" manually gives me:
   A.B1      A.B2
1  0.7200000 0.7200000
2  0.8247423 0.8163265
3  0.7653061 0.7575758
4  0.7525773 0.7604167
5  0.9473684 0.9574468
6  0.8709677 0.8804348
7  0.8804348 0.8617021
8  0.9347826 0.9247312
9  1.0989011 1.0989011
10 0.9090909 0.8791209
Some example data for "data":
   A
1  72
2  80
3  75
4  73
5  90
6  81
7  81
8  86
9  100
10 80
and "benchmark":
   B1  B2
1  100 100
2  97  98
3  98  99
4  97  96
5  95  94
6  93  92
7  92  94
8  92  93
9  91  91
10 88  91
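For reference, the manual result quoted in the question (columns A.B1 and A.B2) can be reproduced by dividing the single column A by the whole benchmark data frame. This is a minimal sketch that rebuilds both data frames from the example values in the question; the name `manual` is only illustrative, and it does not attempt to reconstruct the asker's lapply code.

```r
# Example values copied from the question; `data` has the single column A.
data <- data.frame(A = c(72, 80, 75, 73, 90, 81, 81, 86, 100, 80))
benchmark <- data.frame(
  B1 = c(100, 97, 98, 97, 95, 93, 92, 92, 91, 88),
  B2 = c(100, 98, 99, 96, 94, 92, 94, 93, 91, 91)
)

# A numeric vector divided by a data frame is applied to every column,
# row by row, so this yields A / B1 and A / B2 in one step.
manual <- data$A / benchmark
names(manual) <- paste("A", names(benchmark), sep = ".")
manual
```

Any approach that is correct for this data should agree with `manual` for the first column of "data".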
Answer 0 (score: 2)
You can use outer:
data <- read.table(text = " A1 A2
1 72 11
2 80 20
3 75 15
4 73 17
5 90 13
6 81 18
7 81 22
8 86 30
9 100 20
10 80 22", header = TRUE)
benchmark <- read.table(text = " B1 B2
1 100 100
2 97 98
3 98 99
4 97 96
5 95 94
6 93 92
7 92 94
8 92 93
9 91 91
10 88 91", header = TRUE)
res <- outer(seq_along(data), seq_along(benchmark),
function(i, j, DF1, DF2) DF1[,i] / DF2[, j],
DF1 = data, DF2 = benchmark)
names(res) <- outer(names(data), names(benchmark), paste, sep = ".")
# A1.B1 A2.B1 A1.B2 A2.B2
#1 0.7200000 0.1100000 0.7200000 0.1100000
#2 0.8247423 0.2061856 0.8163265 0.2040816
#3 0.7653061 0.1530612 0.7575758 0.1515152
#4 0.7525773 0.1752577 0.7604167 0.1770833
#5 0.9473684 0.1368421 0.9574468 0.1382979
#6 0.8709677 0.1935484 0.8804348 0.1956522
#7 0.8804348 0.2391304 0.8617021 0.2340426
#8 0.9347826 0.3260870 0.9247312 0.3225806
#9 1.0989011 0.2197802 1.0989011 0.2197802
#10 0.9090909 0.2500000 0.8791209 0.2417582
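The outer call above works because outer expands the two index vectors to every (i, j) pair and then calls the function once with those expanded vectors. The sketch below spells that expansion out by hand, as an illustration rather than a replacement; the helper names `i` and `j` are assumptions made for this example, and the data frames are rebuilt inline so the snippet runs on its own.

```r
data <- data.frame(
  A1 = c(72, 80, 75, 73, 90, 81, 81, 86, 100, 80),
  A2 = c(11, 20, 15, 17, 13, 18, 22, 30, 20, 22)
)
benchmark <- data.frame(
  B1 = c(100, 97, 98, 97, 95, 93, 92, 92, 91, 88),
  B2 = c(100, 98, 99, 96, 94, 92, 94, 93, 91, 91)
)

# Expand the column indices the same way outer() does:
# the data index varies fastest, the benchmark index slowest.
i <- rep(seq_along(data), times = ncol(benchmark))  # 1 2 1 2
j <- rep(seq_along(benchmark), each = ncol(data))   # 1 1 2 2

# Two equally sized data frames are divided column by column, by position.
res <- data[i] / benchmark[j]
names(res) <- paste(names(data)[i], names(benchmark)[j], sep = ".")
res
```

The column order (A1.B1, A2.B1, A1.B2, A2.B2) matches the output shown above.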
Answer 1 (score: 2)
How about dividing a column by the data frame directly, as in df1$mpg/df2? See the example:
#dummy data
df1 <- mtcars[1:5, 1, drop = FALSE]
df2 <- mtcars[1:5, 4:6]
df1; df2
# mpg
# Mazda RX4 21.0
# Mazda RX4 Wag 21.0
# Datsun 710 22.8
# Hornet 4 Drive 21.4
# Hornet Sportabout 18.7
# hp drat wt
# Mazda RX4 110 3.90 2.620
# Mazda RX4 Wag 110 3.90 2.875
# Datsun 710 93 3.85 2.320
# Hornet 4 Drive 110 3.08 3.215
# Hornet Sportabout 175 3.15 3.440
df1$mpg/df2
# hp drat wt
# Mazda RX4 0.1909091 5.384615 8.015267
# Mazda RX4 Wag 0.1909091 5.384615 7.304348
# Datsun 710 0.2451613 5.922078 9.827586
# Hornet 4 Drive 0.1945455 6.948052 6.656299
# Hornet Sportabout 0.1068571 5.936508 5.436047
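Answer 2 (score: 0)
I think you might want to try purrr, which has functions for mapping over several lists at once, and that helps in this situation. In this case, you could use something like: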
map2_df(data, benchmark, ~.x / .y)
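Note that map2-style functions walk their two inputs in parallel, so this pairs the first column of data with the first column of benchmark, the second with the second, and so on, rather than forming every combination. The sketch below shows the same pairing with base R's Map so that it runs without purrr; the name `paired` and the inline data frames are assumptions for illustration.

```r
data <- data.frame(
  A1 = c(72, 80, 75, 73, 90, 81, 81, 86, 100, 80),
  A2 = c(11, 20, 15, 17, 13, 18, 22, 30, 20, 22)
)
benchmark <- data.frame(
  B1 = c(100, 97, 98, 97, 95, 93, 92, 92, 91, 88),
  B2 = c(100, 98, 99, 96, 94, 92, 94, 93, 91, 91)
)

# Map() iterates over both data frames column by column, in parallel:
# A1 / B1 and A2 / B2 only. A1 / B2 and A2 / B1 are not computed.
paired <- as.data.frame(Map(`/`, data, benchmark))
names(paired) <- paste(names(data), names(benchmark), sep = ".")
paired
```

If every combination of columns is needed, one of the other answers is a better fit.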
Answer 3 (score: 0)
You can try:
A <- data; B <- benchmark
matrix(apply(A, 2, function(x, y) apply(y, 2, function(z, x) x/z, x), B), nrow(A), ncol(A)*ncol(B), byrow = FALSE)
[,1] [,2]
[1,] 0.7200000 0.7200000
[2,] 0.8247423 0.8163265
[3,] 0.7653061 0.7575758
[4,] 0.7525773 0.7604167
[5,] 0.9473684 0.9574468
[6,] 0.8709677 0.8804348
[7,] 0.8804348 0.8617021
[8,] 0.9347826 0.9247312
[9,] 1.0989011 1.0989011
[10,] 0.9090909 0.8791209
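The idea behind this is two nested apply calls. The matrix() function then reshapes the result into the appropriate dimensions.
Or, using Roland's data. Note that the column order is A1B1, A1B2, A2B1, A2B2:
matrix(apply(data, 2, function(x, y) apply(y, 2, function(z, x) x/z, x), benchmark), nrow(data), ncol(data)*ncol(benchmark), byrow = FALSE)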
[,1] [,2] [,3] [,4]
[1,] 0.7200000 0.7200000 0.1100000 0.1100000
[2,] 0.8247423 0.8163265 0.2061856 0.2040816
[3,] 0.7653061 0.7575758 0.1530612 0.1515152
[4,] 0.7525773 0.7604167 0.1752577 0.1770833
[5,] 0.9473684 0.9574468 0.1368421 0.1382979
[6,] 0.8709677 0.8804348 0.1935484 0.1956522
[7,] 0.8804348 0.8617021 0.2391304 0.2340426
[8,] 0.9347826 0.9247312 0.3260870 0.3225806
[9,] 1.0989011 1.0989011 0.2197802 0.2197802
[10,] 0.9090909 0.8791209 0.2500000 0.2417582
Or, combining this with zx8754's answer gives a list of quotients that can be bound together with do.call:
do.call("cbind", apply(data, 2, function(x,y) x/y, benchmark))
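To make the shape of that result concrete, here is a closely related sketch that swaps apply for lapply, which iterates over the columns of a data frame without first converting it to a matrix. The names `parts` and `res` and the inline data frames are assumptions for illustration.

```r
data <- data.frame(
  A1 = c(72, 80, 75, 73, 90, 81, 81, 86, 100, 80),
  A2 = c(11, 20, 15, 17, 13, 18, 22, 30, 20, 22)
)
benchmark <- data.frame(
  B1 = c(100, 97, 98, 97, 95, 93, 92, 92, 91, 88),
  B2 = c(100, 98, 99, 96, 94, 92, 94, 93, 91, 91)
)

# One data frame of quotients per column of `data`, kept in a named list.
parts <- lapply(data, function(col) col / benchmark)

# cbind() on the named list prefixes each block with its list name.
res <- do.call(cbind, parts)
names(res)
```

The columns come out grouped by the column of data, in the same A1B1, A1B2, A2B1, A2B2 order noted above.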
Answer 4 (score: 0)
Here is a solution using expand.grid:
e <- do.call(expand.grid, list(1:ncol(data),1:ncol(benchmark)))
# e will give you all possible permutations of columns on which you can apply division
# Var1 Var2
# 1 1 1
# 2 2 1
# 3 1 2
# 4 2 2
r <- apply(e, 1, function(x) data[,x[1]]/benchmark[,x[2]])
# to make descriptive column names for r
colnames(r) <- apply(expand.grid(names(data), names(benchmark)), 1, paste, collapse="/")
# A1/B1 A2/B1 A1/B2 A2/B2
# [1,] 0.7200000 0.1100000 0.7200000 0.1100000
# [2,] 0.8247423 0.2061856 0.8163265 0.2040816
# [3,] 0.7653061 0.1530612 0.7575758 0.1515152
# [4,] 0.7525773 0.1752577 0.7604167 0.1770833
# [5,] 0.9473684 0.1368421 0.9574468 0.1382979
# [6,] 0.8709677 0.1935484 0.8804348 0.1956522
# [7,] 0.8804348 0.2391304 0.8617021 0.2340426
# [8,] 0.9347826 0.3260870 0.9247312 0.3225806
# [9,] 1.0989011 0.2197802 1.0989011 0.2197802
# [10,] 0.9090909 0.2500000 0.8791209 0.2417582
Data
data <- structure(list(A1 = c(72L, 80L, 75L, 73L, 90L, 81L, 81L, 86L,
100L, 80L), A2 = c(11L, 20L, 15L, 17L, 13L, 18L, 22L, 30L, 20L,
22L)), .Names = c("A1", "A2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
benchmark <- structure(list(B1 = c(100L, 97L, 98L, 97L, 95L, 93L, 92L, 92L,
91L, 88L), B2 = c(100L, 98L, 99L, 96L, 94L, 92L, 94L, 93L, 91L,
91L)), .Names = c("B1", "B2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))