I have two data frames, df1 and df2, and in Excel I apply formulas to them as part of an item-selection process.
> df1 <- read.csv("C:/Users/model/df1.csv")
> df1
YEAR BAG_1 BAG_2 ITEMS
31-Dec-12 1 1 1230438.453
31-Dec-12 1 2 24327.087
31-Dec-12 1 3 8962.611
31-Dec-12 1 4 3841.119
31-Dec-12 1 5 12803.73
31-Dec-12 2 1 12670.095
31-Dec-12 2 2 342.435
31-Dec-12 2 3 296.777
31-Dec-12 2 4 136.974
31-Dec-12 2 5 9382.719
31-Dec-12 3 1 4493.741
31-Dec-12 3 2 214.718
31-Dec-12 3 3 184.044
31-Dec-12 3 4 92.022
31-Dec-12 3 5 10352.475
31-Dec-12 4 1 1517.586
31-Dec-12 4 2 160.242
31-Dec-12 4 3 122.538
31-Dec-12 4 4 18.852
31-Dec-12 4 5 7606.782
31-Dec-12 5 1 0
31-Dec-12 5 2 0
31-Dec-12 5 3 0
31-Dec-12 5 4 0
31-Dec-12 5 5 17084
31-Dec-13 1 1 1172535.32
31-Dec-13 1 2 16215.914
31-Dec-13 1 3 8731.646
31-Dec-13 1 4 7484.268
31-Dec-13 1 5 42410.852
31-Dec-13 2 1 15279.943
31-Dec-13 2 2 442.096
31-Dec-13 2 3 303.941
31-Dec-13 2 4 138.155
31-Dec-13 2 5 11466.865
31-Dec-13 3 1 4801.223
31-Dec-13 3 2 217.477
31-Dec-13 3 3 150.561
31-Dec-13 3 4 16.729
31-Dec-13 3 5 11543.01
31-Dec-13 4 1 2289.504
31-Dec-13 4 2 177.164
31-Dec-13 4 3 149.908
31-Dec-13 4 4 40.884
31-Dec-13 4 5 10970.54
31-Dec-13 5 1 0
31-Dec-13 5 2 0
31-Dec-13 5 3 0
31-Dec-13 5 4 0
31-Dec-13 5 5 21952
31-Dec-14 1 1 1160393.766
31-Dec-14 1 2 15829.086
31-Dec-14 1 3 8523.354
31-Dec-14 1 4 4870.488
31-Dec-14 1 5 28005.306
31-Dec-14 2 1 15095.349
31-Dec-14 2 2 461.808
31-Dec-14 2 3 202.041
31-Dec-14 2 4 144.315
31-Dec-14 2 5 12959.487
31-Dec-14 3 1 5331.848
31-Dec-14 3 2 324.234
31-Dec-14 3 3 162.117
31-Dec-14 3 4 108.078
31-Dec-14 3 5 12086.723
31-Dec-14 4 1 1810.35
31-Dec-14 4 2 174.33
31-Dec-14 4 3 120.69
31-Dec-14 4 4 13.41
31-Dec-14 4 5 11291.22
31-Dec-14 5 1 0
31-Dec-14 5 2 0
31-Dec-14 5 3 0
31-Dec-14 5 4 0
31-Dec-14 5 5 24210
31-Dec-15 1 1 1195886.146
31-Dec-15 1 2 17642.156
31-Dec-15 1 3 10081.232
31-Dec-15 1 4 6300.77
31-Dec-15 1 5 30243.696
31-Dec-15 2 1 15146.97
31-Dec-15 2 2 419.916
31-Dec-15 2 3 209.958
31-Dec-15 2 4 59.988
31-Dec-15 2 5 14157.168
31-Dec-15 3 1 4893.72
31-Dec-15 3 2 266.645
31-Dec-15 3 3 172.535
31-Dec-15 3 4 31.37
31-Dec-15 3 5 10320.73
31-Dec-15 4 1 1722.034
31-Dec-15 4 2 169.778
31-Dec-15 4 3 109.143
31-Dec-15 4 4 72.762
31-Dec-15 4 5 10053.283
31-Dec-15 5 1 0
31-Dec-15 5 2 0
31-Dec-15 5 3 0
31-Dec-15 5 4 0
31-Dec-15 5 5 23566
31-Dec-16 1 1 1160252.431
31-Dec-16 1 2 27241.786
31-Dec-16 1 3 16097.419
31-Dec-16 1 4 12382.63
31-Dec-16 1 5 23526.997
31-Dec-16 2 1 16477.812
31-Dec-16 2 2 2917.278
31-Dec-16 2 3 1442.61
31-Dec-16 2 4 1250.262
31-Dec-16 2 5 10002.096
31-Dec-16 3 1 5474.862
31-Dec-16 3 2 815.028
31-Dec-16 3 3 921.336
31-Dec-16 3 4 637.848
31-Dec-16 3 5 9851.208
31-Dec-16 4 1 2300.886
31-Dec-16 4 2 383.481
31-Dec-16 4 3 326.669
31-Dec-16 4 4 426.09
31-Dec-16 4 5 10765.874
31-Dec-16 5 1 0
31-Dec-16 5 2 0
31-Dec-16 5 3 0
31-Dec-16 5 4 0
31-Dec-16 5 5 30662
I need to convert these Excel item-selection formulas to R code.
> df2 <- read.csv("C:/Users/model/df2.csv")
> df2
CurrentYEAR BAG_1 BAG_2
16-Dec 1 1
16-Dec 1 2
16-Dec 1 3
16-Dec 1 4
16-Dec 1 5
16-Dec 2 1
16-Dec 2 2
16-Dec 2 3
16-Dec 2 4
16-Dec 2 5
16-Dec 3 1
16-Dec 3 2
16-Dec 3 3
16-Dec 3 4
16-Dec 3 5
16-Dec 4 1
16-Dec 4 2
16-Dec 4 3
16-Dec 4 4
16-Dec 4 5
16-Dec 5 1
16-Dec 5 2
16-Dec 5 3
16-Dec 5 4
16-Dec 5 5
The data frame above is data frame 2 (df2).
I have a formula. In Excel, for DF1 (please ignore columns A and B; the formula starts at column C):
COLUMN A - SR.NO
COLUMN B - YEAR
BAG_1 - COLUMN IN EXCEL - C
BAG_2 - D
ITEMS - E
In Excel, for DF2:
BAG_1 - COLUMN IN EXCEL - I
BAG_2 - J
SELECTED_ITEMS - K
REJECTED_ITEMS - L
Under SELECTED_ITEMS, in the first cell below the SELECTED_ITEMS header, apply this formula:
=SUMIFS($E$2:$E$126,$C$2:$C$126,I2,$D$2:$D$126,J2)
Under REJECTED_ITEMS, in the first cell below the REJECTED_ITEMS header, apply this formula:
=SUMIFS($E$2:$E$126,$C$2:$C$126,I2)
Expected output:
c_YEAR BAG_1 BAG_2 SELECTED_ITEMS REJECTED_ITEMS
16-Dec 1 1 5919506.116 6245028.263
16-Dec 1 2 101256.029 6245028.263
16-Dec 1 3 52396.262 6245028.263
16-Dec 1 4 34879.275 6245028.263
16-Dec 1 5 136990.581 6245028.263
16-Dec 2 1 74670.169 141407.058
16-Dec 2 2 4583.533 141407.058
16-Dec 2 3 2455.327 141407.058
16-Dec 2 4 1729.694 141407.058
16-Dec 2 5 57968.335 141407.058
16-Dec 3 1 24995.394 83464.282
16-Dec 3 2 1838.102 83464.282
16-Dec 3 3 1590.593 83464.282
16-Dec 3 4 886.047 83464.282
16-Dec 3 5 54154.146 83464.282
16-Dec 4 1 9640.36 62794
16-Dec 4 2 1064.995 62794
16-Dec 4 3 828.948 62794
16-Dec 4 4 571.998 62794
16-Dec 4 5 50687.699 62794
16-Dec 5 1 0 117474
16-Dec 5 2 0 117474
16-Dec 5 3 0 117474
16-Dec 5 4 0 117474
16-Dec 5 5 117474 117474
The expected output is the table above. Structure of df1 and df2 (dput output):
dput(df1)
structure(list(YEAR = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("31-Dec-12",
"31-Dec-13", "31-Dec-14", "31-Dec-15", "31-Dec-16"), class = "factor"),
BAG_1 = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L), BAG_1.1 = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L), ITEMS = c(1230438.453, 24327.087,
8962.611, 3841.119, 12803.73, 12670.095, 342.435, 296.777,
136.974, 9382.719, 4493.741, 214.718, 184.044, 92.022, 10352.475,
1517.586, 160.242, 122.538, 18.852, 7606.782, 0, 0, 0, 0,
17084, 1172535.32, 16215.914, 8731.646, 7484.268, 42410.852,
15279.943, 442.096, 303.941, 138.155, 11466.865, 4801.223,
217.477, 150.561, 16.729, 11543.01, 2289.504, 177.164, 149.908,
40.884, 10970.54, 0, 0, 0, 0, 21952, 1160393.766, 15829.086,
8523.354, 4870.488, 28005.306, 15095.349, 461.808, 202.041,
144.315, 12959.487, 5331.848, 324.234, 162.117, 108.078,
12086.723, 1810.35, 174.33, 120.69, 13.41, 11291.22, 0, 0,
0, 0, 24210, 1195886.146, 17642.156, 10081.232, 6300.77,
30243.696, 15146.97, 419.916, 209.958, 59.988, 14157.168,
4893.72, 266.645, 172.535, 31.37, 10320.73, 1722.034, 169.778,
109.143, 72.762, 10053.283, 0, 0, 0, 0, 23566, 1160162.06,
26814.89979, 15906.05385, 11952.4074, 23427.57938, 16469.63194,
2907.825884, 1448.99787, 1243.254659, 9988.289648, 5478.0866,
814.7068434, 929.9067234, 635.8147221, 9859.485111, 2305.868381,
376.5420847, 323.6473401, 426.5827964, 10770.3594, 0, 0,
0, 0, 30662, 1867369, 33736, 23, 49, 222, 20995, 8103, 14034,
58, 168, 3076, 958, 1584, 10014, 186, 1169, 636, 255, 869,
8022, 1252, 467, 119, 219, 451798)), .Names = c("YEAR", "BAG_1",
"BAG_1.1", "ITEMS"), class = "data.frame", row.names = c(NA,
-150L))
dput(df2)
structure(list(Current.YEAR = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = "Dec-16", class = "factor"), BAG_1 = c(1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L), BAG_2 = c(1L, 2L, 3L, 4L, 5L,
1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L,
2L, 3L, 4L, 5L)), .Names = c("Current.YEAR", "BAG_1", "BAG_2"
), class = "data.frame", row.names = c(NA, -25L))
Please help me write this formula in R code so that it produces the expected output.
Answer 0 (score: 3)
I think I have what you're looking for:
# read in your data and convert to data.table
library(data.table)
df1 <- read.csv("C:/Users/model/df1.csv")  # the df1 shown in the question
names(df1) <- tolower(names(df1))
dt1 <- as.data.table(df1)
# column K of your Excel sheet (SELECTED_ITEMS): sum items per (bag_1, bag_2) pair
result_1 <- dt1[ , .(selected_items = sum(items)), by=.(bag_1, bag_2)]
# column L of your Excel sheet (REJECTED_ITEMS): sum items per bag_1
result_2 <- dt1[ , .(rejected_items = sum(items)), by=.(bag_1)]
# put the results next to each other to match your example
result <- merge(result_1, result_2, by="bag_1", all=TRUE)
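
For comparison, here is a minimal base R sketch of the same idea that also joins the totals back onto df2, so the result has the column layout of the expected output. It is an illustration under assumptions, not the only way to do it: the data below is only a small subset of the question's df1 (bags 4 and 5 for 2012 and 2013), and the intermediate names `selected` and `rejected` are made up for this example.

```r
# Subset of df1 from the question: bags 4 and 5, years 2012 and 2013 only
df1 <- data.frame(
  YEAR  = rep(c("31-Dec-12", "31-Dec-13"), each = 10),
  BAG_1 = rep(rep(4:5, each = 5), times = 2),
  BAG_2 = rep(1:5, times = 4),
  ITEMS = c(1517.586, 160.242, 122.538, 18.852, 7606.782, 0, 0, 0, 0, 17084,
            2289.504, 177.164, 149.908, 40.884, 10970.54, 0, 0, 0, 0, 21952)
)

# Matching subset of df2: one row per (BAG_1, BAG_2) pair
df2 <- data.frame(
  CurrentYEAR = "16-Dec",
  BAG_1 = rep(4:5, each = 5),
  BAG_2 = rep(1:5, times = 2)
)

# SUMIFS with criteria on BAG_1 and BAG_2 -> SELECTED_ITEMS
selected <- aggregate(ITEMS ~ BAG_1 + BAG_2, data = df1, FUN = sum)
names(selected)[names(selected) == "ITEMS"] <- "SELECTED_ITEMS"

# SUMIFS with a criterion on BAG_1 only -> REJECTED_ITEMS
rejected <- aggregate(ITEMS ~ BAG_1, data = df1, FUN = sum)
names(rejected)[names(rejected) == "ITEMS"] <- "REJECTED_ITEMS"

# Left-join both totals onto df2 so every df2 row is kept
result <- merge(df2, selected, by = c("BAG_1", "BAG_2"), all.x = TRUE)
result <- merge(result, rejected, by = "BAG_1", all.x = TRUE)

# Restore the row and column order used in the expected output
result <- result[order(result$BAG_1, result$BAG_2),
                 c("CurrentYEAR", "BAG_1", "BAG_2",
                   "SELECTED_ITEMS", "REJECTED_ITEMS")]
print(result)
```

One difference from Excel to keep in mind: SUMIFS returns 0 when no row matches, whereas a left join with `all.x = TRUE` leaves `NA` for a df2 row that has no counterpart in df1, so you may want to replace those `NA` values with 0 afterwards.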