Applying the Excel SUMIFS formula in R

Time: 2018-07-24 05:42:18

Tags: r sumifs

I have df1 and df2. In Excel I use the SUMIFS formula for the item selection process, and I need to convert this formula to R code.

> df1 <- read.csv("C:/Users/model/df1.csv")
> df1

YEAR        BAG_1   BAG_2   ITEMS
31-Dec-12   1       1       1230438.453
31-Dec-12   1       2       24327.087
31-Dec-12   1       3       8962.611
31-Dec-12   1       4       3841.119
31-Dec-12   1       5       12803.73
31-Dec-12   2       1       12670.095
31-Dec-12   2       2       342.435
31-Dec-12   2       3       296.777
31-Dec-12   2       4       136.974
31-Dec-12   2       5       9382.719
31-Dec-12   3       1       4493.741
31-Dec-12   3       2       214.718
31-Dec-12   3       3       184.044
31-Dec-12   3       4       92.022
31-Dec-12   3       5       10352.475
31-Dec-12   4       1       1517.586
31-Dec-12   4       2       160.242
31-Dec-12   4       3       122.538
31-Dec-12   4       4       18.852
31-Dec-12   4       5       7606.782
31-Dec-12   5       1       0
31-Dec-12   5       2       0
31-Dec-12   5       3       0
31-Dec-12   5       4       0
31-Dec-12   5       5       17084
31-Dec-13   1       1       1172535.32
31-Dec-13   1       2       16215.914
31-Dec-13   1       3       8731.646
31-Dec-13   1       4       7484.268
31-Dec-13   1       5       42410.852
31-Dec-13   2       1       15279.943
31-Dec-13   2       2       442.096
31-Dec-13   2       3       303.941
31-Dec-13   2       4       138.155
31-Dec-13   2       5       11466.865
31-Dec-13   3       1       4801.223
31-Dec-13   3       2       217.477
31-Dec-13   3       3       150.561
31-Dec-13   3       4       16.729
31-Dec-13   3       5       11543.01
31-Dec-13   4       1       2289.504
31-Dec-13   4       2       177.164
31-Dec-13   4       3       149.908
31-Dec-13   4       4       40.884
31-Dec-13   4       5       10970.54
31-Dec-13   5       1       0
31-Dec-13   5       2       0
31-Dec-13   5       3       0
31-Dec-13   5       4       0
31-Dec-13   5       5       21952
31-Dec-14   1       1       1160393.766
31-Dec-14   1       2       15829.086
31-Dec-14   1       3       8523.354
31-Dec-14   1       4       4870.488
31-Dec-14   1       5       28005.306
31-Dec-14   2       1       15095.349
31-Dec-14   2       2       461.808
31-Dec-14   2       3       202.041
31-Dec-14   2       4       144.315
31-Dec-14   2       5       12959.487
31-Dec-14   3       1       5331.848
31-Dec-14   3       2       324.234
31-Dec-14   3       3       162.117
31-Dec-14   3       4       108.078
31-Dec-14   3       5       12086.723
31-Dec-14   4       1       1810.35
31-Dec-14   4       2       174.33
31-Dec-14   4       3       120.69
31-Dec-14   4       4       13.41
31-Dec-14   4       5       11291.22
31-Dec-14   5       1       0
31-Dec-14   5       2       0
31-Dec-14   5       3       0
31-Dec-14   5       4       0
31-Dec-14   5       5       24210
31-Dec-15   1       1       1195886.146
31-Dec-15   1       2       17642.156
31-Dec-15   1       3       10081.232
31-Dec-15   1       4       6300.77
31-Dec-15   1       5       30243.696
31-Dec-15   2       1       15146.97
31-Dec-15   2       2       419.916
31-Dec-15   2       3       209.958
31-Dec-15   2       4       59.988
31-Dec-15   2       5       14157.168
31-Dec-15   3       1       4893.72
31-Dec-15   3       2       266.645
31-Dec-15   3       3       172.535
31-Dec-15   3       4       31.37
31-Dec-15   3       5       10320.73
31-Dec-15   4       1       1722.034
31-Dec-15   4       2       169.778
31-Dec-15   4       3       109.143
31-Dec-15   4       4       72.762
31-Dec-15   4       5       10053.283
31-Dec-15   5       1       0
31-Dec-15   5       2       0
31-Dec-15   5       3       0
31-Dec-15   5       4       0
31-Dec-15   5       5       23566
31-Dec-16   1       1       1160252.431
31-Dec-16   1       2       27241.786
31-Dec-16   1       3       16097.419
31-Dec-16   1       4       12382.63
31-Dec-16   1       5       23526.997
31-Dec-16   2       1       16477.812
31-Dec-16   2       2       2917.278
31-Dec-16   2       3       1442.61
31-Dec-16   2       4       1250.262
31-Dec-16   2       5       10002.096
31-Dec-16   3       1       5474.862
31-Dec-16   3       2       815.028
31-Dec-16   3       3       921.336
31-Dec-16   3       4       637.848
31-Dec-16   3       5       9851.208
31-Dec-16   4       1       2300.886
31-Dec-16   4       2       383.481
31-Dec-16   4       3       326.669
31-Dec-16   4       4       426.09
31-Dec-16   4       5       10765.874
31-Dec-16   5       1       0
31-Dec-16   5       2       0
31-Dec-16   5       3       0
31-Dec-16   5       4       0
31-Dec-16   5       5       30662

Data frame 2 - df2

> df2 <- read.csv("C:/Users/model/df2.csv")
> df2

CurrentYEAR   BAG_1   BAG_2
16-Dec        1       1
16-Dec        1       2
16-Dec        1       3
16-Dec        1       4
16-Dec        1       5
16-Dec        2       1
16-Dec        2       2
16-Dec        2       3
16-Dec        2       4
16-Dec        2       5
16-Dec        3       1
16-Dec        3       2
16-Dec        3       3
16-Dec        3       4
16-Dec        3       5
16-Dec        4       1
16-Dec        4       2
16-Dec        4       3
16-Dec        4       4
16-Dec        4       5
16-Dec        5       1
16-Dec        5       2
16-Dec        5       3
16-Dec        5       4
16-Dec        5       5

I have a formula:

IN EXCEL - FOR DF1 -
COLUMN A - SR.NO
COLUMN B - YEAR

Please ignore columns A and B; the formula starts with column C:

BAG_1 - COLUMN IN EXCEL - C
BAG_2 - D
ITEMS - E

IN EXCEL - FOR DF2 -

BAG_1 - COLUMN IN EXCEL - I
BAG_2 - J
SELECTED_ITEMS - K
REJECTED_ITEMS - L

Under SELECTED_ITEMS, in the first cell below the SELECTED_ITEMS header, apply this formula:

=SUMIFS($E$2:$E$126,$C$2:$C$126,I2,$D$2:$D$126,J2)

Under REJECTED_ITEMS, in the first cell below the REJECTED_ITEMS header, apply this formula:

=SUMIFS($E$2:$E$126,$C$2:$C$126,I2)
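The two SUMIFS formulas can be read as "sum ITEMS over the rows where every condition holds". A minimal base R sketch of that idea follows. The helper name `sumifs` is hypothetical (it is not a built-in R function), and the small `df1` here is only an excerpt of the question's data (the 2012 and 2013 rows with BAG_1 and BAG_2 in 4 or 5), not the full table.

```r
# Hypothetical helper: sum `values` over the rows where all conditions are TRUE,
# mirroring SUMIFS(sum_range, criteria_range1, criterion1, ...).
sumifs <- function(values, ...) {
  keep <- Reduce(`&`, list(...), rep(TRUE, length(values)))
  sum(values[keep])
}

# Excerpt of df1 from the question (illustrative subset only).
df1 <- data.frame(
  BAG_1 = c(4, 4, 5, 5, 4, 4, 5, 5),
  BAG_2 = c(4, 5, 4, 5, 4, 5, 4, 5),
  ITEMS = c(18.852, 7606.782, 0, 17084, 40.884, 10970.54, 0, 21952)
)

# Column K: =SUMIFS(E, C, I2, D, J2) with I2 = 5 and J2 = 5
selected <- sumifs(df1$ITEMS, df1$BAG_1 == 5, df1$BAG_2 == 5)

# Column L: =SUMIFS(E, C, I2) with I2 = 5
rejected <- sumifs(df1$ITEMS, df1$BAG_1 == 5)
```

On this excerpt both values come out as 39036, because the only non-zero BAG_1 = 5 rows are the two with BAG_2 = 5 (17084 and 21952).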

Expected output:

c_YEAR  BAG_1    BAG_2     SELECTED_ITEMS   REJECTED_ITEMS
16-Dec  1       1           5919506.116     6245028.263
16-Dec  1       2           101256.029      6245028.263
16-Dec  1       3           52396.262       6245028.263
16-Dec  1       4           34879.275       6245028.263
16-Dec  1       5           136990.581      6245028.263
16-Dec  2       1           74670.169       141407.058
16-Dec  2       2           4583.533        141407.058
16-Dec  2       3           2455.327        141407.058
16-Dec  2       4           1729.694        141407.058
16-Dec  2       5           57968.335       141407.058
16-Dec  3       1           24995.394       83464.282
16-Dec  3       2           1838.102        83464.282
16-Dec  3       3           1590.593        83464.282
16-Dec  3       4           886.047         83464.282
16-Dec  3       5           54154.146       83464.282
16-Dec  4       1           9640.36         62794
16-Dec  4       2           1064.995        62794
16-Dec  4       3           828.948         62794
16-Dec  4       4           571.998         62794
16-Dec  4       5           50687.699       62794
16-Dec  5       1           0               117474
16-Dec  5       2           0               117474
16-Dec  5       3           0               117474
16-Dec  5       4           0               117474
16-Dec  5       5           117474          117474

dput(df1)

structure(list(YEAR = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("31-Dec-12", 
"31-Dec-13", "31-Dec-14", "31-Dec-15", "31-Dec-16"), class = "factor"), 
    BAG_1 = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
    3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 
    1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
    4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
    2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 
    5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
    3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 
    1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
    4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
    2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 
    5L, 5L, 5L), BAG_1.1 = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    4L, 5L, 1L, 2L, 3L, 4L, 5L), ITEMS = c(1230438.453, 24327.087, 
    8962.611, 3841.119, 12803.73, 12670.095, 342.435, 296.777, 
    136.974, 9382.719, 4493.741, 214.718, 184.044, 92.022, 10352.475, 
    1517.586, 160.242, 122.538, 18.852, 7606.782, 0, 0, 0, 0, 
    17084, 1172535.32, 16215.914, 8731.646, 7484.268, 42410.852, 
    15279.943, 442.096, 303.941, 138.155, 11466.865, 4801.223, 
    217.477, 150.561, 16.729, 11543.01, 2289.504, 177.164, 149.908, 
    40.884, 10970.54, 0, 0, 0, 0, 21952, 1160393.766, 15829.086, 
    8523.354, 4870.488, 28005.306, 15095.349, 461.808, 202.041, 
    144.315, 12959.487, 5331.848, 324.234, 162.117, 108.078, 
    12086.723, 1810.35, 174.33, 120.69, 13.41, 11291.22, 0, 0, 
    0, 0, 24210, 1195886.146, 17642.156, 10081.232, 6300.77, 
    30243.696, 15146.97, 419.916, 209.958, 59.988, 14157.168, 
    4893.72, 266.645, 172.535, 31.37, 10320.73, 1722.034, 169.778, 
    109.143, 72.762, 10053.283, 0, 0, 0, 0, 23566, 1160162.06, 
    26814.89979, 15906.05385, 11952.4074, 23427.57938, 16469.63194, 
    2907.825884, 1448.99787, 1243.254659, 9988.289648, 5478.0866, 
    814.7068434, 929.9067234, 635.8147221, 9859.485111, 2305.868381, 
    376.5420847, 323.6473401, 426.5827964, 10770.3594, 0, 0, 
    0, 0, 30662, 1867369, 33736, 23, 49, 222, 20995, 8103, 14034, 
    58, 168, 3076, 958, 1584, 10014, 186, 1169, 636, 255, 869, 
    8022, 1252, 467, 119, 219, 451798)), .Names = c("YEAR", "BAG_1", 
"BAG_1.1", "ITEMS"), class = "data.frame", row.names = c(NA, 
-150L))


dput(df2)

structure(list(Current.YEAR = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), .Label = "Dec-16", class = "factor"), BAG_1 = c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L), BAG_2 = c(1L, 2L, 3L, 4L, 5L, 
1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 
2L, 3L, 4L, 5L)), .Names = c("Current.YEAR", "BAG_1", "BAG_2"
), class = "data.frame", row.names = c(NA, -25L))

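Please help me write this formula as R code that produces the expected output. The structure of both data frames is given by the dput() output above.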

1 answer:

Answer 0 (score: 3)

I think I have what you want:

# read in your data and convert to data.table
library(data.table)
df1 <- read.csv("C:/Users/model/df1.csv") # path taken from the question
names(df1) <- tolower(names(df1))
dt1 <- as.data.table(df1)
# the steps below assume the columns are named bag_1, bag_2 and items

# calculate column K of your Excel: sum of items per (bag_1, bag_2) pair
result_1 <- dt1[ , .(selected_items = sum(items)), by=.(bag_1, bag_2)]

# calculate column L of your Excel: sum of items per bag_1
result_2 <- dt1[ , .(rejected_items = sum(items)), by=.(bag_1)]

# put results next to each other to match your example
result <- merge(result_1, result_2, by="bag_1", all=TRUE)
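For comparison, the same grouping can be sketched in base R and attached to df2's rows, so that the result has one row per (BAG_1, BAG_2) pair listed in df2. This is only an illustration under stated assumptions: the `df1` and `df2` below are small excerpts of the question's data (2012 and 2013 rows, BAG_1 and BAG_2 in 4 or 5), and the column names follow the question's printed tables rather than the dput() output.

```r
# Excerpt of df1 from the question (illustrative subset only).
df1 <- data.frame(
  BAG_1 = c(4, 4, 5, 5, 4, 4, 5, 5),
  BAG_2 = c(4, 5, 4, 5, 4, 5, 4, 5),
  ITEMS = c(18.852, 7606.782, 0, 17084, 40.884, 10970.54, 0, 21952)
)

# Matching excerpt of df2: the (BAG_1, BAG_2) pairs to look up.
df2 <- data.frame(
  CurrentYEAR = "16-Dec",
  BAG_1 = c(4, 4, 5, 5),
  BAG_2 = c(4, 5, 4, 5)
)

# Column K: one sum per (BAG_1, BAG_2) pair.
selected <- aggregate(ITEMS ~ BAG_1 + BAG_2, data = df1, FUN = sum)
names(selected)[names(selected) == "ITEMS"] <- "SELECTED_ITEMS"

# Column L: one sum per BAG_1 value.
rejected <- aggregate(ITEMS ~ BAG_1, data = df1, FUN = sum)
names(rejected)[names(rejected) == "ITEMS"] <- "REJECTED_ITEMS"

# Left joins keep every df2 row, even when df1 has no matching rows.
result <- merge(df2, selected, by = c("BAG_1", "BAG_2"), all.x = TRUE)
result <- merge(result, rejected, by = "BAG_1", all.x = TRUE)

# SUMIFS gives 0 when nothing matches, whereas the join gives NA.
result$SELECTED_ITEMS[is.na(result$SELECTED_ITEMS)] <- 0
result$REJECTED_ITEMS[is.na(result$REJECTED_ITEMS)] <- 0

# Restore the BAG_1, BAG_2 ordering used in the expected output.
result <- result[order(result$BAG_1, result$BAG_2), ]
```

Joining onto df2 rather than only merging the two aggregates keeps the CurrentYEAR column and guarantees a row for every pair in df2, which is closer to how the Excel sheet fills columns K and L next to columns I and J.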