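I haven't found a solution yet. I think it should be simple, but I can't work it out right now.
I have two data frames: monthly traffic averages and annual traffic averages. I need to divide the annual average by each monthly average.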
ano mes dias Au_TPDM Bu_TPDM CU_TPDM CAI_TPDM CAII_TPDM TOTAL
1 2012 Ene 31 4288.323 620.5161 236.7419 4635.097 139.0645 6112.258
7 2012 Feb 29 3268.862 593.0000 246.3103 5191.069 147.9655 6267.286
13 2012 Mar 31 3667.903 624.7097 289.0323 5341.774 154.7419 6740.226
19 2012 Abr 30 4668.767 647.2333 281.2667 4930.433 158.3000 7236.300
25 2012 May 31 3198.581 598.9677 256.1290 5384.742 202.2581 6612.581
31 2012 Jun 30 3609.067 605.8667 280.3333 5309.500 178.7000 6795.000
anosDB TPDA_Au TPDA_Bu TPDA_CU TPDA_CAI TPDA_CAII TPDA_TOTAL
1 2012 4271.096 617.4809 255.1967 5119.454 163.5055 10426.73
2 2013 4685.079 638.5616 259.8877 5287.822 154.0110 11025.36
3 2014 4969.277 656.3918 266.8986 5407.800 177.0932 11477.46
4 2015 5184.953 541.8822 400.2137 4941.422 271.6877 11340.16
5 2016 5220.872 408.6967 541.0519 5584.492 182.4399 11937.55
6 2017 5298.852 408.7562 556.5644 6033.652 266.1644 12563.99
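So the first row of the TPDA table should be divided by each of the first 12 rows of the TPDM table, producing a new data frame that holds the monthly factors. Like this: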
ano mes dias FA_Au
2012 Ene 31 4271.096/4288.323
2012 Feb 29 4271.096/3268.862
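(No need to show the calculation, only the result.) I'm sure selecting the data by year would do it, but I haven't found the right way.
Answer 0 (score: 0)
As zx8754 mentioned, this can be done in base R by merging on the year and dividing the corresponding columns: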
merged <- merge(TPDM, TPDA, by.x = "ano", by.y = "anosDB")
FA <- cbind(merged[, 1:3], merged[, 10:15]/merged[, 4:9])
# rename columns
names(FA) <- sub("TPDA_", "FA_", names(FA))
FA
ano mes dias FA_Au FA_Bu FA_CU FA_CAI FA_CAII FA_TOTAL
1 2012 Ene 31 0.9959828 0.9951086 1.0779532 1.1044977 1.1757530 1.705872
2 2012 Feb 29 1.3066003 1.0412831 1.0360781 0.9862042 1.1050245 1.663675
3 2012 Mar 31 1.1644517 0.9884285 0.8829349 0.9583809 1.0566337 1.546941
4 2012 Abr 30 0.9148231 0.9540314 0.9073122 1.0383376 1.0328838 1.440892
5 2012 May 31 1.3353096 1.0309085 0.9963600 0.9507334 0.8084003 1.576802
6 2012 Jun 30 1.1834349 1.0191696 0.9103332 0.9642064 0.9149720 1.534471
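One thing that may be worth checking with this approach: merge() performs an inner join by default, so monthly rows whose year has no matching annual row are dropped from the result. The sketch below uses toy data (TPDM_toy and TPDA_toy are illustrative names, and the 2018 row is a hypothetical value, not part of the question's data) to show the difference when all.x = TRUE is added.

```r
# Toy monthly table: one real row from the question plus a
# HYPOTHETICAL 2018 row that has no annual counterpart
TPDM_toy <- data.frame(ano = c(2012, 2018),
                       mes = c("Ene", "Ene"),
                       Au_TPDM = c(4288.323, 4000))
# Toy annual table with 2012 only (value from the question)
TPDA_toy <- data.frame(anosDB = 2012, TPDA_Au = 4271.096)

# Default inner join: the 2018 month disappears
inner <- merge(TPDM_toy, TPDA_toy, by.x = "ano", by.y = "anosDB")

# Left join: the 2018 month is kept, its annual value is NA
left <- merge(TPDM_toy, TPDA_toy, by.x = "ano", by.y = "anosDB",
              all.x = TRUE)
left$FA_Au <- left$TPDA_Au / left$Au_TPDM

print(inner)
print(left)
```

With the left join, a missing annual value shows up as an NA factor instead of a silently missing month.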
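Note: this approach works as long as the positions of the corresponding columns, i.e. the column numbers, are known. In the given datasets the columns are ordered the same way, so matching corresponding columns only requires accounting for the offset.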
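The positional assumption can be checked before dividing. The following is a minimal base R sketch, not part of the original answer, that selects the columns by name pattern and stops if the two sets do not line up. It uses a two-column toy subset of the question's data; note that in the full data the monthly TOTAL column carries no _TPDM suffix, so it would need separate handling under this pattern-based selection.

```r
# Toy subset: two months and two vehicle classes (values from the question)
TPDM_toy <- data.frame(ano = c(2012, 2012), mes = c("Ene", "Feb"),
                       dias = c(31, 29),
                       Au_TPDM = c(4288.323, 3268.862),
                       Bu_TPDM = c(620.5161, 593.0000))
TPDA_toy <- data.frame(anosDB = 2012, TPDA_Au = 4271.096, TPDA_Bu = 617.4809)

merged <- merge(TPDM_toy, TPDA_toy, by.x = "ano", by.y = "anosDB")

# Select the value columns by name pattern rather than by position
monthly_cols <- grep("_TPDM$", names(merged), value = TRUE)
annual_cols  <- grep("^TPDA_", names(merged), value = TRUE)

# Guard: after stripping suffix/prefix, both sets must name the same
# classes in the same order, otherwise the division would mix columns
stopifnot(identical(sub("_TPDM$", "", monthly_cols),
                    sub("^TPDA_", "", annual_cols)))

# Annual value divided by monthly value, column by column
FA <- cbind(merged[, c("ano", "mes", "dias")],
            merged[, annual_cols] / merged[, monthly_cols])
names(FA) <- sub("^TPDA_", "FA_", names(FA))
print(FA)
```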
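If for some reason the positions are not known in advance, we can find the corresponding columns by matching column names.
To do this, both datasets are reshaped from wide to long format. In long format, the column names (now stored in a column called variable) are treated as data. We can then join the monthly and annual values on year and column name, divide each annual value by the corresponding monthly value, and finally reshape back to wide format: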
library(data.table)
# reshape and prepare monthly data
longM <- melt(setDT(TPDM), id.vars = 1:3)
longM[, variable := stringr::str_replace(variable, "_TPDM", "")]
longM[, mes := forcats::fct_inorder(mes)]
# reshape and prepare annual data
longA <- melt(setDT(TPDA), id.vars = 1)
longA[, variable := stringr::str_replace(variable, "TPDA_", "")]
setnames(longA, "anosDB", "ano")
# join
long_FA <- longA[longM, on = .(ano, variable),
.(ano, mes, dias, variable, FA = value/i.value)]
# reshape back to wide format
dcast(long_FA, ano + mes + dias ~ paste0("FA_", variable), value.var = "FA")
ano mes dias FA_Au FA_Bu FA_CAI FA_CAII FA_CU FA_TOTAL
1: 2012 Ene 31 0.9959828 0.9951086 1.1044977 1.1757530 1.0779532 1.705872
2: 2012 Feb 29 1.3066003 1.0412831 0.9862042 1.1050245 1.0360781 1.663675
3: 2012 Mar 31 1.1644517 0.9884285 0.9583809 1.0566337 0.8829349 1.546941
4: 2012 Abr 30 0.9148231 0.9540314 1.0383376 1.0328838 0.9073122 1.440892
5: 2012 May 31 1.3353096 1.0309085 0.9507334 0.8084003 0.9963600 1.576802
6: 2012 Jun 30 1.1834349 1.0191696 0.9642064 0.9149720 0.9103332 1.534471
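In this output the factor columns come out in alphabetical order (FA_CAI before FA_CU), unlike the positional approach, which keeps the original column order. If the original order matters, one option is data.table's setcolorder(). The sketch below is an illustration on toy data (long_toy, wide_toy and the chosen column order are assumptions for the example), using three factor values from the output above.

```r
library(data.table)

# Toy long-format factors for one month (values from the output above)
long_toy <- data.table(ano = 2012, mes = "Ene", dias = 31,
                       variable = c("Au", "CU", "CAI"),
                       FA = c(0.9959828, 1.0779532, 1.1044977))

# dcast() orders the new columns alphabetically
wide_toy <- dcast(long_toy, ano + mes + dias ~ paste0("FA_", variable),
                  value.var = "FA")
alphabetical <- copy(names(wide_toy))

# Restore the order used in the source tables (Au, CU, CAI here)
wanted <- c("ano", "mes", "dias", paste0("FA_", c("Au", "CU", "CAI")))
setcolorder(wide_toy, wanted)

print(alphabetical)
print(names(wide_toy))
```

setcolorder() reorders the columns by reference, so no reassignment is needed.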
Data:
TPDM <- read.table(text = "
i ano mes dias Au_TPDM Bu_TPDM CU_TPDM CAI_TPDM CAII_TPDM TOTAL
1 2012 Ene 31 4288.323 620.5161 236.7419 4635.097 139.0645 6112.258
7 2012 Feb 29 3268.862 593.0000 246.3103 5191.069 147.9655 6267.286
13 2012 Mar 31 3667.903 624.7097 289.0323 5341.774 154.7419 6740.226
19 2012 Abr 30 4668.767 647.2333 281.2667 4930.433 158.3000 7236.300
25 2012 May 31 3198.581 598.9677 256.1290 5384.742 202.2581 6612.581
31 2012 Jun 30 3609.067 605.8667 280.3333 5309.500 178.7000 6795.000
", header = TRUE)[, -1L]
TPDA <- read.table(text = "
i anosDB TPDA_Au TPDA_Bu TPDA_CU TPDA_CAI TPDA_CAII TPDA_TOTAL
1 2012 4271.096 617.4809 255.1967 5119.454 163.5055 10426.73
2 2013 4685.079 638.5616 259.8877 5287.822 154.0110 11025.36
3 2014 4969.277 656.3918 266.8986 5407.800 177.0932 11477.46
4 2015 5184.953 541.8822 400.2137 4941.422 271.6877 11340.16
5 2016 5220.872 408.6967 541.0519 5584.492 182.4399 11937.55
6 2017 5298.852 408.7562 556.5644 6033.652 266.1644 12563.99
", header = TRUE)[, -1L]