My source data covers several months, but I only want to compare the data for a pre-specified pair of months.
Here is my input data:
dput(mydf)
structure(list(Month = structure(c(1L, 2L, 1L, 2L, 3L, 1L, 2L,
2L, 1L, 2L, 1L), .Label = c("Aug", "Oct", "Sep"), class = "factor"),
Pipe = c(3, 4, 5, 3, 2, 1, 3, 3, 4, NA, 5), Gp = structure(c(1L,
1L, 2L, 2L, 2L, 3L, 4L, 5L, 5L, 6L, 6L), .Label = c("A",
"B", "C", "D", "E", "F"), class = "factor")), .Names = c("Month",
"Pipe", "Gp"), row.names = c(NA, -11L), class = "data.frame")
Now, out of these three months, I only want to compare the months specified by the following variables.
This_month_to_compare <- "Oct"
Last_Month_to_compare <- "Aug"
Now, for the two given months, and within each group Gp, I want to indicate whether the Pipe value for This_month_to_compare is greater than the Pipe value for Last_Month_to_compare. If one of the two values is missing, the result is left blank.
This is what the output should look like (created manually, because I did not succeed with code):
structure(list(Month = structure(c(1L, 2L, 1L, 2L, 3L, 1L, 2L,
2L, 1L, 2L, 1L), .Label = c("Aug", "Oct", "Sep"), class = "factor"),
Pipe = c(3, 4, 5, 3, 2, 1, 3, 3, 4, NA, 5), Gp = structure(c(1L,
1L, 2L, 2L, 2L, 3L, 4L, 5L, 5L, 6L, 6L), .Label = c("A",
"B", "C", "D", "E", "F"), class = "factor"), Greater = c(NA,
TRUE, NA, FALSE, NA, NA, NA, FALSE, NA, NA, NA)), .Names = c("Month",
"Pipe", "Gp", "Greater"), row.names = c(NA, -11L), class = "data.frame")
Month  Pipe  Gp  Greater  Explanation
Aug    3     A            Ignore: Aug
Oct    4     A   TRUE     4 > 3
Aug    5     B            Ignore: Aug
Oct    3     B   FALSE    3 < 5
Sep    2     B            Ignore: Sep
Aug    1     C            Ignore: Aug
Oct    3     D            There is nothing to compare with
Oct    3     E   FALSE    3 < 4
Aug    4     E            Ignore: Aug
Oct    NA    F            Cannot compare NA with 5
Aug    5     F            Ignore: Aug
I added the Explanation column above manually.
I did try to code this; here is my attempt:
# Method 1: convert to wide format, compare, then convert back
# Long-format input as a data.table (the input is assumed to be the mydf shown above)
mydfi <- data.table::as.data.table(mydf)
mydf <- mydfi
# Convert to wide format: one row per Gp, one column per Month
mydf <- data.table::dcast(mydf, Gp ~ Month, value.var = "Pipe")
# Compare the two chosen months
mydf$Growth <- mydf[[This_month_to_compare]] > mydf[[Last_Month_to_compare]]
# Back to long format
Melt_columns <- c("Aug", "Oct", "Sep")
mydf <- data.table::melt(mydf, measure.vars = Melt_columns, variable.name = "Month", value.name = "Pipe")
# Left join onto the original rows, dropping the Gp/Month combinations that dcast created
mydfo <- mydf[mydfi, on = c("Month", "Gp", "Pipe")]
# Only rows for This_month_to_compare carry a result
mydfo[Month != This_month_to_compare, Growth := NA]
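Update: I was able to solve the problem above simply by adding a left join, and I have updated the code above. However, I am looking for a solution along the lines of: Calculate difference between values in consecutive rows by group
The reason is that my actual data set is large and does not permit a join.
Any help is much appreciated. Thanks in advance.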
Answer 0 (score: 1)
Is this what you had in mind?
> library(data.table)
> mydf <- data.table(mydf)
> This_month_to_compare <- "Oct"
> Last_Month_to_compare <- "Aug"
> setkey(mydf, Gp, Month)
>
> # Add per-group helper columns, compare them, then drop them again
> mydf[
+ , Pipe_this := .SD[Month == This_month_to_compare, Pipe], by = "Gp"][
+ , Pipe_last := .SD[Month == Last_Month_to_compare, Pipe], by = "Gp"][
+ , `:=`(
+ Greater = Pipe_last < Pipe_this, Pipe_last = NULL, Pipe_this = NULL)][
+ Month != This_month_to_compare, Greater := NA]
> mydf
    Month Pipe Gp Greater
 1:   Aug    3  A      NA
 2:   Oct    4  A    TRUE
 3:   Aug    5  B      NA
 4:   Oct    3  B   FALSE
 5:   Sep    2  B      NA
 6:   Aug    1  C      NA
 7:   Oct    3  D      NA
 8:   Aug    4  E      NA
 9:   Oct    3  E   FALSE
10:   Aug    5  F      NA
11:   Oct   NA  F      NA
If you like, you can simplify the code to avoid the first two [.data.table calls above and to avoid defining Pipe_this and Pipe_last.
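One way the simplification mentioned above could look is sketched below. It is not the answerer's code: it assumes each Gp has at most one row per month, rebuilds the question's sample data inline so the block runs on its own, and uses fifelse(), which needs data.table 1.12.4 or later. The local names this and last are illustrative only.

```r
library(data.table)

# Sample data from the question, rebuilt inline so the sketch is self-contained
mydf <- data.table(
  Month = c("Aug", "Oct", "Aug", "Oct", "Sep", "Aug", "Oct", "Oct", "Aug", "Oct", "Aug"),
  Pipe  = c(3, 4, 5, 3, 2, 1, 3, 3, 4, NA, 5),
  Gp    = c("A", "A", "B", "B", "B", "C", "D", "E", "E", "F", "F")
)
This_month_to_compare <- "Oct"
Last_Month_to_compare <- "Aug"

# One grouped assignment: no join and no helper columns.
# Assumption: at most one row per Gp and month; [1L] turns "no such row" into NA.
mydf[, Greater := {
  this <- Pipe[Month == This_month_to_compare][1L]
  last <- Pipe[Month == Last_Month_to_compare][1L]
  # Only the rows of This_month_to_compare carry a result; all others stay NA
  fifelse(Month == This_month_to_compare, this > last, NA)
}, by = Gp]

print(mydf)
```

Because everything happens in a single by = Gp pass, the original row order is kept and no second table is built, which may matter for the large data set mentioned in the question.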
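Answer 1 (score: 1)
This can be done with two joins. The first join selects the months to be compared and puts them in the required order, so that the comparison can be made. The second join appends the result to the original data frame.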
library(data.table)
# Last_Month_to_compare, This_month_to_compare
months_to_compare <- c("Aug", "Oct")
mDT <- setDT(mydf)[
# append row id column (to preserve original order)
, rn := .I][
# cross join of groups and months
CJ(Gp = Gp, Month = months_to_compare, unique = TRUE), on = .(Gp, Month)][
# groupwise comparison of the two months
, Greater := Pipe > shift(Pipe), by = Gp][]
# appending result to original data frame by joining with intermediate result
mydf[mDT, on = .(rn), Greater := i.Greater][]
    Month Pipe Gp rn Greater
 1:   Aug    3  A  1      NA
 2:   Oct    4  A  2    TRUE
 3:   Aug    5  B  3      NA
 4:   Oct    3  B  4   FALSE
 5:   Sep    2  B  5      NA
 6:   Aug    1  C  6      NA
 7:   Oct    3  D  7      NA
 8:   Oct    3  E  8   FALSE
 9:   Aug    4  E  9      NA
10:   Oct   NA  F 10      NA
11:   Aug    5  F 11      NA
Note that the original order of mydf is preserved.
The intermediate result mDT looks like this:
    Month Pipe Gp rn Greater
 1:   Aug    3  A  1      NA
 2:   Oct    4  A  2    TRUE
 3:   Aug    5  B  3      NA
 4:   Oct    3  B  4   FALSE
 5:   Aug    1  C  6      NA
 6:   Oct   NA  C NA      NA
 7:   Aug   NA  D NA      NA
 8:   Oct    3  D  7      NA
 9:   Aug    4  E  9      NA
10:   Oct    3  E  8   FALSE
11:   Aug    5  F 11      NA
12:   Oct   NA  F 10      NA
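The groupwise comparison that produces Greater in the intermediate result can be seen in isolation with a small toy table. This is only an illustrative sketch; the table DT and its values are made up and are not the question's data.

```r
library(data.table)

# Toy table: one row per group and month, already ordered Aug then Oct
DT <- data.table(
  Gp    = c("A", "A", "B", "B", "C", "C"),
  Month = c("Aug", "Oct", "Aug", "Oct", "Aug", "Oct"),
  Pipe  = c(3, 4, 5, 3, 1, NA)
)

# shift() returns the previous row's value within each group (NA for the first row),
# so each Oct row is compared with the Aug row of the same group
DT[, Greater := Pipe > shift(Pipe), by = Gp]

print(DT)
```

The comparison only means "later month greater than earlier month" if the rows of each group are in that order. CJ() sorts its arguments by default, so the approach relies on the earlier month sorting before the later one, as "Aug" does before "Oct"; for other month pairs the order is worth checking.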
The OP asked for an explanation of the difference between mydf[mDT, on = .(rn)] and mydf[mDT, on = .(rn), Greater := i.Greater][].
With data.table, X[Y, on = ...] is a right outer join, equivalent to merge(X, Y, all.y = TRUE), i.e., all rows of Y are returned (see JOINing data in R using data.table). So,
mydf[mDT, on = .(rn)]
returns
    Month Pipe Gp rn i.Month i.Pipe i.Gp Greater
 1:   Aug    3  A  1     Aug      3    A      NA
 2:   Oct    4  A  2     Oct      4    A    TRUE
 3:   Aug    5  B  3     Aug      5    B      NA
 4:   Oct    3  B  4     Oct      3    B   FALSE
 5:   Aug    1  C  6     Aug      1    C      NA
 6:    NA   NA NA NA     Oct     NA    C      NA
 7:    NA   NA NA NA     Aug     NA    D      NA
 8:   Oct    3  D  7     Oct      3    D      NA
 9:   Aug    4  E  9     Aug      4    E      NA
10:   Oct    3  E  8     Oct      3    E   FALSE
11:   Aug    5  F 11     Aug      5    F      NA
12:   Oct   NA  F 10     Oct     NA    F      NA
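The columns prefixed with i. come from mDT. Note that rows 6 and 7 have no matching row in mydf. Also, the order of the rows is determined by the order in mDT.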
If mydf and mDT are swapped,
mDT[mydf, on = .(rn)][]
returns
    Month Pipe Gp rn Greater i.Month i.Pipe i.Gp
 1:   Aug    3  A  1      NA     Aug      3    A
 2:   Oct    4  A  2    TRUE     Oct      4    A
 3:   Aug    5  B  3      NA     Aug      5    B
 4:   Oct    3  B  4   FALSE     Oct      3    B
 5:    NA   NA NA  5      NA     Sep      2    B
 6:   Aug    1  C  6      NA     Aug      1    C
 7:   Oct    3  D  7      NA     Oct      3    D
 8:   Oct    3  E  8   FALSE     Oct      3    E
 9:   Aug    4  E  9      NA     Aug      4    E
10:   Oct   NA  F 10      NA     Oct     NA    F
11:   Aug    5  F 11      NA     Aug      5    F
The columns prefixed with i. now come from mydf. Note that row 5 has no match in mDT. Also, the order of the rows is determined by mydf.
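The same behaviour can be reproduced on two tiny tables, which may make the role of X and Y easier to see. This is a sketch with made-up tables X and Y (not the question's data); the join column is called id here purely for illustration.

```r
library(data.table)

X <- data.table(id = c(1L, 2L, 3L), x = c("a", "b", "c"))
Y <- data.table(id = c(3L, 1L, 4L), y = c("p", "q", "r"))

# X[Y]: every row of Y is returned, in Y's order; id 4 has no partner in X, so x is NA
XY <- X[Y, on = .(id)]

# Swapping the tables: every row of X is returned, in X's order; id 2 has no partner in Y
YX <- Y[X, on = .(id)]

# merge() with all.y = TRUE returns the same rows as X[Y], but sorted by the key
M <- merge(X, Y, by = "id", all.y = TRUE)

print(XY)
print(YX)
print(M)
```

In both bracket forms the table inside the brackets decides which rows appear and in which order; the table outside only contributes matching values.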
With the assignment operator :=, X[Y, on = ..., a := b] behaves like a left join that updates X in place: all rows of X are kept, in their original order. Therefore,
mydf[mDT, on = .(rn), Greater := i.Greater][]
returns
    Month Pipe Gp rn Greater
 1:   Aug    3  A  1      NA
 2:   Oct    4  A  2    TRUE
 3:   Aug    5  B  3      NA
 4:   Oct    3  B  4   FALSE
 5:   Sep    2  B  5      NA
 6:   Aug    1  C  6      NA
 7:   Oct    3  D  7      NA
 8:   Oct    3  E  8   FALSE
 9:   Aug    4  E  9      NA
10:   Oct   NA  F 10      NA
11:   Aug    5  F 11      NA
where Greater is NA for the rows that have no match.
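A minimal sketch of such an update join on made-up tables (X, Y, and the column flag are hypothetical names chosen for illustration) shows the same effect:

```r
library(data.table)

X <- data.table(id = c(1L, 2L, 3L), x = c("a", "b", "c"))
Y <- data.table(id = c(3L, 1L, 4L), flag = c(TRUE, FALSE, TRUE))

# Update join: X is modified by reference; i.flag refers to the flag column of Y
X[Y, on = .(id), flag := i.flag]

# X keeps its three rows in the original order.
# id 2 has no match in Y, so its flag is NA; id 4 exists only in Y and is not added.
print(X)
```

Since := changes X in place rather than building a joined copy, this form adds a column without reordering or duplicating the original table.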