我有以下2个数据帧:
> bvg1
Parameters X18.Oct.14 X19.Oct.14 X20.Oct.14 X21.Oct.14 X22.Oct.14 X23.Oct.14 X24.Oct.14
1 24K Equivalent Plan 29.00 29.60 33.80 36.60 35.30 31.90 29.00
2 24K Equivalent Act 28.80 31.00 35.40 35.90 34.70 33.40 31.90
3 Plan Rep WS 2463.00 2513.00 2869.00 3115.00 2999.00 2714.00 2468.00
4 Act Rep WS 2447.00 2633.00 3013.00 3054.00 2953.00 2842.00 2714.00
5 Rep WS Var -16.00 120.00 144.00 -61.00 -46.00 128.00 246.00
6 Plan Rep Intakes 568.00 461.00 1159.00 1146.00 1126.00 1124.00 1106.00
7 Act Rep Intakes 707.00 494.00 1106.00 1096.00 1274.00 1087.00 1101.00
8 Rep Intakes Var 139.00 33.00 -53.00 -50.00 148.00 -37.00 -5.00
9 Plan Rep Comps_DL 468.00 54.00 836.00 1190.00 1327.00 1286.00 1108.00
10 Act Rep Comps_DL 471.00 70.00 995.00 1137.00 1323.00 1150.00 1073.00
11 Rep Comps Var_DL 3.00 16.00 159.00 -53.00 -4.00 -136.00 -35.00
12 Plan Rep Mandays_DL 148.00 19.00 260.00 368.00 412.00 398.00 345.00
13 Act Rep Mandays_DL 147.00 19.00 303.00 359.00 423.00 374.00 348.00
14 Rep Mandays Var_DL -1.00 1.00 43.00 -9.00 12.00 -24.00 3.00
15 Plan FVR Mandays_DL 0.00 0.00 4.00 18.00 18.00 18.00 18.00
16 Act FVR Mandays_DL 0.00 0.00 4.00 7.00 8.00 8.00 7.00
17 FVR Mandays Var_DL 0.00 0.00 0.00 -11.00 -10.00 -10.00 -11.00
18 Plan Rep Prod_DL 3.16 2.88 3.21 3.23 3.22 3.23 3.21
19 Act Rep Prod_DL 3.21 3.62 3.28 3.16 3.12 3.07 3.08
20 Rep Prod Var_DL 0.05 0.74 0.07 -0.07 -0.10 -0.16 -0.13
> bvg2
Parameters X18.Oct X19.Oct X20.Oct X21.Oct X22.Oct X23.Oct X24.Oct
1 24K Equivalent Plan 30.50 31.30 35.10 36.10 33.60 28.80 25.50
2 24K Equivalent Act 31.40 33.40 36.60 38.10 36.80 34.40 32.10
3 Plan Rep WS 3419.00 3509.00 3933.00 4041.00 3764.00 3220.00 2859.00
4 Act Rep WS 3514.00 3734.00 4098.00 4271.00 4122.00 3852.00 3591.00
5 Rep WS Var 95.00 225.00 165.00 230.00 358.00 632.00 732.00
6 Plan Rep Intakes 813.00 613.00 1559.00 1560.00 1506.00 1454.00 1410.00
7 Act Rep Intakes 964.00 602.00 1629.00 1532.00 1657.00 1507.00 1439.00
8 Rep Intakes Var 151.00 -11.00 70.00 -28.00 151.00 53.00 29.00
9 Plan Rep Comps_DL 675.00 175.00 1331.00 1732.00 1938.00 1706.00 1493.00
10 Act Rep Comps_DL 718.00 224.00 1389.00 1609.00 1848.00 1698.00 1537.00
11 Rep Comps Var_DL 43.00 49.00 58.00 -123.00 -90.00 -8.00 44.00
12 Plan Rep Mandays_DL 203.00 58.00 428.00 541.00 605.00 536.00 475.00
13 Act Rep Mandays_DL 215.00 63.00 472.00 542.00 608.00 556.00 523.00
14 Rep Mandays Var_DL 12.00 5.00 44.00 2.00 3.00 20.00 48.00
15 Plan FVR Mandays_DL 0.00 0.00 1.00 12.00 2.00 32.00 57.00
16 Act FVR Mandays_DL 0.00 0.00 2.00 2.00 5.00 5.00 5.00
17 FVR Mandays Var_DL 0.00 0.00 1.00 -10.00 3.00 -27.00 -52.00
18 Plan Rep Prod_DL 3.33 3.03 3.11 3.20 3.20 3.18 3.14
19 Act Rep Prod_DL 3.34 3.56 2.94 2.97 3.04 3.05 2.94
20 Rep Prod Var_DL 0.01 0.53 -0.17 -0.23 -0.16 -0.13 -0.20
这是一个时间序列数据,即24K等效计划于10月18日为29,10月19日为29.60,10月20日为33.80。第一个数据框有一个业务单位的数据,第二个数据框有不同业务单位的数据。
我想将数据帧合并为1,并希望分析差异,即它们的值不同。绘制ggplots,如2个直方图,显示差异,时间序列图等。
我尝试过以下方法: 我可以通过以下方式合并两个数据帧:
joined = rbind(bvg1, bvg2)
然而,我无法识别记录是否属于bvg1或bvg2 df。
如果我添加一个额外的列,即
bvg1$id = "bvg1"
bvg2$id = "bvg2"
然后merge命令不起作用并给出以下错误:
Error in match.names(clabs, names(xi)) :
names do not match previous names
任何示例代码都将受到高度赞赏。
答案 0 :(得分:3)
您可以通过剥离.
后跟bvg1
中的数字来匹配两个数据集的列名称。这可以使用regex
完成。在下面的代码中,使用lookbehind
正则表达式。它与后备(?<=[A-Za-])
匹配,即alphabet
后跟.
,后跟一个或多个元素.*
到字符串$
的末尾,并删除""
}}
colnames(bvg1) <-gsub("(?<=[A-Za-z])\\..*$", "", colnames(bvg1), perl=TRUE)
res <- rbind(bvg1, bvg2)
dim(res)
#[1] 40 9
head(res,3)
# Parameters X18.Oct X19.Oct X20.Oct X21.Oct X22.Oct X23.Oct X24.Oct
#1 24K Equivalent Plan 29.0 29.6 33.8 36.6 35.3 31.9 29.0
#2 24K Equivalent Act 28.8 31.0 35.4 35.9 34.7 33.4 31.9
#3 Plan Rep WS 2463.0 2513.0 2869.0 3115.0 2999.0 2714.0 2468.0
# id
#1 bvg1
#2 bvg1
#3 bvg1