I have the following data frame stored:
Source: local data frame [18 x 3]
Groups: instance [?]
instance V2 wtime
(fctr) (fctr) (dbl)
1 CCRG10 BranchDBMS 2.1845122
2 CCRG10 CacheDBMS 0.8619093
3 CCRG20 BranchDBMS 7.3522605
4 CCRG20 CacheDBMS 2.5523066
5 CCRG30 BranchDBMS 15.7318869
6 CCRG30 CacheDBMS 5.1411876
7 CCRG40 BranchDBMS 31.7315724
8 CCRG40 CacheDBMS 7.6714212
9 CCRG50 BranchDBMS 58.0909133
10 CCRG50 CacheDBMS 11.3979914
11 CCRG60 BranchDBMS 78.5095645
12 CCRG60 CacheDBMS 15.5988044
13 CCRG70 BranchDBMS 94.0637485
14 CCRG70 CacheDBMS 20.2977642
15 CCRG80 BranchDBMS 102.8716548
16 CCRG80 CacheDBMS 25.0142898
17 CCRG90 BranchDBMS 100.5247555
18 CCRG90 CacheDBMS 28.3753977
I would like to transform this table into a new one, such as:
Source: local data frame [9 x 2]
Groups: instance [?]
instance speedup
(fctr) (dbl)
1 CCRG10 2.5345035
...
For each instance, I want to divide the wtime of BranchDBMS by the wtime of CacheDBMS; for CCRG10 that is 2.18 / 0.86 = 2.53.
How can I automate this?
Answer 0 (score: 2)
Judging by the posted output, you appear to manage your table with dplyr, so a tidyr approach would be a natural choice.
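# Load the required packages (magrittr provides the %<>% assignment pipe)
Vectorize(require)(package = c("dplyr", "magrittr", "tidyr"),
                   character.only = TRUE)
# dta has columns V2 (instance), V3 (DBMS) and V4 (wtime); see the read.table call at the end
dta %<>%
  spread(key = V3, value = V4) %>%              # one column per DBMS
  mutate(wtimRes = BranchDBMS / CacheDBMS) %>%  # BranchDBMS time divided by CacheDBMS time
  rename(instance = V2)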
> head(dta, 5)
instance BranchDBMS CacheDBMS wtimRes
1 CCRG10 2.184512 0.8619093 2.534504
2 CCRG20 7.352260 2.5523066 2.880634
3 CCRG30 15.731887 5.1411876 3.059971
4 CCRG40 31.731572 7.6714212 4.136336
5 CCRG50 58.090913 11.3979914 5.096592
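In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same idea, assuming a recent tidyr is installed and using the column names from the question (instance, V2, wtime) rather than the V2/V3/V4 names above. The name dta_q and the four hand-typed rows are only for illustration.

```r
library(dplyr)
library(tidyr)

# A few rows of the question's data, typed in by hand (illustrative only)
dta_q <- data.frame(
  instance = c("CCRG10", "CCRG10", "CCRG20", "CCRG20"),
  V2       = c("BranchDBMS", "CacheDBMS", "BranchDBMS", "CacheDBMS"),
  wtime    = c(2.1845122, 0.8619093, 7.3522605, 2.5523066)
)

speedup <- dta_q %>%
  # one row per instance, one column per DBMS
  pivot_wider(names_from = V2, values_from = wtime) %>%
  # ratio of BranchDBMS time to CacheDBMS time
  mutate(speedup = BranchDBMS / CacheDBMS) %>%
  # keep only the two columns requested in the question
  select(instance, speedup)

print(speedup)
```

The final select() reduces the result to the two-column instance/speedup layout asked for in the question.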
Of course, if needed, you may want to gather your results into a single column.
dta %<>%
gather(key = key, value = value, -instance)
which would produce:
> head(dta,6)
instance key value
1 CCRG10 BranchDBMS 2.184512
2 CCRG20 BranchDBMS 7.352260
3 CCRG30 BranchDBMS 15.731887
4 CCRG40 BranchDBMS 31.731572
5 CCRG50 BranchDBMS 58.090913
6 CCRG60 BranchDBMS 78.509564
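Likewise, gather() is superseded by pivot_longer() in tidyr 1.0.0 and later. Below is a rough sketch of the equivalent reshaping, assuming a recent tidyr. The data frame wide is a hand-typed, two-row stand-in for the wide table shown earlier, not the full data.

```r
library(dplyr)
library(tidyr)

# Hand-typed stand-in for the wide table produced by the spread() step
wide <- data.frame(
  instance   = c("CCRG10", "CCRG20"),
  BranchDBMS = c(2.1845122, 7.3522605),
  CacheDBMS  = c(0.8619093, 2.5523066)
) %>%
  mutate(wtimRes = BranchDBMS / CacheDBMS)

# Stack every column except instance into key/value pairs
long <- wide %>%
  pivot_longer(cols = -instance, names_to = "key", values_to = "value")

print(long)
```

One difference to be aware of: pivot_longer() keeps the rows of each instance together (BranchDBMS, CacheDBMS, wtimRes for CCRG10, then the same for CCRG20), whereas gather() stacks the result column by column, as in the output above.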
The data used above was read in as follows:
dtaTxt <- " instance V2 wtime
(fctr) (fctr) (dbl)
1 CCRG10 BranchDBMS 2.1845122
2 CCRG10 CacheDBMS 0.8619093
3 CCRG20 BranchDBMS 7.3522605
4 CCRG20 CacheDBMS 2.5523066
5 CCRG30 BranchDBMS 15.7318869
6 CCRG30 CacheDBMS 5.1411876
7 CCRG40 BranchDBMS 31.7315724
8 CCRG40 CacheDBMS 7.6714212
9 CCRG50 BranchDBMS 58.0909133
10 CCRG50 CacheDBMS 11.3979914
11 CCRG60 BranchDBMS 78.5095645
12 CCRG60 CacheDBMS 15.5988044
13 CCRG70 BranchDBMS 94.0637485
14 CCRG70 CacheDBMS 20.2977642
15 CCRG80 BranchDBMS 102.8716548
16 CCRG80 CacheDBMS 25.0142898
17 CCRG90 BranchDBMS 100.5247555
18 CCRG90 CacheDBMS 28.3753977"
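# Skip the two header lines and drop the row-number column,
# leaving V2 (instance), V3 (DBMS) and V4 (wtime)
dta <- read.table(textConnection(dtaTxt), header = FALSE,
                  colClasses = c("NULL", NA, NA, NA), skip = 2)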