I am still a beginner in R programming, so please bear with a basic question. I have data in a file that looks like this.
grep "lcost" inflection_point.trc
AP: lcost=4.00, rcost=6.02
AP: lcost=74340.93, rcost=249.97
AP: lcost=37172.17, rcost=128.50
AP: lcost=18587.79, rcost=6.24
AP: lcost=9295.60, rcost=6.13
AP: lcost=4649.71, rcost=6.08
AP: lcost=2326.56, rcost=6.05
AP: lcost=1165.19, rcost=6.04
AP: lcost=584.30, rcost=6.03
AP: lcost=294.06, rcost=6.03
AP: lcost=148.94, rcost=6.02
.....
grep "inflection point at card" inflection_point.trc
AP: Costing Nested Loops Join for inflection point at card 1.35
AP: Costing Hash Join for inflection point at card 1.35
AP: Costing Nested Loops Join for inflection point at card 182361.04
AP: Costing Hash Join for inflection point at card 182361.04
AP: Costing Nested Loops Join for inflection point at card 91181.20
AP: Costing Hash Join for inflection point at card 91181.20
AP: Costing Nested Loops Join for inflection point at card 45591.27
AP: Costing Hash Join for inflection point at card 45591.27
AP: Costing Nested Loops Join for inflection point at card 22796.31
AP: Costing Hash Join for inflection point at card 22796.31
AP: Costing Nested Loops Join for inflection point at card 11398.83
AP: Costing Hash Join for inflection point at card 11398.83
.....
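The requirement is to draw a line plot of the lcost and rcost values in R, with the x-axis values taken from the "inflection point" lines.
I tried to build a data frame with grep but got nowhere. I also don't know how to load these values into a data frame and plot lcost and rcost as lines against the x-axis values.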
> dataframe <- grep ('lcost',readLines("inflection_point.trc"),value=TRUE)
[1] "AP: lcost=4.00, rcost=6.02" "AP: lcost=74340.93, rcost=249.97"
[3] "AP: lcost=37172.17, rcost=128.50" "AP: lcost=18587.79, rcost=6.24"
[5] "AP: lcost=9295.60, rcost=6.13" "AP: lcost=4649.71, rcost=6.08"
[7] "AP: lcost=2326.56, rcost=6.05" "AP: lcost=1165.19, rcost=6.04"
[9] "AP: lcost=584.30, rcost=6.03" "AP: lcost=294.06, rcost=6.03"
[11] "AP: lcost=148.94, rcost=6.02" "AP: lcost=75.97, rcost=6.02"
[13] "AP: lcost=39.69, rcost=6.02" "AP: lcost=21.75, rcost=6.02"
[15] "AP: lcost=12.78, rcost=6.02" "AP: lcost=7.89, rcost=6.02"
[17] "AP: lcost=5.85, rcost=6.02" "AP: lcost=7.08, rcost=6.02"
[19] "AP: lcost=6.26, rcost=6.02" "AP: lcost=6.26, rcost=6.02"
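Note that grep(..., value=TRUE) returns a character vector of the matching lines, not a data frame, so the numbers still have to be pulled out of each string. A minimal base-R sketch of that step, assuming every cost line has the exact "lcost=<number>, rcost=<number>" shape shown above (the inline sample vector and the name `costs` are only for illustration; a real script would read the file with readLines):

```r
# Sample lines copied from the output above; in practice use
# lines <- readLines("inflection_point.trc")
lines <- c(
  "AP: lcost=4.00, rcost=6.02",
  "AP: lcost=74340.93, rcost=249.97",
  "AP: lcost=37172.17, rcost=128.50"
)

# Keep only the cost lines (fixed = TRUE treats the pattern as a literal string)
cost_lines <- grep("lcost=", lines, value = TRUE, fixed = TRUE)

# Capture the number that follows each label and convert it to double
costs <- data.frame(
  lcost = as.numeric(sub(".*lcost=([0-9.]+).*", "\\1", cost_lines)),
  rcost = as.numeric(sub(".*rcost=([0-9.]+).*", "\\1", cost_lines))
)
str(costs)
```

Because as.numeric is applied while the columns are built, both columns come out as double rather than character or factor.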
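Any help would be a great way for me to learn R.
Here is what I have come up with so far. Could someone help me draw the line plot with ggplot? Is there a simpler way to derive the data than the approach I used? And is there a way to convert every column of a data frame to double?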
lines <- readLines("inflection_point.trc")
library(reshape2)
# Strip the "AP:", "lcost=" and "rcost=" labels, then split on the comma into two columns
fd1 <- colsplit(string=gsub("[A-Za-z]+[[:punct:]]", "", grep("cost=[0-9]+", lines, value=TRUE)), pattern=",", names=c("HASH", "NESTED"))
fd1
HASH NESTED
1 4.00 6.02
2 74340.93 249.97
3 37172.17 128.50
4 18587.79 6.24
5 9295.60 6.13
6 4649.71 6.08
7 2326.56 6.05
8 1165.19 6.04
9 584.30 6.03
10 294.06 6.03
11 148.94 6.02
12 75.97 6.02
13 39.69 6.02
14 21.75 6.02
15 12.78 6.02
16 7.89 6.02
17 5.85 6.02
18 7.08 6.02
19 6.26 6.02
20 6.26 6.02
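# Drop letters, whitespace and colons so only the card value remains; unique() collapses the Nested Loops / Hash Join pair for each card
fd2 <- data.frame(Card= unique(gsub( "([[:alpha:]]|\\s|:)", "", grep(".*inflection point at card", lines, value=TRUE))))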
fd2
Card
1 1.35
2 182361.04
3 91181.20
4 45591.27
5 22796.31
6 11398.83
7 5700.09
8 2850.72
9 1426.04
10 713.69
11 357.52
12 179.44
13 90.39
14 45.87
15 23.61
16 12.48
17 6.92
18 9.70
19 8.31
20 7.61
require(dplyr)
fd3 <- bind_cols(fd1,fd2)
fd3
Source: local data frame [20 x 3]
HASH NESTED Card
(dbl) (dbl) (fctr)
1 4.00 6.02 1.35
2 74340.93 249.97 182361.04
3 37172.17 128.50 91181.20
4 18587.79 6.24 45591.27
5 9295.60 6.13 22796.31
6 4649.71 6.08 11398.83
7 2326.56 6.05 5700.09
8 1165.19 6.04 2850.72
9 584.30 6.03 1426.04
10 294.06 6.03 713.69
11 148.94 6.02 357.52
12 75.97 6.02 179.44
13 39.69 6.02 90.39
14 21.75 6.02 45.87
15 12.78 6.02 23.61
16 7.89 6.02 12.48
17 5.85 6.02 6.92
18 7.08 6.02 9.70
19 6.26 6.02 8.31
20 6.26 6.02 7.61
fd3 <- fd3[-1,]
fd3
Source: local data frame [19 x 3]
HASH NESTED Card
(dbl) (dbl) (fctr)
1 74340.93 249.97 182361.04
2 37172.17 128.50 91181.20
3 18587.79 6.24 45591.27
4 9295.60 6.13 22796.31
5 4649.71 6.08 11398.83
6 2326.56 6.05 5700.09
7 1165.19 6.04 2850.72
8 584.30 6.03 1426.04
9 294.06 6.03 713.69
10 148.94 6.02 357.52
11 75.97 6.02 179.44
12 39.69 6.02 90.39
13 21.75 6.02 45.87
14 12.78 6.02 23.61
15 7.89 6.02 12.48
16 5.85 6.02 6.92
17 7.08 6.02 9.70
18 6.26 6.02 8.31
19 6.26 6.02 7.61
> is.data.frame(fd3)
[1] TRUE
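On the conversion question, the Card column shows up as (fctr) above. Calling as.numeric directly on a factor returns its internal level codes, so the usual route is through as.character first. Below is a rough sketch of converting every column and then plotting, not a definitive solution. The small `fd3` here is a hypothetical stand-in built from the first three rows shown above, `to_double` and `long` are illustrative names, and the log scales are my own assumption because the values span several orders of magnitude. The plotting part only runs if ggplot2 is installed.

```r
# Hypothetical stand-in for fd3, using the first rows printed above.
# Card is deliberately a factor to mirror the (fctr) column.
fd3 <- data.frame(
  HASH   = c(74340.93, 37172.17, 18587.79),
  NESTED = c(249.97, 128.50, 6.24),
  Card   = factor(c("182361.04", "91181.20", "45591.27"))
)

# Factors go through character first; other columns are coerced directly
to_double <- function(col) {
  if (is.factor(col)) as.numeric(as.character(col)) else as.numeric(col)
}
fd3[] <- lapply(fd3, to_double)  # fd3[] keeps the data frame structure
str(fd3)

# Long format: one row per (Card, series) pair, which is what ggplot expects
long <- data.frame(
  Card = rep(fd3$Card, times = 2),
  join = rep(c("HASH", "NESTED"), each = nrow(fd3)),
  cost = c(fd3$HASH, fd3$NESTED)
)

# Plot only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Card, y = cost, colour = join)) +
    geom_line() +
    scale_x_log10() +   # assumption: log axes, since values range from ~6 to ~180000
    scale_y_log10()
  print(p)
}
```

With the real data, the first block would be skipped and the existing fd3 used as is; the long-format step could equally be done with reshape2::melt, which is already loaded earlier.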