I have this data frame:
dput(DF)
structure(list(Metrics = c("db1.Tablespace_Space_Used_(%)", "db1.Tablespace_Space_Used_(%)",
"db1.Tablespace_Space_Used_(%)", "db1.Tablespace_Space_Used_(%)",
"db2.Tablespace_Space_Used_(%)", "db2.Tablespace_Space_Used_(%)",
"db1.Tablespace_Space_Used_(%)", "db2.Tablespace_Space_Used_(%)",
"db1.Tablespace_Space_Used_(%)", "db2.Tablespace_Space_Used_(%)",
"db2.Tablespace_Space_Used_(%)", "db2.Tablespace_Space_Used_(%)",
"db2.Tablespace_Space_Used_(%)", "db1.Tablespace_Space_Used_(%)",
"db1.Tablespace_Space_Used_(%)", "db1.Tablespace_Space_Used_(%)",
"db1.Tablespace_Space_Used_(%)", "db1.Tablespace_Space_Used_(%)",
"db1.Tablespace_Space_Used_(%)", "db1.Tablespace_Space_Used_(%)"
), Date = c(1416257563.98707, 1416257563.98707, 1416257563.98707,
1416257563.98707, 1416257563.98707, 1416257563.98707, 1416257563.98707,
1416257563.98707, 1416257563.98707, 1416257563.98707, 1416257563.98707,
1416257563.98707, 1416257563.98707, 1416257563.98707, 1416257563.98707,
1416257563.98707, 1416257563.98707, 1416257563.98707, 1416257563.98707,
1416257563.98707), Value = c(0, 0.02, 0.01, 0, 0.01, 0.01, 0.07,
0, 2.02, 0, 0, 9.32, 0.02, 9.27, 0, 12.72, 12.72, 12.72, 0.08,
12.72), Type1 = c("type=rac_database", "type=rac_database", "type=rac_database",
"type=rac_database", "type=rac_database", "type=rac_database",
"type=rac_database", "type=rac_database", "type=rac_database",
"type=rac_database", "type=rac_database", "type=rac_database",
"type=rac_database", "type=rac_database", "type=rac_database",
"type=rac_database", "type=rac_database", "type=rac_database",
"type=rac_database", "type=rac_database")), .Names = c("Metrics",
"Date", "Value", "Type1"), class = "data.frame", row.names = c(10092L,
10097L, 10103L, 10104L, 10107L, 10108L, 10111L, 10112L, 10114L,
10115L, 10116L, 10117L, 10118L, 10120L, 10121L, 10188L, 10189L,
10190L, 10192L, 10216L))
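This is a subset of a larger data frame. As you can see, there are multiple different values for the same metric and date. I would like to select only the maximum value for each combination of date and metric, so that for a given date and metric I end up with a single row holding the maximum. Any ideas how I could subset this df?
For example, for Metrics: db1.Tablespace_Space_Used_(%) and Date: 1416257564
I should have a single entry in my df:
db1.Tablespace_Space_Used_(%) 1416257564 12.72 type=rac_database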
Answer 0 (score: 1)
This is the same as the answer given in "Finding maximum value of one column (by group) and inserting value into another data frame in R".
Assuming your data frame is named df:
df_1 <- aggregate(Value ~ Metrics + Date + Type1, df, max)
df_1
#edit: removed 'cbind'
Output
                        Metrics       Date             Type1 Value
1 db1.Tablespace_Space_Used_(%) 1416257564 type=rac_database 12.72
2 db2.Tablespace_Space_Used_(%) 1416257564 type=rac_database  9.32
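To see what the formula does in isolation, here is a minimal sketch on a small made-up data frame. The name `toy` and its values are illustrative assumptions, not taken from the question's data.

```r
# Toy data (hypothetical values) with repeated Metrics/Date pairs
toy <- data.frame(
  Metrics = c("db1", "db1", "db2", "db2"),
  Date    = c(100, 100, 100, 100),
  Value   = c(0.02, 12.72, 9.32, 0.01),
  Type1   = "type=rac_database",
  stringsAsFactors = FALSE
)

# Left of ~ : the column to summarise.
# Right of ~: the grouping columns; one output row per distinct combination.
toy_max <- aggregate(Value ~ Metrics + Date + Type1, data = toy, FUN = max)
print(toy_max)
```

Keep in mind that the formula method of `aggregate` drops rows containing `NA` in any of the listed columns by default (`na.action = na.omit`), which may or may not be what you want on the full data.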
Answer 1 (score: 0)
How about this:
> # find the maximum for Value for each combination of Metrics and Date
> df2 <- aggregate(df$Value, by=list(Metrics=df$Metrics, Date=df$Date), max)
> colnames(df2)[3] <- "Value"
> # add the corresponding Type1 by joining back on Metrics, Date and Value
> # (comparing df columns to df2 columns with == would recycle the shorter vector)
> df2 <- merge(df2, unique(df[, c("Metrics", "Date", "Value", "Type1")]))
> # result
> df2
                        Metrics       Date Value             Type1
1 db1.Tablespace_Space_Used_(%) 1416257564 12.72 type=rac_database
2 db2.Tablespace_Space_Used_(%) 1416257564  9.32 type=rac_database
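If the aim is to keep the whole row that holds the maximum, rather than re-attaching columns afterwards, one possible base R alternative is `ave()`. This is only a sketch on made-up data (the `toy` data frame and its values are assumptions), not the method used in either answer.

```r
# Toy data (hypothetical values); db1 has a tie at its maximum
toy <- data.frame(
  Metrics = c("db1", "db1", "db1", "db2", "db2"),
  Date    = c(100, 100, 100, 100, 100),
  Value   = c(0.02, 12.72, 12.72, 9.32, 0.01),
  Type1   = "type=rac_database",
  stringsAsFactors = FALSE
)

# For every row, compute the maximum Value within its Metrics/Date group.
# drop = TRUE avoids empty groups for combinations that never occur.
grp <- interaction(toy$Metrics, toy$Date, drop = TRUE)
group_max <- ave(toy$Value, grp, FUN = max)

# Keep the rows that reach their group's maximum ...
top <- toy[toy$Value == group_max, ]
# ... and, where several rows tie, keep only the first per group
top <- top[!duplicated(top[, c("Metrics", "Date")]), ]
print(top)
```

Because the rows are filtered rather than aggregated, every other column (here `Type1`) comes along unchanged, and the original row names are preserved.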