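I am creating an R Sweave file that compiles a PDF report of software test data. The data mostly comes from a SQL Server table. I use the TestNum column to make it easier to step through the versions, since the version numbers themselves are strings, and my R script has a section that is supposed to find the latest version and the one before it. The data looks like this: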
| FileName | Version | Category | Value | Date | TestNum |
|:--------:|:-------:|:--------:|:-----:|:-------------------:|:-------:|
| File1 | 1.0.12 | Run Time | 74 | 2016-10-01 12:00:00 | 1 |
| File1 | 1.0.12 | Totals | 468 | 2016-10-01 12:00:00 | 1 |
| File1 | 1.0.12 | DB Size | 589 | 2016-10-01 12:00:00 | 1 |
| File2 | 1.0.12 | Run Time | 81 | 2016-10-01 12:00:00 | 1 |
| File2 | 1.0.12 | Totals | 351 | 2016-10-01 12:00:00 | 1 |
| File2 | 1.0.12 | DB Size | 625 | 2016-10-01 12:00:00 | 1 |
| File1 | 1.0.15 | Run Time | 74 | 2016-10-01 12:00:00 | 2 |
| File1 | 1.0.15 | Totals | 468 | 2016-10-01 12:00:00 | 2 |
| File1 | 1.0.15 | DB Size | 589 | 2016-10-01 12:00:00 | 2 |
| File2 | 1.0.15 | Run Time | 81 | 2016-10-01 12:00:00 | 2 |
| File2 | 1.0.15 | Totals | 351 | 2016-10-01 12:00:00 | 2 |
| File2 | 1.0.15 | DB Size | 625 | 2016-10-01 12:00:00 | 2 |
| File1 | 1.0.17 | Run Time | 74 | 2016-10-01 12:00:00 | 3 |
| File1 | 1.0.17 | Totals | 468 | 2016-10-01 12:00:00 | 3 |
| File1 | 1.0.17 | DB Size | 589 | 2016-10-01 12:00:00 | 3 |
| File2 | 1.0.17 | Run Time | 81 | 2016-10-01 12:00:00 | 3 |
| File2 | 1.0.17 | Totals | 351 | 2016-10-01 12:00:00 | 3 |
| File2 | 1.0.17 | DB Size | 625 | 2016-10-01 12:00:00 | 3 |
| File1 | 1.0.21 | Run Time | 74 | 2016-10-01 12:00:00 | 4 |
| File1 | 1.0.21 | Totals | 468 | 2016-10-01 12:00:00 | 4 |
| File1 | 1.0.21 | DB Size | 589 | 2016-10-01 12:00:00 | 4 |
| File2 | 1.0.21 | Run Time | 81 | 2016-10-01 12:00:00 | 4 |
| File2 | 1.0.21 | Totals | 351 | 2016-10-01 12:00:00 | 4 |
| File2 | 1.0.21 | DB Size | 625 | 2016-10-01 12:00:00 | 4 |
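This is the section of my R script that finds the two versions:
vLatest <- unique(df[df[,"TestNum"] == max(df$TestNum), "Version"])
vPrevious <- unique(df[df[,"TestNum"] == max(df$TestNum)-1, "Version"])
However, sometimes a version of the software is very buggy and crashes on every test. That looks bad in the charts, so I add a row in the SQL database that I use to filter that version out, and the R data frame ends up looking like this:

| FileName | Version | Category | Value | Date | TestNum |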
|:--------:|:-------:|:--------:|:-----:|:-------------------:|:-------:|
| File1 | 1.0.12 | Run Time | 74 | 2016-10-01 12:00:00 | 1 |
| File1 | 1.0.12 | Totals | 468 | 2016-10-01 12:00:00 | 1 |
| File1 | 1.0.12 | DB Size | 589 | 2016-10-01 12:00:00 | 1 |
| File2 | 1.0.12 | Run Time | 81 | 2016-10-01 12:00:00 | 1 |
| File2 | 1.0.12 | Totals | 351 | 2016-10-01 12:00:00 | 1 |
| File2 | 1.0.12 | DB Size | 625 | 2016-10-01 12:00:00 | 1 |
| File1 | 1.0.15 | Run Time | 74 | 2016-10-01 12:00:00 | 2 |
| File1 | 1.0.15 | Totals | 468 | 2016-10-01 12:00:00 | 2 |
| File1 | 1.0.15 | DB Size | 589 | 2016-10-01 12:00:00 | 2 |
| File2 | 1.0.15 | Run Time | 81 | 2016-10-01 12:00:00 | 2 |
| File2 | 1.0.15 | Totals | 351 | 2016-10-01 12:00:00 | 2 |
| File2 | 1.0.15 | DB Size | 625 | 2016-10-01 12:00:00 | 2 |
| File1 | 1.0.21 | Run Time | 74 | 2016-10-01 12:00:00 | 4 |
| File1 | 1.0.21 | Totals | 468 | 2016-10-01 12:00:00 | 4 |
| File1 | 1.0.21 | DB Size | 589 | 2016-10-01 12:00:00 | 4 |
| File2 | 1.0.21 | Run Time | 81 | 2016-10-01 12:00:00 | 4 |
| File2 | 1.0.21 | Totals | 351 | 2016-10-01 12:00:00 | 4 |
| File2 | 1.0.21 | DB Size | 625 | 2016-10-01 12:00:00 | 4 |
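But `vPrevious` is still looking for `TestNum == 3`, so the script breaks. Is there a way to look for the second-highest value instead?
EDIT: As suggested, this is the query I use to create the data frame.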
Answer 0 (score: 2)
You can try using `dense_rank` together with `order by TestNum`. The snippet below gives an example of its usage.
select c.*
from (
select *,dense_rank() over (order by [object_id] desc) as [row_number]
from sys.columns
) c
where c.[row_number] in (1,2)
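To make the ranking behaviour concrete, here is a minimal sketch in R of what a descending dense rank computes when the values have a gap. The vector `test_num` is made up to mirror the filtered table in the question, and base R's `match` over the sorted distinct values is used here as a stand-in for SQL's `dense_rank()`; it is an illustration, not part of the query above.

```r
# TestNum values as they appear once version 3 has been filtered out
# (hypothetical sample mirroring the question's second table)
test_num <- c(1, 1, 2, 2, 4, 4)

# Descending dense rank: the largest value gets rank 1, the next distinct
# value gets rank 2, and a gap in the values leaves no gap in the ranks.
dense_rank_desc <- match(test_num, sort(unique(test_num), decreasing = TRUE))

print(dense_rank_desc)  # 3 3 2 2 1 1
```

Because the ranks are assigned over distinct values, ranks 1 and 2 always pick out the two highest TestNum values that are actually present, whether or not they are consecutive.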
If you can add the SQL query to the question, it may help in giving a more targeted response.
Edit
Tailored to the OP's original query:
select FileName, Version, Category, Value, Date, TestNum
from (
select FileName, Version, Category, Value, Date, TestNum
, dense_rank() over (order by [TestNum] desc) as [row_number]
from Table
where Comments != 'Do Not Include in R Chart'
) t
where t.[row_number] in (1,2)
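If changing the query is not convenient, the same idea can probably be applied on the R side instead. The sketch below is an assumption about how that might look, not something taken from the original script: the small `df` is a hypothetical stand-in for the filtered data frame, and `test_nums` is a helper name introduced only for this example.

```r
# Hypothetical stand-in for the filtered data frame from the question
df <- data.frame(
  Version = c("1.0.12", "1.0.15", "1.0.21"),
  TestNum = c(1, 2, 4),
  stringsAsFactors = FALSE
)

# Distinct TestNum values that are actually present, highest first
test_nums <- sort(unique(df$TestNum), decreasing = TRUE)

# Index into the sorted distinct values instead of subtracting 1 from the maximum
vLatest   <- unique(df[df$TestNum == test_nums[1], "Version"])
vPrevious <- unique(df[df$TestNum == test_nums[2], "Version"])
```

With this sample data `vPrevious` resolves to the version with TestNum 2, even though TestNum 3 is missing. If fewer than two distinct TestNum values exist, `test_nums[2]` is `NA`, so a real script would want to check for that case first.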