Determine whether each factor level is monotonically increasing

Time: 2018-12-26 12:12:07

Tags: r

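I need to know whether the values within each level of a factor are increasing. I have seen How to check if a sequence of numbers is monotonically increasing (or decreasing)?, but I don't know how to apply it to each level separately. Suppose there is a data frame df that is split by person. Each person has heights recorded over several years. I want to verify that the dataset is correct, so I need to know whether height increases for each person:

I tried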

Results<- by(df, df$person, 
                    function(x) {data = x,
                    all(x == cummax(height))
                    }
  )

but that does not work. And

Results<- by(df, df$person, 
                    all(height == cummax(height))
                    }
  )

does not work either. I get a message that height cannot be found. What am I doing wrong here?

A small extract of the data:

Serial_number    Amplification    Voltage
1    608004648    111.997    379.980
2    608004648    123.673    381.968
3    608004648    137.701    383.979
4    608004648    154.514    385.973
5    608004648    175.331    387.980
6    608004648    201.379    389.968
7    608004649    118.753    378.080
8    608004649    131.739    380.085
9    608004649    147.294    382.082
10    608004649    166.238    384.077
11    608004649    189.841    386.074
12    608004649    220.072    388.073
13    608004650    115.474    382.066
14    608004650    127.838    384.063
15    608004650    142.602    386.064
16    608004650    160.452    388.056
17    608004650    182.732    390.060
18    608004650    211.035    392.065

Serial_number is the factor, and I want to check for each serial number whether the corresponding Amplification values are increasing.

2 answers:

Answer 0: (score: 2)

Something like

vapply(unique(df$person), 
     function (k) all(diff(df$height[df$person == k]) >= 0), # or '> 0' if strictly mon. incr.
     logical(1))
# returns
[1]  TRUE FALSE FALSE

set.seed(123)
df <- data.frame(person = c("A","B", "C","A","A","C","B"), height = runif(7, 1.75, 1.85))
df
  person   height
1      A 1.778758
2      B 1.828831
3      C 1.790898
4      A 1.838302
5      A 1.844047
6      C 1.754556
7      B 1.802811
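
The same per-person check can also be written with tapply, which splits the vector by group directly. This is only a minimal sketch under some assumptions: the data values are typed in from the printout above instead of being regenerated with runif, and the helper name is_increasing and its strict argument are made up for illustration.

```r
# Toy data copied from the printout above (not regenerated with runif)
df <- data.frame(
  person = c("A", "B", "C", "A", "A", "C", "B"),
  height = c(1.778758, 1.828831, 1.790898, 1.838302,
             1.844047, 1.754556, 1.802811)
)

# Hypothetical helper: TRUE if v never decreases,
# or if v always increases when strict = TRUE
is_increasing <- function(v, strict = FALSE) {
  d <- diff(v)
  if (strict) all(d > 0) else all(d >= 0)
}

# tapply splits height by person and applies the helper to each piece
res <- tapply(df$height, df$person, is_increasing)
res
```

For this data the result is TRUE for A and FALSE for B and C, matching the vapply output. Extra arguments are passed through, so tapply(df$height, df$person, is_increasing, strict = TRUE) would run the strict variant.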

Answer 1: (score: 1)

We can do a group-by operation

library(dplyr)
df %>%
   group_by(Serial_number) %>%
   summarise(index = all(sign(Amplification - 
          lag(Amplification, default = first(Amplification))) >= 0))

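Or use by from base R. Because we pass the complete data frame, x (the argument of the anonymous function) is the subset of that data frame for one group, so we can use $ or [[ to extract the column of interest
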
by(df, list(df$Serial_number), FUN = function(x) all(sign(diff(x$Amplification))>=0))
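
This also explains the error in the question: inside the function, height is not a variable on its own, it is a column of x. A minimal sketch of the question's cummax idea with that one change, assuming the person/height toy data from the other answer (values typed in here, not taken from the asker's real data):

```r
# Toy person/height data, assumed for illustration
df <- data.frame(
  person = c("A", "B", "C", "A", "A", "C", "B"),
  height = c(1.778758, 1.828831, 1.790898, 1.838302,
             1.844047, 1.754556, 1.802811)
)

# FUN must be a function; x is the data frame for one person,
# so the column is referenced as x$height rather than height
res_by <- by(df, df$person, function(x) all(x$height == cummax(x$height)))
res_by
```

A value equals its running maximum exactly when nothing before it was larger, so this tests for a non-decreasing sequence. Here it should report TRUE for A and FALSE for B and C.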

Or use data.table

library(data.table)
setDT(df)[, .(index = all(sign(Amplification - shift(Amplification, 
          fill = first(Amplification))) >=0)), .(Serial_number)]

Data

df <- structure(list(Serial_number = c(608004648L, 608004648L, 608004648L, 
608004648L, 608004648L, 608004648L, 608004649L, 608004649L, 608004649L, 
608004649L, 608004649L, 608004649L, 608004650L, 608004650L, 608004650L, 
608004650L, 608004650L, 608004650L), Amplification = c(111.997, 
123.673, 137.701, 154.514, 175.331, 201.379, 118.753, 131.739, 
147.294, 166.238, 189.841, 220.072, 115.474, 127.838, 142.602, 
160.452, 182.732, 211.035), Voltage = c(379.98, 381.968, 383.979, 
385.973, 387.98, 389.968, 378.08, 380.085, 382.082, 384.077, 
386.074, 388.073, 382.066, 384.063, 386.064, 388.056, 390.06, 
392.065)), class = "data.frame", row.names = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", 
"16", "17", "18"))
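
As a further base R variant that is not part of the answer above, is.unsorted can replace the diff/sign test. A minimal sketch, with the Serial_number and Amplification columns rebuilt compactly from the same values so that the snippet runs on its own:

```r
# Same Serial_number / Amplification values as in the data above
df <- data.frame(
  Serial_number = rep(c(608004648L, 608004649L, 608004650L), each = 6),
  Amplification = c(111.997, 123.673, 137.701, 154.514, 175.331, 201.379,
                    118.753, 131.739, 147.294, 166.238, 189.841, 220.072,
                    115.474, 127.838, 142.602, 160.452, 182.732, 211.035)
)

# is.unsorted(v) is TRUE when v is not in non-decreasing order,
# so its negation is the monotonicity check for each serial number
increasing <- tapply(df$Amplification, df$Serial_number,
                     function(v) !is.unsorted(v))
increasing
```

All three serial numbers should come out TRUE for this extract. Passing strictly = TRUE to is.unsorted would additionally treat ties as a violation.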