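I need to know whether the values within each level of a factor are increasing. I have seen How to check if a sequence of numbers is monotonically increasing (or decreasing)?, but I don't know how to apply it to each level separately. Suppose there is a data frame df that is divided into several persons. Each person has a height recorded over several years. Now I want to know whether the data set is correct, so I need to check whether height is increasing, per person:
I tried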
Results<- by(df, df$person,
function(x) {data = x,
all(x == cummax(height))
}
)
but that does not work. Neither does
Results<- by(df, df$person,
all(height == cummax(height))
}
)
either. I get a message that height is not found. What am I doing wrong here?
A small extract of the data:
Serial_number Amplification Voltage
1 608004648 111.997 379.980
2 608004648 123.673 381.968
3 608004648 137.701 383.979
4 608004648 154.514 385.973
5 608004648 175.331 387.980
6 608004648 201.379 389.968
7 608004649 118.753 378.080
8 608004649 131.739 380.085
9 608004649 147.294 382.082
10 608004649 166.238 384.077
11 608004649 189.841 386.074
12 608004649 220.072 388.073
13 608004650 115.474 382.066
14 608004650 127.838 384.063
15 608004650 142.602 386.064
16 608004650 160.452 388.056
17 608004650 182.732 390.060
18 608004650 211.035 392.065
Serial_number is the factor, and I want to check, for each serial number, whether the corresponding Amplification values are increasing.
Answer 0 (score: 2)
Something like
vapply(unique(df$person),
function (k) all(diff(df$height[df$person == k]) >= 0), # or '> 0' if strictly mon. incr.
logical(1))
# returns
[1] TRUE FALSE FALSE
with
set.seed(123)
df <- data.frame(person = c("A","B", "C","A","A","C","B"), height = runif(7, 1.75, 1.85))
df
person height
1 A 1.778758
2 B 1.828831
3 C 1.790898
4 A 1.838302
5 A 1.844047
6 C 1.754556
7 B 1.802811
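As a possible variation on the diff() check, base R's is.unsorted() expresses the same test and has a strictly argument for the strict case. The sketch below is only an illustration: the heights are copied from the printed example (so they are rounded), and the tied series is a made-up input to show where the strict and non-strict checks differ.

```r
# Data copied from the printed example above (rounded values).
df <- data.frame(
  person = c("A", "B", "C", "A", "A", "C", "B"),
  height = c(1.778758, 1.828831, 1.790898, 1.838302,
             1.844047, 1.754556, 1.802811)
)

# Non-strict check per person: equal consecutive values are allowed.
non_strict <- tapply(df$height, df$person, function(h) !is.unsorted(h))

# Strict check per person: equal consecutive values count as a violation.
strict <- tapply(df$height, df$person,
                 function(h) !is.unsorted(h, strictly = TRUE))

# Hypothetical series with a tie, where the two checks disagree.
tied <- c(1.70, 1.70, 1.72)
tie_non_strict <- !is.unsorted(tied)                  # like all(diff(tied) >= 0)
tie_strict     <- !is.unsorted(tied, strictly = TRUE) # like all(diff(tied) > 0)

non_strict
strict
```

Note that is.unsorted() returns NA when the input contains missing values, so series with NA would need separate handling.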
Answer 1 (score: 1)
We can do a group-by operation
library(dplyr)
df %>%
group_by(Serial_number) %>%
summarise(index = all(sign(Amplification -
lag(Amplification, default = first(Amplification))) >= 0))
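Or use by from base R. Because we pass the complete dataset, x (the argument of the anonymous function) is the subset of the dataset for one group, so the column has to be extracted with $ or [[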
by(df, list(df$Serial_number), FUN = function(x) all(sign(diff(x$Amplification))>=0))
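The same point explains the "not found" message in the question: inside the function passed to by, the column only exists as part of x. A minimal sketch of the cummax idea from the question, rewritten with x$Amplification, might look like this. The data frame holds the first three rows of each serial number from the extract, shortened for brevity.

```r
# First three rows per serial number, taken from the data extract.
df <- data.frame(
  Serial_number = rep(c(608004648L, 608004649L, 608004650L), each = 3),
  Amplification = c(111.997, 123.673, 137.701,
                    118.753, 131.739, 147.294,
                    115.474, 127.838, 142.602)
)

# A series is non-decreasing exactly when every value equals the running
# maximum; the column is reached through x, the per-group data frame.
res <- by(df, df$Serial_number,
          function(x) all(x$Amplification == cummax(x$Amplification)))
res
```

For this subset every serial number should come out as TRUE, since the amplification rises in every row.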
Or using data.table
library(data.table)
setDT(df)[, .(index = all(sign(Amplification - shift(Amplification,
fill = first(Amplification))) >=0)), .(Serial_number)]
Data:
df <- structure(list(Serial_number = c(608004648L, 608004648L, 608004648L,
608004648L, 608004648L, 608004648L, 608004649L, 608004649L, 608004649L,
608004649L, 608004649L, 608004649L, 608004650L, 608004650L, 608004650L,
608004650L, 608004650L, 608004650L), Amplification = c(111.997,
123.673, 137.701, 154.514, 175.331, 201.379, 118.753, 131.739,
147.294, 166.238, 189.841, 220.072, 115.474, 127.838, 142.602,
160.452, 182.732, 211.035), Voltage = c(379.98, 381.968, 383.979,
385.973, 387.98, 389.968, 378.08, 380.085, 382.082, 384.077,
386.074, 388.073, 382.066, 384.063, 386.064, 388.056, 390.06,
392.065)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18"))
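If the goal is to find out where a series stops increasing, and not only whether it does, one option is to compute the within-group differences with ave from base R. This is a sketch under assumptions: the second value for serial number 608004649 has been changed to a hypothetical 117.000 so that there is a drop to detect, and the column name decreased is invented for illustration.

```r
# Shortened data; 117.000 is a hypothetical value inserted to create a drop.
df <- data.frame(
  Serial_number = rep(c(608004648L, 608004649L), each = 3),
  Amplification = c(111.997, 123.673, 137.701,
                    118.753, 117.000, 147.294)
)

# Difference to the previous row within each serial number; the first row
# of each group has no predecessor, so it gets 0.
step <- ave(df$Amplification, df$Serial_number,
            FUN = function(a) c(0, diff(a)))

# Flag rows whose value is lower than the previous one in the same group.
df$decreased <- step < 0

# Per-group verdict: TRUE when no row in the group was flagged.
ok <- tapply(!df$decreased, df$Serial_number, all)

df[df$decreased, ]
ok
```

Here only the fifth row is flagged, so the first serial number passes and the second fails.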