I have a data frame df made up of locations and hours.
### dummy data
set.seed(1)
location <- c("loc1", "loc2", "loc3")
locations <- sample(location, size=50, replace=TRUE)
hours <- runif(50, min=0, max=20)
df <- data.frame(locations, hours)
I can cut the data into one-hour bins and create a frequency table of the bins:
### cut data and create frequency table
c <- cut(df$hours, breaks=seq(0,20, by=1), include.lowest=TRUE)
t <- data.frame(table(c))
head(t)
c Freq
1 [0,1] 0
2 (1,2] 4
3 (2,3] 2
4 (3,4] 0
5 (4,5] 4
6 (5,6] 2
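However, I can't work out how to group the data by location first.
How can I use the locations variable to group the data and get something like this?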
location c Freq
1 loc1 [0,1] x1
2 loc1 (1,2] x2
3 loc1 (2,3] x3
4 loc1 (3,4] x4
5 loc1 (4,5] x5
6 loc1 (5,6] x6
...
loc2 [0,1] y1
loc2 (1,2] y2
...
Answer 0: (score: 2)
You could try:
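library(dplyr)
df %>%
  # replace the numeric hours with their one-hour bin
  mutate(hours = cut(hours, breaks=seq(0,20, by=1), include.lowest=TRUE)) %>%
  # cross-tabulate locations by bin, then sort the resulting data frame
  table() %>% data.frame() %>% arrange(locations, hours)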
which gives:
# locations hours Freq
#1 loc1 [0,1] 0
#2 loc1 (1,2] 1
#3 loc1 (2,3] 1
#4 loc1 (3,4] 0
#5 loc1 (4,5] 0
#6 loc1 (5,6] 1
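Answer 1: (score: 1)
# c is the factor of hour bins created with cut() in the question
t <- data.frame(table(df$locations, c))
# sort by location (Var1) so each location's bins appear together
head(t[order(t$Var1), ])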
Var1 c Freq
1 loc1 [0,1] 0
4 loc1 (1,2] 1
7 loc1 (2,3] 1
10 loc1 (3,4] 0
13 loc1 (4,5] 0
16 loc1 (5,6] 1
Alternatively, cbind the bins onto df first. That may be safer if you plan to process the data later.
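The cbind-first idea could look roughly like the sketch below. It is only one reading of that suggestion, not necessarily what the answerer had in mind. The names hour_bin, df2 and freq are made up here for illustration, and the dummy data is rebuilt from the question so the block runs on its own.

```r
# Rebuild the dummy data from the question
set.seed(1)
location <- c("loc1", "loc2", "loc3")
locations <- sample(location, size = 50, replace = TRUE)
hours <- runif(50, min = 0, max = 20)
df <- data.frame(locations, hours)

# Attach the bins to the data frame as a column, so each bin stays
# on the same row as its location (hour_bin is an illustrative name)
df2 <- cbind(df, hour_bin = cut(df$hours, breaks = seq(0, 20, by = 1),
                                include.lowest = TRUE))

# Cross-tabulate location by bin; naming the arguments gives readable
# column names instead of Var1/Var2
freq <- as.data.frame(table(locations = df2$locations,
                            hour_bin = df2$hour_bin))

# Sort so that each location's bins appear together, in bin order
freq <- freq[order(freq$locations, freq$hour_bin), ]
head(freq)
```

Keeping the bins inside the data frame means later filtering or reordering of df2 cannot leave a separate c vector out of step with the rows it describes.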