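I want to add another column to a data frame that assigns each member in the first column to a group, based on the order of the rows.
Here is a reproducible example: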
df1=c("Alex","23","ID #:123", "John","26","ID #:564")
df1=data.frame(df1)
library(dplyr)
library(data.table)
df1 %>% mutate(group= ifelse(df1 %like% "ID #:",1,NA ) )
This is the output of the example:
       df1 group
1     Alex    NA
2       23    NA
3 ID #:123     1
4     John    NA
5       26    NA
6 ID #:564     1
This is what I want:
       df1 group
1     Alex     1
2       23     1
3 ID #:123     1
4     John     2
5       26     2
6 ID #:564     2
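So I want the group column to number the members in order, with every row of a member carrying that member's number.
Thanks in advance for any replies or ideas!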
Answer 0 (score: 1)
Shift the condition down one row with lag, then take the cumsum:
df1 %>%
  mutate(group = cumsum(lag(df1 %like% "ID #:", default = TRUE)))
#       df1 group
#1     Alex     1
#2       23     1
#3 ID #:123     1
#4     John     2
#5       26     2
#6 ID #:564     2
Details:
df1 %>%
  mutate(
    # calculate the condition: TRUE on each "ID #:" row
    cond = df1 %like% "ID #:",
    # shift the condition down one row and fill the first value with TRUE
    lag_cond = lag(cond, default = TRUE),
    # increase the group on the row after an ID row (and on the first row)
    group = cumsum(lag_cond))
#       df1  cond lag_cond group
#1     Alex FALSE     TRUE     1
#2       23 FALSE    FALSE     1
#3 ID #:123  TRUE    FALSE     1
#4     John FALSE     TRUE     2
#5       26 FALSE    FALSE     2
#6 ID #:564  TRUE    FALSE     2
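Because the group only advances on the row after an "ID #:" row, the same idea should also work when members have different numbers of rows. Below is a minimal base R sketch of that, without dplyr or data.table. The input vector `x` and the helper name `starts_group` are made up for illustration, and it assumes every member's last row is its "ID #:" row.

```r
# Hypothetical input: members with 3, 2 and 4 rows, each ending in an "ID #:" row
x <- c("Alex", "23", "ID #:123",
       "John", "ID #:564",
       "Mary", "30", "NYC", "ID #:9")

# TRUE on the closing row of each member
cond <- grepl("ID #:", x, fixed = TRUE)

# Shift the condition down by one row; the first row always starts a group
starts_group <- c(TRUE, head(cond, -1))

# The running count of group starts is the group number
group <- cumsum(starts_group)

data.frame(x, group)
```

Here `c(TRUE, head(cond, -1))` plays the role of `lag(cond, default = TRUE)` in the dplyr version.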
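Answer 1 (score: 1)
You did not mention whether you always want 3 rows per member. This code lets you change the number of rows per member (in case it is not necessarily 3), as long as every member has the same number of rows: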
# Your code:
df1=c("Alex","23","ID #:123", "John","26","ID #:564")
df1=data.frame(df1)
library(dplyr)
library(data.table)
df1 %>% mutate(group= ifelse(df1 %like% "ID #:",1,NA ) )
number_of_rows_per_member <- 3 # Change if necessary
# One position per member; assumes nrow(df1) is a multiple of number_of_rows_per_member
positions <- seq_len(nrow(df1) / number_of_rows_per_member)
group <- integer(nrow(df1))
for (i in positions) {
  # First and last row of the i-th member
  last_row <- i * number_of_rows_per_member
  first_row <- last_row - (number_of_rows_per_member - 1)
  group[first_row:last_row] <- i
}
group # This is the group column
df1$group <- group # Now just move the group column into your original data frame
df1 # Done!
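If every member really does have the same number of rows, the loop can probably be replaced by a single call to `rep()`. This is only a sketch under that assumption; `rows_per_member` and `n_members` are names chosen here for illustration.

```r
df1 <- data.frame(df1 = c("Alex", "23", "ID #:123", "John", "26", "ID #:564"))

rows_per_member <- 3  # assumed constant for every member

# Stop early if the rows cannot be split evenly between members
stopifnot(nrow(df1) %% rows_per_member == 0)
n_members <- nrow(df1) %/% rows_per_member

# Repeat each member number once per row of that member
group <- rep(seq_len(n_members), each = rows_per_member)
df1$group <- group
df1
```

The `stopifnot()` check makes the fixed-size assumption explicit instead of silently producing a group vector of the wrong length.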