I have a data frame like this:
df = data.frame(main_name = c("google","yahoo","google","amazon","yahoo","google"),
volume = c(32,43,412,45,12,54))
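I want to sort it by main_name. The goal is to know at which row each name starts and at which row it ends, so that I can use those row ranges in a for loop. The sorted data frame looks like this: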
main_name volume
amazon 45
google 32
google 412
google 54
yahoo 43
yahoo 12
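Nothing "automatic" needs to know the specific names in advance. I only want to detect where the name changes and get the start and end row numbers, for example: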
amazon [1]
google [2:4]
yahoo [5:6]
Answer 0: (score: 1)
Using tidyverse:
library(dplyr)

df %>%
  arrange(main_name) %>%
  mutate(row = row_number()) %>%
  group_by(main_name) %>%
  summarise(start = first(row),
            end = last(row)) %>%
  mutate(res = glue::glue("[{start}:{end}]"))
# A tibble: 3 x 4
main_name start end res
<fct> <int> <int> <chr>
1 amazon 1 1 [1:1]
2 google 2 4 [2:4]
3 yahoo 5 6 [5:6]
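Since the question mentions a for loop, here is a minimal sketch of how the start and end rows could drive one. It uses base R only so that it runs without extra packages; the names `sorted`, `bounds`, and `totals` are hypothetical, and summing `volume` is just a placeholder for whatever the loop body should do.

```r
# Same data as in the question
df <- data.frame(
  main_name = c("google", "yahoo", "google", "amazon", "yahoo", "google"),
  volume = c(32, 43, 412, 45, 12, 54),
  stringsAsFactors = FALSE
)

# Sort by name and reset the row names so they match the row positions
sorted <- df[order(df$main_name), ]
rownames(sorted) <- NULL

# First and last row index of each name in the sorted data
idx <- seq_len(nrow(sorted))
start <- tapply(idx, sorted$main_name, min)
end <- tapply(idx, sorted$main_name, max)
bounds <- data.frame(
  main_name = names(start),
  start = as.vector(start),
  end = as.vector(end)
)

# Loop over the groups, using start:end to pick the rows of each name
totals <- numeric(nrow(bounds))
for (i in seq_len(nrow(bounds))) {
  rows <- bounds$start[i]:bounds$end[i]
  totals[i] <- sum(sorted$volume[rows])  # placeholder work on the block of rows
}
names(totals) <- bounds$main_name
totals
```

For the sample data this gives 45 for amazon, 498 for google, and 55 for yahoo.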
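Answer 1: (score: 1)
Here is an option using rle. It assumes df is already sorted by main_name, as in the data at the end of this answer.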
with(rle(as.character(df$main_name)), setNames(mapply(
function(x, y) sprintf("[%s:%s]", x, y),
cumsum(lengths) - lengths + 1, cumsum(lengths)), values))
# amazon google yahoo
#"[1:1]" "[2:4]" "[5:6]"
Data (already sorted by main_name):
df <- read.table(text =
"main_name volume
amazon 45
google 32
google 412
google 54
yahoo 43
yahoo 12", header = TRUE)
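To make the arithmetic in the rle one-liner easier to follow, the sketch below spells out the intermediate values. The variable names (`x`, `runs`, `start`, `end`, `res`) are only illustrative. It also relies on `sprintf` being vectorized, so `mapply` is not strictly needed.

```r
# Names in sorted order, as in the data above
x <- c("amazon", "google", "google", "google", "yahoo", "yahoo")

# rle() collapses consecutive repeats into (value, run length) pairs
runs <- rle(x)
runs$values   # "amazon" "google" "yahoo"
runs$lengths  # 1 3 2

# The cumulative sum of the run lengths is the last row of each run;
# subtracting the run length and adding 1 gives the first row
end <- cumsum(runs$lengths)       # 1 4 6
start <- end - runs$lengths + 1L  # 1 2 5

res <- setNames(sprintf("[%d:%d]", start, end), runs$values)
res

# On unsorted input every change of name starts a new run,
# so the original (unsorted) column yields 6 runs instead of 3
n_runs_unsorted <- length(
  rle(c("google", "yahoo", "google", "amazon", "yahoo", "google"))$values
)
```

This is why the data has to be sorted first: rle only sees consecutive repeats, not groups.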
Answer 2: (score: 1)
Here is another base R option:
with(df, tapply(seq_along(main_name), main_name, FUN =
function(x) do.call(sprintf, c(fmt = "[%d:%d]", as.list(range(x))))))
# amazon google yahoo
# "[1:1]" "[2:4]" "[5:6]"
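Note that the tapply approach reports the smallest and largest row index per name, which is only a contiguous block if the data is sorted. The following sketch, with a hypothetical helper `fmt_range`, compares the result on the unsorted data from the question with the result after sorting via `order()`.

```r
# Unsorted data from the question
df <- data.frame(
  main_name = c("google", "yahoo", "google", "amazon", "yahoo", "google"),
  volume = c(32, 43, 412, 45, 12, 54),
  stringsAsFactors = FALSE
)

# Hypothetical helper: format the smallest and largest index as "[a:b]"
fmt_range <- function(x) sprintf("[%d:%d]", min(x), max(x))

# Unsorted: the ranges overlap, because rows of one name are scattered
unsorted <- with(df, tapply(seq_along(main_name), main_name, fmt_range))

# Sorted first: each name occupies one contiguous block of rows
sorted_df <- df[order(df$main_name), ]
sorted <- with(sorted_df, tapply(seq_along(main_name), main_name, fmt_range))

unsorted
sorted
```

On the unsorted data google is reported as [1:6], which also covers yahoo and amazon rows; after sorting the result matches the [1:1], [2:4], [5:6] output shown above.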