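Hi, I have a df with "var" and "value" columns. How can I find/compute an "output" column, grouped by var, that flags a value as "rep" once it has appeared more than 2 times in the value column?
var = c("A","A","A","A","A","B","B","B","B","B")
value = c(22,1,1,1,1,31,21,1,1,1)
df = data.frame(var, value)
output = c("non_rep","non_rep","non_rep","rep","rep","non_rep","non_rep","non_rep","non_rep","rep")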
Expected output:
var value output
A 22 non_rep
A 1 non_rep
A 1 non_rep
A 1 rep
A 1 rep
B 31 non_rep
B 21 non_rep
B 1 non_rep
B 1 non_rep
B 1 rep
Thanks in advance
Answer 0 (score: 5)
Group by both columns, then flag every value after the first two in each group as "rep":
df$output <- ifelse(ave(df$value, df[c("var","value")], FUN=seq_along) > 2, "rep", "non_rep")
# var value output
#1 A 22 non_rep
#2 A 1 non_rep
#3 A 1 non_rep
#4 A 1 rep
#5 A 1 rep
#6 B 31 non_rep
#7 B 21 non_rep
#8 B 1 non_rep
#9 B 1 non_rep
#10 B 1 rep
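To see why this works, it can help to look at the intermediate counter that ave() produces before the comparison with 2. The snippet below is only an illustrative sketch; the name counter is introduced here and is not part of the original answer.

```r
# Rebuild the example data from the question
var   <- c("A","A","A","A","A","B","B","B","B","B")
value <- c(22, 1, 1, 1, 1, 31, 21, 1, 1, 1)
df    <- data.frame(var, value)

# Within each (var, value) group, seq_along numbers the rows 1, 2, 3, ...
# ave() writes those numbers back in the original row order.
counter <- ave(df$value, df[c("var", "value")], FUN = seq_along)
print(counter)

# Any row whose within-group position exceeds 2 is a repeat
df$output <- ifelse(counter > 2, "rep", "non_rep")
print(df)
```

Rows 4, 5 and 10 are the third or later occurrence of their (var, value) pair, so they are the only ones flagged "rep".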
The same logic can also be translated to dplyr.
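A minimal sketch of what such a dplyr translation could look like, assuming the same grouping by both columns and the same "more than two occurrences" rule as the ave() version. This is a reconstruction for illustration, not necessarily the code the answer originally contained.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  var   = c("A","A","A","A","A","B","B","B","B","B"),
  value = c(22, 1, 1, 1, 1, 31, 21, 1, 1, 1)
)

df <- df %>%
  group_by(var, value) %>%
  # row_number() is the position of each row inside its (var, value) group
  mutate(output = ifelse(row_number() > 2, "rep", "non_rep")) %>%
  ungroup()

print(df)
```

Here row_number() plays the role that seq_along plays inside ave().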
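Answer 1 (score: 3)
If a (var, value) pair can occur in more than one run and each run needs to be treated as a separate group, you can use data.table's rleid function to build the grouping: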
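library(dplyr)
# The last three rows repeat the (A, 22) pair as a new, separate run
var = c("A","A","A","A","A","B","B","B","B","B", "A", "A", "A")
value = c(22,1,1,1,1,31,21,1,1,1, 22, 22, 22)
df = data.frame(var, value)
# rleid() assigns a new id each time the (var, value) combination changes
df$group = data.table::rleid(df$var, df$value)
df %>%
  group_by(group) %>%
  mutate(output = ifelse(row_number() > 2, "rep", "non_rep"))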
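To make the difference between run-based grouping and plain (var, value) grouping concrete, here is a base R sketch that imitates the run ids with rle() and compares the two results. The helper run_id and the column names by_run and by_pair are hypothetical names introduced for this illustration; the sketch is not data.table's implementation of rleid.

```r
# Extended example data: (A, 22) appears in row 1 and again in rows 11-13
var   <- c("A","A","A","A","A","B","B","B","B","B","A","A","A")
value <- c(22, 1, 1, 1, 1, 31, 21, 1, 1, 1, 22, 22, 22)
df    <- data.frame(var, value)

# Hypothetical helper: give each consecutive run of identical keys its own id
run_id <- function(...) {
  key  <- paste(..., sep = "|")
  runs <- rle(key)
  rep(seq_along(runs$lengths), runs$lengths)
}

flag <- function(counter) ifelse(counter > 2, "rep", "non_rep")

# Grouping by run: rows 11-13 start a fresh count
grp <- run_id(df$var, df$value)
df$by_run <- flag(ave(seq_along(grp), grp, FUN = seq_along))

# Grouping by pair: rows 11-13 continue the count started at row 1
df$by_pair <- flag(ave(df$value, df[c("var", "value")], FUN = seq_along))

print(df)
```

The two columns differ only in row 12: it is the second row of its run, but the third occurrence of (A, 22) overall.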
Answer 2 (score: 2)
A dplyr solution that seems to work, at least for your example data:
library(dplyr)
df %>%
group_by(var, value) %>%
mutate(output = ifelse(lag(value, n = 2) != value | is.na(lag(value, n = 2)),
"non_rep", "rep")) %>%
ungroup()
# A tibble: 10 x 3
var value output
<chr> <dbl> <chr>
1 A 22 non_rep
2 A 1 non_rep
3 A 1 non_rep
4 A 1 rep
5 A 1 rep
6 B 31 non_rep
7 B 21 non_rep
8 B 1 non_rep
9 B 1 non_rep
10 B 1 rep
Answer 3 (score: 1)
We can use data.table:
library(data.table)
setDT(df)[, output := if(.N > 2) rep(c("non_rep", "rep"),
c(2, .N-2)) else "non_rep" , .(var, value)]
df
# var value output
# 1: A 22 non_rep
# 2: A 1 non_rep
# 3: A 1 non_rep
# 4: A 1 rep
# 5: A 1 rep
# 6: B 31 non_rep
# 7: B 21 non_rep
# 8: B 1 non_rep
# 9: B 1 non_rep
#10: B 1 rep