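Efficient way to reset a sequence when a condition is met (R)

Time: 2015-06-26 21:01:34

Tags: r data-manipulation

Question:

I would like to reset a (1, 2) sequence whenever a condition is met (the subject changes). I have a for/if loop that does this but, unsurprisingly, it is very slow. Is there a more efficient way, for example one involving the apply family?

Current: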

  subj odd_even
    a         
    a         
    a         
    b         
    b         
    b         
    b         
    c         
    c         
    c         

Goal:

  subj odd_even
    a      1   
    a      2   
    a      1   
    b      1   
    b      2   
    b      1   
    b      2   
    c      1   
    c      2   
    c      1   
df = data.frame( subj = c("a","a","a","b","b","b","b", "c","c","c"), odd_even = "" )
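
The loop itself is not shown in the question. As a baseline for the answers, here is a minimal sketch of what such a for/if loop might look like. The variable names are illustrative assumptions, not the asker's code.

```r
# Hypothetical reconstruction of a slow for/if approach (not from the question).
df <- data.frame(subj = c("a","a","a","b","b","b","b","c","c","c"),
                 stringsAsFactors = FALSE)

odd_even <- integer(nrow(df))
for (i in seq_len(nrow(df))) {
  if (i == 1L || df$subj[i] != df$subj[i - 1L]) {
    odd_even[i] <- 1L                      # subject changed: restart at 1
  } else {
    odd_even[i] <- 3L - odd_even[i - 1L]   # alternate 1 -> 2 -> 1 within a subject
  }
}
df$odd_even <- odd_even
print(df)
```

Each iteration indexes into the data frame one element at a time, which is typically what makes this kind of loop slow on large inputs.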

3 Answers:

Answer 0 (score: 6)

I like the sequence function for this:

df$odd_even <- 2L - sequence(table(df$subj)) %% 2L
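
To see why this works, it may help to look at the intermediate values. The sketch below rebuilds the question's data (the names subj, counts and positions are only for illustration) and assumes the rows are already sorted by subject, since table() reports its counts in sorted order.

```r
# Illustrative breakdown of 2L - sequence(table(x)) %% 2L; assumes sorted subjects.
subj <- c("a","a","a","b","b","b","b","c","c","c")

counts <- table(subj)              # group sizes: a = 3, b = 4, c = 3
positions <- sequence(counts)      # 1 2 3 1 2 3 4 1 2 3 (restarts for each group)
odd_even <- 2L - positions %% 2L   # odd position -> 1, even position -> 2

print(odd_even)
```

If the rows were grouped but not in sorted order, the group sizes from table() would no longer line up with the rows, so this form is probably best kept for sorted data.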

data.table is another option:

library(data.table)
setDT(df)
df[, odd_evenDT := 2L - seq_along(.I) %% 2L, by = subj]

Benchmarks:

set.seed(42)
df <- data.frame(subj = sort(sample(as.character(1:1e4), 1e5, TRUE)))
DT <- data.table(df)

library(microbenchmark)
library(dplyr)  # provides %>%, group_by(), mutate() and row_number() for the jeremy entry
microbenchmark(roland1 = 2L - sequence(table(df$subj)) %% 2L,
           roland2 = DT[,2L - seq_along(.I) %% 2L, by = subj],
           roland3 = 2L - sequence(rle(as.integer(df$subj))$lengths) %% 2L,
           jeremy = df %>% group_by(subj) %>%
             mutate(odd_even = 2 - (row_number() %% 2)),
           frank = 2L - ave(as.integer(df$s),df$s,FUN=seq_along) %% 2L, 
           flick = ave(seq_along(df$subj), df$subj, FUN=function(x) rep(c(1,2), length.out=length(x))),
           times = 10, unit = "relative")

# Unit: relative
#     expr      min       lq      mean   median        uq      max neval
#  roland1 5.820459 5.754497 5.0368686 5.404110 4.0853039 4.847161    10
#  roland2 1.110919 1.057952 0.9840653 1.037428 0.7939004 1.176258    10
#  roland3 1.000000 1.000000 1.0000000 1.000000 1.0000000 1.000000    10
#   jeremy 5.024087 4.941366 4.3491117 4.635534 3.5144515 4.277011    10
#    frank 2.036816 1.944603 1.7809168 1.831937 1.6459597 1.607283    10
#    flick 3.655127 3.621457 3.2453089 3.473188 2.7717947 3.198285    10

Answer 1 (score: 5)

Here is another, somewhat clunky, way to do it:

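# seq_along(df$subj) is only a placeholder integer vector; FUN supplies the per-group counter
df$odd_even <- 2L - ave(seq_along(df$subj), df$subj, FUN = seq_along) %% 2L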

ave builds a counter within each group. That counter is what we test for odd and even.
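
A small sketch of the counter and the odd/even test, using the question's subjects. The names counter and is_odd are illustrative, and the ifelse() form is just a more explicit spelling of the same arithmetic.

```r
# Illustrative only: show the per-group counter that ave() produces.
subj <- c("a","a","a","b","b","b","b","c","c","c")

counter <- ave(seq_along(subj), subj, FUN = seq_along)  # 1 2 3 1 2 3 4 1 2 3
is_odd <- counter %% 2L == 1L                           # TRUE for the 1st, 3rd, ... row of a group
odd_even <- ifelse(is_odd, 1L, 2L)                      # same values as 2L - counter %% 2L

print(data.frame(subj, counter, odd_even))
```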

Answer 2 (score: 2)

If a subject reappears later in the data frame, what is the desired behavior?
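
The distinction matters because run-based and group-based counters disagree in that case. The sketch below uses a made-up subject vector (not from the question) to compare an rle-based counter, which restarts at every change of subject, with an ave-based counter, which keeps counting across all rows of the same subject.

```r
# Hypothetical data in which subject "a" reappears after "b".
subj <- c("a", "b", "b", "a", "a")

# Run-based: restart whenever the subject changes from the previous row.
by_run <- 2L - sequence(rle(subj)$lengths) %% 2L

# Group-based: count over all rows sharing a subject, wherever they occur.
by_group <- 2L - ave(seq_along(subj), subj, FUN = seq_along) %% 2L

print(data.frame(subj, by_run, by_group))
```

Here the last two rows differ: the run-based version restarts at 1 when "a" comes back, while the group-based version treats them as the second and third "a" rows.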

If subjects never reappear, use the following approach:

subj