Replace values if a column contains a certain number of consecutive NAs

Time: 2018-02-09 00:58:57

Tags: r na

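I have one column called meanSR_strong and another called meanSR_weak. If the meanSR_strong column has 10 or more consecutive NAs, I want to replace those values with the values from the meanSR_weak column, even if the replacement values are also NA. If there are fewer than 10 consecutive NAs in the meanSR_strong column, I don't need to do any replacement.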

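For example, rows 3-6 are all NA, but that is only four in a row, so they can be left alone. Rows 15-28, however, are all NA (14 in a row, which is more than 10), so there I want to take the values from the meanSR_weak column.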

I know how to replace all of the NAs, but I haven't figured out a good way to code this!

Here is my data:

x=structure(list(meanSR_strong = c(NA, 0.376009009009009, NA, NA, 
NA, NA, 0.615585585585586, NA, 0.607354054054054, 0.590210810810811, 
0.57005045045045, 0.596616216216216, 0.584066666666667, 0.538597297297297, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.639010810810811, 
0.634272972972973), meanSR_weak = c(0.574724324324324, 0.562030630630631, 
0.586247747747748, NA, NA, NA, 0.615585585585586, NA, 0.607354054054054, 
0.590210810810811, 0.57005045045045, 0.596616216216216, 0.608510810810811, 
0.538597297297297, NA, NA, NA, 0.555463063063063, 0.376715315315315, 
NA, NA, NA, NA, NA, NA, 0.60972972972973, NA, NA, 0.639010810810811, 
0.634272972972973), cloud.pct_strong = c(100, 36.036036036036, 
98.1981981981982, 100, 100, 100, 0, 100, 0, 0, 0, 0, 3.6036036036036, 
0, NA, NA, 100, 67.5675675675676, 100, 100, NA, 100, 100, 100, 
100, 74.7747747747748, 100, 100, 0, 0), cloud.pct_weak = c(0, 
0, 0, 100, 100, 100, 0, 100, 0, 0, 0, 0, 0, 0, NA, NA, 100, 0, 
36.036036036036, 67.5675675675676, NA, 100, 100, 100, 100, 0.900900900900901, 
100, 60.3603603603604, 0, 0), date = structure(c(951868800, 951955200, 
952041600, 952128000, 952214400, 952300800, 952387200, 952473600, 
952560000, 952646400, 952732800, 952819200, 952905600, 952992000, 
953078400, 953164800, 953251200, 953337600, 953424000, 953510400, 
953596800, 953683200, 953769600, 953856000, 953942400, 954028800, 
954115200, 954201600, 954288000, 954374400), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")), .Names = c("meanSR_strong", "meanSR_weak", 
"cloud.pct_strong", "cloud.pct_weak", "date"), row.names = c(NA, 
-30L), class = c("tbl_df", "tbl", "data.frame"))

2 answers:

Answer 0 (score: 4)

Answer 1 (score: 4)

The R rle function can be used for this purpose. First build an rle list ("values" and "lengths", see ?rle) from the is.na values:

z <- rle(is.na(x$meanSR_strong))
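
To make the structure of that rle list concrete, here is a small sketch on a made-up vector (v and r are illustrative names, not part of the question's data):

```r
# Toy vector (hypothetical, not the question's data)
v <- c(NA, 1, NA, NA, NA, 2, 3)

# Run-length encode the NA mask: each run of identical TRUE/FALSE
# values becomes one entry in $values with its length in $lengths
r <- rle(is.na(v))

r$values   # TRUE FALSE TRUE FALSE
r$lengths  # 1 1 3 2
```

Each TRUE entry in r$values marks a run of NAs, and the matching entry in r$lengths says how long that run is.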

Wherever a run of NAs is shorter than a length of your choosing, change the z$values entry from TRUE to FALSE. I chose 10 here:

z$values[z$lengths < 10 & z$values] <- FALSE

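Then use the rep function, which is essentially the inverse of rle, to rebuild a logical vector with one entry per row, and use it as the index in a [<- assignment: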

idx <- rep(z$values, z$lengths)   # TRUE only for rows inside long NA runs
x[idx, "meanSR_strong"] <- x[idx, "meanSR_weak"]
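
The same three steps can be wrapped in a small reusable function. This is only a sketch: replace_long_na_runs and its min_run argument are hypothetical names introduced here, and the toy data uses a threshold of 3 instead of 10 to keep it short. It uses base R's inverse.rle, which gives the same result as the rep call.

```r
# Hypothetical helper: wherever `strong` has a run of at least `min_run`
# consecutive NAs, take the values from `weak` instead (even if NA).
replace_long_na_runs <- function(strong, weak, min_run = 10) {
  stopifnot(length(strong) == length(weak))
  runs <- rle(is.na(strong))
  # Keep only the NA runs that are long enough
  runs$values <- runs$values & runs$lengths >= min_run
  # Expand back to one logical per element; same as rep(values, lengths)
  idx <- inverse.rle(runs)
  strong[idx] <- weak[idx]
  strong
}

# Toy data (not the question's data), threshold of 3
strong <- c(1, NA, NA, 4, NA, NA, NA, 8)
weak   <- c(10, 20, 30, 40, 50, NA, 70, 80)
result <- replace_long_na_runs(strong, weak, min_run = 3)
result  # 1 NA NA 4 50 NA 70 8
```

The run of two NAs is left alone, while the run of three is overwritten with the weak values, including the NA. On the question's data the call would presumably be x$meanSR_strong <- replace_long_na_runs(x$meanSR_strong, x$meanSR_weak).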

print(x, n=30)
# A tibble: 30 x 5
   meanSR_strong meanSR_weak cloud.pct_strong cloud.pct_weak       date
           <dbl>       <dbl>            <dbl>          <dbl>     <dttm>
 1            NA   0.5747243       100.000000      0.0000000 2000-03-01
 2     0.3760090   0.5620306        36.036036      0.0000000 2000-03-02
 3            NA   0.5862477        98.198198      0.0000000 2000-03-03
 4            NA          NA       100.000000    100.0000000 2000-03-04
 5            NA          NA       100.000000    100.0000000 2000-03-05
 6            NA          NA       100.000000    100.0000000 2000-03-06
 7     0.6155856   0.6155856         0.000000      0.0000000 2000-03-07
 8            NA          NA       100.000000    100.0000000 2000-03-08
 9     0.6073541   0.6073541         0.000000      0.0000000 2000-03-09
10     0.5902108   0.5902108         0.000000      0.0000000 2000-03-10
11     0.5700505   0.5700505         0.000000      0.0000000 2000-03-11
12     0.5966162   0.5966162         0.000000      0.0000000 2000-03-12
13     0.5840667   0.6085108         3.603604      0.0000000 2000-03-13
14     0.5385973   0.5385973         0.000000      0.0000000 2000-03-14
15            NA          NA               NA             NA 2000-03-15
16            NA          NA               NA             NA 2000-03-16
17            NA          NA       100.000000    100.0000000 2000-03-17
18     0.5554631   0.5554631        67.567568      0.0000000 2000-03-18
19     0.3767153   0.3767153       100.000000     36.0360360 2000-03-19
20            NA          NA       100.000000     67.5675676 2000-03-20
21            NA          NA               NA             NA 2000-03-21
22            NA          NA       100.000000    100.0000000 2000-03-22
23            NA          NA       100.000000    100.0000000 2000-03-23
24            NA          NA       100.000000    100.0000000 2000-03-24
25            NA          NA       100.000000    100.0000000 2000-03-25
26     0.6097297   0.6097297        74.774775      0.9009009 2000-03-26
27            NA          NA       100.000000    100.0000000 2000-03-27
28            NA          NA       100.000000     60.3603604 2000-03-28
29     0.6390108   0.6390108         0.000000      0.0000000 2000-03-29
30     0.6342730   0.6342730         0.000000      0.0000000 2000-03-30