dplyr: group variables, then assign unique names based on unique groupings

Time: 2017-12-03 23:10:25

Tags: r dplyr

I have a data frame, df, with the columns date, loc, and obs_network.

I want to name each unique date and loc combination where obs_network is NA. The naming scheme should assign each unique group a unique number with the prefix "pseudoplot". So the output would look like this:

output <- data.frame(
  date = c(rep("10-29-16", 3), rep("11-14-16", 2), "12-29-16", "10-2-17", "9-2-17"),
  loc = c(rep("A", 3), rep("B", 2), "A", "PlotA", "PlotB"),
  obs_network = c(rep("pseudoplot_1", 3), rep("pseudoplot_2", 2), "pseudoplot_3", "PlotA", "PlotB")
)

This is my attempt:

output <- df %>%
  group_by(date, loc) %>%
  mutate(obs_network = ifelse(is.na(obs_network),
                              paste0("pseudoplot", "_", match(loc, unique(loc))),
                              obs_network))

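I tried the group_by and mutate code above without success, and I cannot identify my mistake. With that code, every level reads "pseudoplot_1". I would appreciate it if someone could explain why my code does not work, in addition to providing a solution.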

2 Answers:

Answer 0 (score: 1)

This is what I could come up with. It assumes that 1) date is a Date object and 2) loc and obs_network are character vectors. I created an example below.

         date   loc obs_network
1  2016-10-29     A        <NA>
2  2016-10-29     A        <NA>
3  2016-10-29     A        <NA>
4  2016-11-14     B        <NA>
5  2016-11-14     B        <NA>
6  2016-12-29     A        <NA>
7  2017-10-02 PlotA       PlotA
8  2017-09-02 PlotB       PlotB
9  2017-10-10     A        <NA>
10 2017-10-10     B        <NA>

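I relied on two things. First, I took the difference between consecutive dates. Second, I passed that difference to cumsum() to create a unique group number for each unique date. By pasting the group number and loc together, I created unique group names.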

library(dplyr)

mydf %>%
  mutate(obs_network = if_else(is.na(obs_network),
                               # number each run of identical dates, then append loc
                               paste0("pseudoplot_", cumsum(c(TRUE, abs(diff(date)) > 0)), loc),
                               obs_network))


#         date   loc   obs_network
#1  2016-10-29     A pseudoplot_1A
#2  2016-10-29     A pseudoplot_1A
#3  2016-10-29     A pseudoplot_1A
#4  2016-11-14     B pseudoplot_2B
#5  2016-11-14     B pseudoplot_2B
#6  2016-12-29     A pseudoplot_3A
#7  2017-10-02 PlotA         PlotA
#8  2017-09-02 PlotB         PlotB
#9  2017-10-10     A pseudoplot_6A
#10 2017-10-10     B pseudoplot_6B

DATA

mydf <- structure(list(date = structure(c(17103, 17103, 17103, 17119, 
17119, 17164, 17441, 17411, 17449, 17449), class = "Date"), loc = c("A", 
"A", "A", "B", "B", "A", "PlotA", "PlotB", "A", "B"), obs_network = c(NA, 
NA, NA, NA, NA, NA, "PlotA", "PlotB", NA, NA)), .Names = c("date", 
"loc", "obs_network"), row.names = c(NA, -10L), class = "data.frame")
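
To see where the numbers in the output above come from, the intermediate group numbers can be computed on their own. This is a small base R sketch using the same dates as mydf; the names dates, changed, and group_id are introduced here only for illustration.

```r
# Same dates as in mydf, written out as Date objects
dates <- as.Date(c("2016-10-29", "2016-10-29", "2016-10-29",
                   "2016-11-14", "2016-11-14", "2016-12-29",
                   "2017-10-02", "2017-09-02", "2017-10-10", "2017-10-10"))

# TRUE wherever the date differs from the previous row (the first row always starts a group)
changed <- c(TRUE, abs(diff(dates)) > 0)

# The running count of changes is the group number
group_id <- cumsum(changed)
group_id
# 1 1 1 2 2 3 4 5 6 6
```

Rows 7 and 8 keep their original names but still advance the counter, which is why the last group is numbered 6 rather than 4. The approach also compares neighbouring rows only, so it relies on rows with the same date being adjacent.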

Answer 1 (score: 0)

Some notes:

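  1. You have added "NA" to the data frame, so these strings (actually factors) are not real NA values. I suggest you change your original data frame.

  2. Mixing factors (which is what you created in your data frame) with character vectors or integers in ifelse causes problems. I have changed the dataset so that everything is character, using tibble:

    df <- tibble(date = c(rep("10-29-16", 3), rep("11-14-16", 2), "12-29-16", "10-2-17", "9-2-17"),
                 loc = c(rep("A", 3), rep("B", 2), "A", "PlotA", "PlotB"),
                 obs_network = c(rep(NA, 6), "PlotA", "PlotB"))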

  3. Finally, do not use group_by, and use if_else to keep everything smooth.
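
As a minimal sketch of the notes above (not necessarily this answer's original code), the numbering can be built without group_by. In the question's attempt, grouping by date and loc makes loc constant within every group, so match(loc, unique(loc)) is always 1, which is why every row became "pseudoplot_1". The helper column key is a hypothetical name introduced here for illustration.

```r
library(dplyr)

# Character-only data with real NA values, as suggested in note 2
df <- tibble(
  date = c(rep("10-29-16", 3), rep("11-14-16", 2), "12-29-16", "10-2-17", "9-2-17"),
  loc = c(rep("A", 3), rep("B", 2), "A", "PlotA", "PlotB"),
  obs_network = c(rep(NA_character_, 6), "PlotA", "PlotB")
)

output <- df %>%
  mutate(
    # key identifies a date/loc combination, but only for rows that still need a name
    key = if_else(is.na(obs_network), paste(date, loc), NA_character_),
    # position of each key among the distinct keys, in order of first appearance
    obs_network = if_else(
      is.na(obs_network),
      paste0("pseudoplot_", match(key, unique(key[!is.na(key)]))),
      obs_network
    )
  ) %>%
  select(-key)

output$obs_network
```

Because the index is computed over the whole column rather than inside each group, the three unnamed date/loc combinations receive the numbers 1, 2, and 3, while PlotA and PlotB keep their names.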