R: loop through the unique values of a data frame column to create another column based on conditions

Time: 2019-01-20 21:22:37

Tags: r loops for-loop iteration

My dataset contains the scores and total respondents for questions asked in a survey across several fiscal years (FY13, FY14 and FY15) and across different regions.

My goal is to loop through the FY column and determine when each question was asked for each region, and to store this information in a new column.

Here is what a reproducible sample looks like:

testdf=data.frame(FY=c("FY13","FY14","FY15","FY14","FY15","FY13","FY14","FY15","FY13","FY15","FY13","FY14","FY15","FY13","FY14","FY15"),
                  Region=c(rep("AFRICA",5),rep("ASIA",5),rep("AMERICA",6)),
                  QST=c(rep("Q2",3),rep("Q5",2),rep("Q2",3),rep("Q5",2),rep("Q2",3),rep("Q5",3)),
                  Very.Satisfied=runif(16,min = 0, max=1),
                  Total.Very.Satisfied=floor(runif(16,min=10,max=120)),
                  Satisfied=runif(16,min = 0, max=1),
                  Total.Satisfied=floor(runif(16,min=10,max=120)),
                  Dissatisfied=runif(16,min = 0, max=1),
                  Total.Dissatisfied=floor(runif(16,min=10,max=120)),
                  Very.Dissatisfied=runif(16,min = 0, max=1),
                  Total.Very.Dissatisfied=floor(runif(16,min=10,max=120)))

I first create the ID column by concatenating the Region and QST columns:

library(tidyr)
testdf = testdf %>% unite(ID,c('Region','QST'),sep = "",remove = F)

My objective

1) For each unique ID, determine whether a given question was asked:

a) in one year only (FY13, FY14 or FY15)

b) over the past two years (FY15 and FY14 only)

c) over the past three years (FY15, FY14 and FY13)

d) in FY13 and FY15 only

My attempt

For this problem, I tried creating a for loop. For each unique ID, I first store the unique occurrences of each FY in which the question was asked in a vector v. Then, using if conditional statements, I assign a comment to a newly created Tally column based on these cases.

for (i in unique(testdf$ID)) {
  v=unique(testdf$FY)
  if(('FY15' %in% v) & ('FY14' %in% v)) {
    testdf$Tally=='Asked Over The Past Two Years'
  } else if(('FY15' %in% v) & ('FY14' %in% v) & ('FY13' %in% v)) {
    testdf$Tally=='Asked Over The Past Three Years'
  } else if(('FY13' %in% v) & ('FY15' %in% v)) {
    testdf$Tally=='Question Asked in FY13 & FY15 Only'
  } else {
    testdf$Tally=='Question Asked Once Only'
  }
}

The loop does not throw any error message, but it does not seem to create the new Tally column.

Any help would be greatly appreciated.

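4 answers:

Answer 0: (score: 2)

The main problem in your code is that, inside the if-else clauses, you are not assigning (with "<-") but comparing (with "=="). I find the following a more elegant solution because it does not use a loop: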

require(tidyverse)

testdf %>%
  select(ID, FY) %>%
  unique() %>%
  mutate(is_true = 1) %>%
  spread(key = FY, value = is_true, fill = 0) %>%
  mutate(tally = case_when(
    FY13 == 1 & FY14 == 1 & FY15 == 1 ~ 'Asked Over The Past Three Years',
                FY14 == 1 & FY15 == 1 ~ 'Asked Over the Past Two Years',
    FY13 == 1 &             FY15 == 1 ~ 'Asked in FY13 & FY15 Only',
    TRUE ~ 'Question Asked Once Only'
  ))

Output:

+------------------------------------------------------------+
|          ID FY13 FY14 FY15                           tally |
+------------------------------------------------------------+
| 1  AFRICAQ2    1    1    1 Asked Over The Past Three Years |
| 2  AFRICAQ5    0    1    1   Asked Over the Past Two Years |
| 3 AMERICAQ2    1    1    1 Asked Over The Past Three Years |
| 4 AMERICAQ5    1    1    1 Asked Over The Past Three Years |
| 5    ASIAQ2    1    1    1 Asked Over The Past Three Years |
| 6    ASIAQ5    1    0    1       Asked in FY13 & FY15 Only |
+------------------------------------------------------------+
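
For comparison, the loop from the question can be repaired along the lines this answer describes. The following is only a sketch under a few assumptions: the score columns are left out because the tally does not depend on them, `rows` and `label` are names introduced here for illustration, and the three-year test is moved ahead of the two-year test so that it can actually be reached.

```r
# Stand-in for the question's data: same FY/Region/QST combinations,
# score columns omitted.
testdf <- data.frame(
  FY     = c("FY13","FY14","FY15","FY14","FY15","FY13","FY14","FY15",
             "FY13","FY15","FY13","FY14","FY15","FY13","FY14","FY15"),
  Region = c(rep("AFRICA", 5), rep("ASIA", 5), rep("AMERICA", 6)),
  QST    = c(rep("Q2", 3), rep("Q5", 2), rep("Q2", 3), rep("Q5", 2),
             rep("Q2", 3), rep("Q5", 3)),
  stringsAsFactors = FALSE
)
testdf$ID    <- paste0(testdf$Region, testdf$QST)
testdf$Tally <- NA_character_          # create the column before filling it

for (i in unique(testdf$ID)) {
  rows <- testdf$ID == i               # rows belonging to this ID only
  v    <- unique(testdf$FY[rows])      # years in which this ID was asked

  # Most specific condition first, otherwise it is shadowed by the two-year test
  label <- if (all(c("FY13", "FY14", "FY15") %in% v)) {
    "Asked Over The Past Three Years"
  } else if (all(c("FY14", "FY15") %in% v)) {
    "Asked Over The Past Two Years"
  } else if (all(c("FY13", "FY15") %in% v)) {
    "Question Asked in FY13 & FY15 Only"
  } else {
    "Question Asked Once Only"
  }

  testdf$Tally[rows] <- label          # assignment with <-, not comparison with ==
}

unique(testdf[c("ID", "Tally")])
```

Like the original, the final else branch would also label a question asked in FY13 and FY14 only as "Question Asked Once Only"; that combination does not occur in the sample data.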

Answer 1: (score: 1)

No loop is needed:

library(tidyverse)

result <- testdf %>%
    select(QST, Region, FY) %>%   # by name, so it also works once the ID column has been added
    mutate(Asked = 1) %>%
    spread(FY, Asked)

> result
  QST  Region FY13 FY14 FY15
1  Q2  AFRICA    1    1    1
2  Q2 AMERICA    1    1    1
3  Q2    ASIA    1    1    1
4  Q5  AFRICA   NA    1    1
5  Q5 AMERICA    1    1    1
6  Q5    ASIA    1   NA    1

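This answers all four questions at once.

If you really want a Tally column, extend it as follows:

result %>%
    mutate(Tally = case_when(
        # NA marks a year in which the question was not asked, so count with na.rm
        rowSums(cbind(FY13, FY14, FY15), na.rm = TRUE) == 1 ~ "Only one year",
        FY13 + FY14 + FY15 == 3 ~ "Past three years",
        FY14 + FY15 == 2 ~ "Past two years",
        FY13 + FY15 == 2 ~ "FY13 and FY15 only",
        TRUE ~ NA_character_))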

  QST  Region FY13 FY14 FY15              Tally
1  Q2  AFRICA    1    1    1   Past three years
2  Q2 AMERICA    1    1    1   Past three years
3  Q2    ASIA    1    1    1   Past three years
4  Q5  AFRICA   NA    1    1     Past two years
5  Q5 AMERICA    1    1    1   Past three years
6  Q5    ASIA    1   NA    1 FY13 and FY15 only
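
The same wide "asked or not" table can also be built without any packages. The sketch below uses base R's table() as a swapped-in technique, not this answer's method; `id`, `asked`, `n_years` and `tally` are illustrative names, and the score columns are again omitted.

```r
# Stand-in for the question's data: same FY/Region/QST combinations.
testdf <- data.frame(
  FY     = c("FY13","FY14","FY15","FY14","FY15","FY13","FY14","FY15",
             "FY13","FY15","FY13","FY14","FY15","FY13","FY14","FY15"),
  Region = c(rep("AFRICA", 5), rep("ASIA", 5), rep("AMERICA", 6)),
  QST    = c(rep("Q2", 3), rep("Q5", 2), rep("Q2", 3), rep("Q5", 2),
             rep("Q2", 3), rep("Q5", 3)),
  stringsAsFactors = FALSE
)

# One row per Region/QST pair, one column per fiscal year;
# > 0 turns the counts into TRUE/FALSE ("asked in that year or not").
id    <- paste(testdf$Region, testdf$QST, sep = "_")
asked <- unclass(table(id, testdf$FY)) > 0

n_years <- rowSums(asked)              # number of years each question was asked
tally <- ifelse(n_years == 3, "Past three years",
         ifelse(asked[, "FY14"] & asked[, "FY15"], "Past two years",
         ifelse(asked[, "FY13"] & asked[, "FY15"], "FY13 and FY15 only",
         ifelse(n_years == 1, "Only one year", NA_character_))))

data.frame(asked, Tally = tally)
```

Because the matrix holds TRUE/FALSE instead of 1/NA, no special handling of missing values is needed when counting years.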

Answer 2: (score: 0)

Consider ave for the grouped calculation by Region and QST, inside nested ifelse conditions:

testdf <- within(testdf, {
                   FY13 <- ifelse(FY=='FY13', 1, 0)
                   FY14 <- ifelse(FY=='FY14', 1, 0)
                   FY15 <- ifelse(FY=='FY15', 1, 0)

                   Tally <- ifelse(ave(FY13, Region, QST, FUN=max) + ave(FY14, Region, QST, FUN=max) + ave(FY15, Region, QST, FUN=max) == 1,
                                   'Asked Only on One Year',
                                   ifelse(ave(FY13, Region, QST, FUN=max) + ave(FY14, Region, QST, FUN=max) + ave(FY15, Region, QST, FUN=max) == 3,
                                          'Asked Over the Past Three Years',
                                          ifelse(ave(FY14, Region, QST, FUN=max) + ave(FY15, Region, QST, FUN=max) == 2,
                                                 'Asked Over the Past Two Years',
                                                 ifelse(ave(FY13, Region, QST, FUN=max) + ave(FY15, Region, QST, FUN=max) == 2,
                                                        'Asked On FY13 & FY15 Only',
                                                        NA
                                                        )
                                                 )
                                          )
                                   )

                   FY13 <- NULL; FY14 <- NULL; FY15 <- NULL
             })

testdf[c("Region", "QST", "FY", "Tally")]

#     Region QST   FY                           Tally
# 1   AFRICA  Q2 FY13 Asked Over the Past Three Years
# 2   AFRICA  Q2 FY14 Asked Over the Past Three Years
# 3   AFRICA  Q2 FY15 Asked Over the Past Three Years
# 4   AFRICA  Q5 FY14   Asked Over the Past Two Years
# 5   AFRICA  Q5 FY15   Asked Over the Past Two Years
# 6     ASIA  Q2 FY13 Asked Over the Past Three Years
# 7     ASIA  Q2 FY14 Asked Over the Past Three Years
# 8     ASIA  Q2 FY15 Asked Over the Past Three Years
# 9     ASIA  Q5 FY13       Asked On FY13 & FY15 Only
# 10    ASIA  Q5 FY15       Asked On FY13 & FY15 Only
# 11 AMERICA  Q2 FY13 Asked Over the Past Three Years
# 12 AMERICA  Q2 FY14 Asked Over the Past Three Years
# 13 AMERICA  Q2 FY15 Asked Over the Past Three Years
# 14 AMERICA  Q5 FY13 Asked Over the Past Three Years
# 15 AMERICA  Q5 FY14 Asked Over the Past Three Years
# 16 AMERICA  Q5 FY15 Asked Over the Past Three Years
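
The repeated ave() calls above can be computed once and reused. This is a sketch of the same approach, assuming a small helper, `asked_in()`, which is a hypothetical name introduced here and not part of base R; the labels are the ones used in this answer.

```r
# Stand-in for the question's data: same FY/Region/QST combinations.
testdf <- data.frame(
  FY     = c("FY13","FY14","FY15","FY14","FY15","FY13","FY14","FY15",
             "FY13","FY15","FY13","FY14","FY15","FY13","FY14","FY15"),
  Region = c(rep("AFRICA", 5), rep("ASIA", 5), rep("AMERICA", 6)),
  QST    = c(rep("Q2", 3), rep("Q5", 2), rep("Q2", 3), rep("Q5", 2),
             rep("Q2", 3), rep("Q5", 3)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 1 on every row whose Region/QST group contains
# the given fiscal year, 0 otherwise.
asked_in <- function(df, year) {
  ave(as.integer(df$FY == year), df$Region, df$QST, FUN = max)
}

fy13 <- asked_in(testdf, "FY13")
fy14 <- asked_in(testdf, "FY14")
fy15 <- asked_in(testdf, "FY15")
n_years <- fy13 + fy14 + fy15

testdf$Tally <- ifelse(n_years == 1, "Asked Only on One Year",
                ifelse(n_years == 3, "Asked Over the Past Three Years",
                ifelse(fy14 + fy15 == 2, "Asked Over the Past Two Years",
                ifelse(fy13 + fy15 == 2, "Asked On FY13 & FY15 Only",
                       NA_character_))))

testdf[c("Region", "QST", "FY", "Tally")]
```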

Answer 3: (score: 0)

There is a solution that uses your ID column. (It can be created more simply with paste0, though.)

testdf$ID <- paste0(testdf$Region, "_", testdf$QST)

We reshape your testdf with dcast from reshape2:

library(reshape2)
tmp <- dcast(testdf, ID ~ FY, value.var="QST", fun.aggregate=length)

Now we already know whether the question was asked in the different years. To answer the other questions, we do a little math.

# columns 2:4 of tmp are FY13, FY14 and FY15
tmp <- cbind(tmp, 
             past2=as.numeric(tmp[[3]] + tmp[[4]] == 2 & tmp[[2]] == 0), 
             past3=as.numeric(tmp[[2]] + tmp[[3]] + tmp[[4]] == 3),
             y13_15=as.numeric(tmp[[2]] + tmp[[4]] == 2 & tmp[[3]] == 0))

The sequences in columns 5:7 contain the information we want, which we can collapse into a Tally column,

tmp$Tally <- apply(tmp, 1, function(x) paste0(x[5:7], collapse=""))

translate into human language through the factor labels,

# labels follow the sorted codes "000", "001", "010", "100",
# which assumes that all four codes occur in the data
tmp$Tally <- factor(tmp$Tally, labels=c('Question Asked Once Only',
                                        'Question Asked in FY13 & FY15 Only',
                                        'Asked Over The Past Three Years',
                                        'Asked Over The Past Two Years'))

and merge with the original data frame to get the desired result.

Result

> merge(testdf, tmp[c(1, 8)])
             ID   FY    Region QST                              Tally
1     AFRICA_Q2 FY13    AFRICA  Q2    Asked Over The Past Three Years
2     AFRICA_Q2 FY14    AFRICA  Q2    Asked Over The Past Three Years
3     AFRICA_Q2 FY15    AFRICA  Q2    Asked Over The Past Three Years
4     AFRICA_Q5 FY14    AFRICA  Q5      Asked Over The Past Two Years
5     AFRICA_Q5 FY15    AFRICA  Q5      Asked Over The Past Two Years
6    AMERICA_Q2 FY13   AMERICA  Q2    Asked Over The Past Three Years
7    AMERICA_Q2 FY14   AMERICA  Q2    Asked Over The Past Three Years
8    AMERICA_Q2 FY15   AMERICA  Q2    Asked Over The Past Three Years
9    AMERICA_Q5 FY13   AMERICA  Q5    Asked Over The Past Three Years
10   AMERICA_Q5 FY14   AMERICA  Q5    Asked Over The Past Three Years
11   AMERICA_Q5 FY15   AMERICA  Q5    Asked Over The Past Three Years
12 ANTH.CTRY_Q2 FY15 ANTH.CTRY  Q2           Question Asked Once Only
13      ASIA_Q2 FY13      ASIA  Q2    Asked Over The Past Three Years
14      ASIA_Q2 FY14      ASIA  Q2    Asked Over The Past Three Years
15      ASIA_Q2 FY15      ASIA  Q2    Asked Over The Past Three Years
16      ASIA_Q5 FY13      ASIA  Q5 Question Asked in FY13 & FY15 Only
17      ASIA_Q5 FY15      ASIA  Q5 Question Asked in FY13 & FY15 Only

Data

testdf <- structure(list(FY = c("FY13", "FY14", "FY15", "FY14", "FY15", 
"FY13", "FY14", "FY15", "FY13", "FY15", "FY13", "FY14", "FY15", 
"FY13", "FY14", "FY15", "FY15"), Region = c("AFRICA", "AFRICA", 
"AFRICA", "AFRICA", "AFRICA", "ASIA", "ASIA", "ASIA", "ASIA", 
"ASIA", "AMERICA", "AMERICA", "AMERICA", "AMERICA", "AMERICA", 
"AMERICA", "ANTH.CTRY"), QST = c("Q2", "Q2", "Q2", "Q5", "Q5", 
"Q2", "Q2", "Q2", "Q5", "Q5", "Q2", "Q2", "Q2", "Q5", "Q5", "Q5", 
"Q2")), row.names = c(NA, 17L), class = "data.frame")
