Loop over 2 levels of factors - for each factor and each date

Time: 2016-10-13 11:14:44

Tags: r loops factors

I have a lot of data with 5 variables: subject, date, date+hour, a measurement (a concentration), and feeding.

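So for each subject we have a series of measurements from date+hour(1) to date+hour(n), i.e. n measurements per subject. What I want to do is compute the recording time of each row as date+hour[i] - date+hour[1] for each subject. For that I wrote a loop. It worked well until I realized that each subject has several days of recordings. That means I have to compute the recording time for each subject and each date.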

Here is my script:

    getwd()
    setwd("H:/OptiMIR LMD files/week1")

    Week1<-read.csv("week1.csv", header=T)
    head(Week1)
    colnames(Week1)<-c("CowID","Date", "DateHour","Measure","Feeding")
    head(Week1)


    #Association colums with class
    Week1$CowID<-as.factor(Week1$CowID)
    Week1$Date<-as.Date(Week1$Date, format = "%d/%m/%Y")
    Week1$DateHour<-strptime(Week1$DateHour, format = "%Y/%m/%d/%H:%M:%S")
    Week1$Measure<-as.numeric(as.vector(Week1$Measure))
    Week1$Feeding<-as.factor(Week1$Feeding)
    str(Week1)

    summary(Week1)
    unique(Week1$CowID) 

    #Calculate Time of measure
    library(lubridate)
    library(foreach)

    Time<-c()
    #nrow(LMD)
    for (i in 1:nrow(Week1)) {
      for (j in unique(Week1$CowID)) {
        for (k in unique(Week1$Date)) {
          if (Week1$CowID[i]==j & Week1$Date[i]==k) {
            foreach(unique(Week1$CowID) & unique(Week1$Date))
            Time[i]<-c(difftime(Week1[i,3], Week1[match(k,Week1$Date),3], units="secs"))
          }
        }
      }
    }

    Week1<-cbind(Week1,Time)

Here are the head and the summary of the data:

    > head(Week1)
      CowID       Date            DateHour Measure Feeding
    1  1990 2014-01-13 2014-01-13 16:21:02     119    hoko
    2  1990 2014-01-13 2014-01-13 16:21:02     116    hoko
    3  1990 2014-01-13 2014-01-13 16:21:03     111    hoko
    4  1990 2014-01-13 2014-01-13 16:21:03      77    hoko
    5  1990 2014-01-13 2014-01-13 16:21:04      60    hoko
    6  1990 2014-01-13 2014-01-13 16:21:04      65    hoko

    > summary(Week1)
         CowID            Date               DateHour                  
     2239   : 1841   Min.   :2014-01-13   Min.   :2014-01-13 14:33:05  
     2067   : 1816   1st Qu.:2014-01-13   1st Qu.:2014-01-13 16:10:14  
     2246   : 1797   Median :2014-01-14   Median :2014-01-14 15:10:51  
     2062   : 1792   Mean   :2014-01-13   Mean   :2014-01-14 14:55:45  
     2248   : 1757   3rd Qu.:2014-01-15   3rd Qu.:2014-01-15 14:32:59  
     2171   : 1738   Max.   :2014-01-15   Max.   :2014-01-15 15:55:09  
     (Other):14259                                                     
        Measure        Feeding     
     Min.   :   4.0   hoko :16857  
     1st Qu.:  65.0   strap: 8143  
     Median : 108.0                
     Mean   : 147.4                
     3rd Qu.: 185.0                
     Max.   :1521.0

So for cow 1990 there will be other recording dates as well. That is my problem, because this loop:

    Time<-c()
    for (i in 1:nrow(Week1)) {
      for (j in unique(Week1$CowID)) {
        for (k in min(Week1$Date):max(Week1$Date)) {
          if ((Week1$CowID[i]==j) & (Week1$Date[i]==k)) {
            Time[i]<-c(difftime(Week1[i,3], Week1[match(k, Week1$Date),3], units="secs"))
          }
        }
      }
    }

works when I have one day of measurements per subject. But now that I have several days of recordings, it works for one subject, and when it gets to another subject I get negative recording times...

I think I know where the problem is: in the loop, "for k ...". I have to tell R that it must look at one date for each unique subject. But I don't know how to do that.

Thanks

1 Answer:

Answer 0: (score: 0)

for loops are a poor way to do by-group operations in R. data.table and dplyr offer faster and friendlier alternatives:

    library(dplyr)
    group_by(Week1, CowID, Date) %>% 
        mutate(Time = DateHour - min(DateHour))
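
To see why grouping by both CowID and Date removes the negative values, here is a minimal base R sketch on made-up data. The `toy` data frame and its values are hypothetical, and the vectorized `match()` line only mimics the baseline lookup in the question's loop. `ave()` is used here as a base R stand-in for the grouped `mutate()`.

    # Hypothetical toy data: two cows recorded on the same day,
    # cow "B" starting earlier in the day than cow "A".
    toy <- data.frame(
      CowID = c("A", "A", "B", "B"),
      Date = as.Date(rep("2014-01-13", 4)),
      DateHour = as.POSIXct(
        c("2014-01-13 16:00:00", "2014-01-13 16:00:10",
          "2014-01-13 14:00:00", "2014-01-13 14:00:05"),
        tz = "UTC"
      ),
      stringsAsFactors = FALSE
    )

    # Baseline as in the question's loop: the first row with a matching
    # date, whichever cow that row belongs to.
    first_row_of_date <- match(toy$Date, toy$Date)
    loop_time <- as.numeric(
      difftime(toy$DateHour, toy$DateHour[first_row_of_date], units = "secs")
    )

    # Baseline per (CowID, Date) group: the earliest timestamp in the group.
    secs <- as.numeric(toy$DateHour)  # seconds since the epoch
    group_time <- ave(secs, paste(toy$CowID, toy$Date),
                      FUN = function(x) x - min(x))

    print(loop_time)   # cow "B" is measured against cow "A"'s first row
    print(group_time)  # each cow/date starts from its own first timestamp

In the first result, cow "B" gets negative times because its rows are compared with a row belonging to cow "A". In the second, every cow/date combination has its own zero point.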

Note that if your datetime column is of class POSIXlt, you will need to convert it to POSIXct first using as.POSIXct().
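
As a rough sketch of that conversion combined with the grouped calculation, assuming dplyr is installed: the `week` data frame below is a hypothetical stand-in for `Week1`, and wrapping the result in `difftime(..., units = "secs")` is an addition of this sketch. It is there because plain subtraction of date-times picks its units automatically, which may not be seconds.

    library(dplyr)

    # strptime() returns POSIXlt, as in the question's script.
    stamps <- strptime(
      c("2014-01-13 16:21:02", "2014-01-13 16:21:03",
        "2014-01-14 15:00:00", "2014-01-14 15:02:00"),
      format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
    )

    # Hypothetical stand-in for Week1; DateHour is converted to POSIXct.
    week <- data.frame(
      CowID = factor(c(1990, 1990, 2239, 2239)),
      Date = as.Date(c("2014-01-13", "2014-01-13", "2014-01-14", "2014-01-14")),
      DateHour = as.POSIXct(stamps)
    )

    # Seconds elapsed since the first record of each cow on each date.
    result <- week %>%
      group_by(CowID, Date) %>%
      mutate(Time = as.numeric(difftime(DateHour, min(DateHour), units = "secs"))) %>%
      ungroup()

    print(result)

Storing `Time` as plain numeric seconds keeps the column comparable across groups, matching the `units="secs"` used in the question.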