R: Error in UseMethod("group_by_"): no applicable method applied to an object of class

Time: 2018-01-16 21:25:19

Tags: r csv dplyr

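I am working on an assignment in which I have to read in a CSV file and then pass it to a function that converts it into a class object. I managed to read in the CSV file and convert it into an object by doing the following: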

make_LD <- function(x){
  structure(list(id = c(x$id), visit = c(x$visit),
                 room = c(x$room), value = c(x$value), timepoint = c(x$timepoint)), class = "LongitudinalData")
}

A reproducible version of the input CSV file:

data <- structure(list(id = c(14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 
14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L), 
    visit = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), room = c("bedroom", "bedroom", 
    "bedroom", "bedroom", "bedroom", "bedroom", "bedroom", "bedroom", 
    "bedroom", "bedroom", "bedroom", "bedroom", "bedroom", "bedroom", 
    "bedroom", "bedroom", "bedroom", "bedroom", "bedroom", "bedroom"
    ), value = c(6, 6, 2.75, 2.75, 2.75, 2.75, 6, 6, 2.75, 2.75, 
    2.75, 2.75, 2.75, 2.75, 2.75, 2.75, 2.75, 2.75, 2.75, 2.75
    ), timepoint = 53:72), .Names = c("id", "visit", "room", 
"value", "timepoint"), class = "data.frame", row.names = c(NA, 
-20L))

How I run this code:

## Read in the data
library(readr)
library(magrittr)
library(dplyr)
source("oop_code_2.R")
## Load any other packages that you may need to execute your code

data <- read_csv("data/MIE.csv")
x <- make_LD(data)
out <- subject(x, 14)

After that, I take the resulting object and pass it to a generic function:

subject <- function(x, id) UseMethod("subject")
subject.LongitudinalData <- function(x, subj){
  subj_exist <- x %>%
    group_by_(x$id) %>%
    filter(x$id == subj)
  return(subj_exist)
}

When I run the code, it produces this error:

Error in UseMethod("group_by_") : 
  no applicable method for 'group_by_' applied to an object of class "LongitudinalData"

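I noticed that the data read in from the CSV file is organized in columns, whereas after I convert it into an object the data format has changed into fields.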

The question is, what am I doing wrong? Thanks!

Edit/added:

When I run this code on the data from the CSV, it works fine, as shown below, in case that helps.

> datatest1 <- data %>%
+ group_by(id, visit, room) %>%
+ select(id, visit, room , value) %>%
+ filter(id == 14) %>%
+ summarise(valmean = mean(value))
> print(datatest1)
# A tibble: 6 x 4
# Groups:   id, visit [?]
     id visit         room   valmean
  <int> <int>        <chr>     <dbl>
1    14     0      bedroom  4.786592
2    14     0  living room  2.750000
3    14     1      bedroom  3.401442
4    14     1 family  room  8.426549
5    14     2      bedroom 18.583635
6    14     2  living room 22.550694

When the same is done on the LongitudinalData object, an error is thrown:

> datatest2 <- x %>%
+ group_by(id, visit, room) %>%
+ select(id, visit, room , value) %>%
+ filter(id == 14) %>%
+ summarise(valmean = mean(values))
Error in UseMethod("group_by_") : 
  no applicable method for 'group_by_' applied to an object of class "LongitudinalData"

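The problem may also come from the way the data is formatted. Below is a sample of the data before and after it is converted into a LongitudinalData object.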

> head(data)
# A tibble: 6 x 5
     id visit    room value timepoint
  <int> <int>   <chr> <dbl>     <int>
1    14     0 bedroom  6.00        53
2    14     0 bedroom  6.00        54
3    14     0 bedroom  2.75        55
4    14     0 bedroom  2.75        56
5    14     0 bedroom  2.75        57
6    14     0 bedroom  2.75        58
> head(x)
$id
   [1] 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14
  [40] 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14
  [79] 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14
$visit
   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  [60] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [119] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Here is a link to the data: data

3 answers:

Answer 0: (score: 1)

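By setting the class attribute to "LongitudinalData", you are telling R to use only methods for .LongitudinalData. Just as the subject.LongitudinalData you defined is what gets called when you run subject(x, 14), when you call group_by_, R looks for group_by_.LongitudinalData. But of course, since you just invented this class, that method does not exist.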

However, R has a simple inheritance-like feature: you can specify a fallback class to be used when the main class has no method.

From ?class:

  

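When a generic function fun is applied to an object with class attribute c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found, a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used (if it exists). If there is no class attribute, the implicit class is tried, then the default method.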
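The fallback described in that passage can be seen with a toy generic. This is only a minimal sketch in base R: the generic describe and the class names "first" and "second" are made up for illustration and are not part of the question's code.

```r
# Hypothetical generic, used only to illustrate the S3 dispatch order
describe <- function(x) UseMethod("describe")
describe.second  <- function(x) "method for 'second'"
describe.default <- function(x) "default method"

# There is no describe.first, so R moves on to describe.second
obj <- structure(list(), class = c("first", "second"))
res_fallback <- describe(obj)

# With only the unknown class "first", R ends up at describe.default
lonely <- structure(list(), class = "first")
res_default <- describe(lonely)

print(res_fallback)
print(res_default)
```

group_by_ has no default method that accepts an arbitrary list, which is why the single-class object in the question ends in an error instead of a fallback.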

So you can specify that your LongitudinalData object can also be treated like a data frame:

make_LD <- function(x){
  structure(list(id = c(x$id), visit = c(x$visit),
                 room = c(x$room), value = c(x$value), timepoint = c(x$timepoint)), 
  class = c("LongitudinalData", "data.frame"))
}

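However, this object lacks some of the additional structure that a data frame has, so it is usually better to build on the existing object than to create a new class from scratch: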

make_LD <- function (x) {
  class(x) <- c("LongitudinalData", class(x))
  x
}
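To make the difference between the two constructors concrete, here is a small sketch in base R. The toy data frame and the names make_LD_list and make_LD_keep are assumptions introduced for illustration; the values do not come from MIE.csv.

```r
# Variant 1 (hypothetical name): rebuild the object from a plain list
make_LD_list <- function(x) {
  structure(list(id = x$id, value = x$value),
            class = c("LongitudinalData", "data.frame"))
}

# Variant 2 (hypothetical name): keep the object, prepend the new class
make_LD_keep <- function(x) {
  class(x) <- c("LongitudinalData", class(x))
  x
}

# Illustrative stand-in for the CSV data
toy <- data.frame(id = c(14L, 14L, 20L), value = c(6, 2.75, 3.5))

from_list <- make_LD_list(toy)
kept      <- make_LD_keep(toy)

# The list-based object never received a row.names attribute
list_has_rownames <- !is.null(attr(from_list, "row.names"))
kept_has_rownames <- !is.null(attr(kept, "row.names"))

print(class(kept))
print(c(list_has_rownames, kept_has_rownames))
print(nrow(kept))
```

Because the second variant leaves every existing attribute in place, functions written for data frames keep seeing a complete data frame.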

Note that your subject.LongitudinalData method has a few other problems that need to be corrected before it will run. I recommend reading vignette("programming", package = "dplyr").
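One possible corrected method is sketched below. It is not the answerer's code: it assumes the class-prepending make_LD shown above and a recent dplyr version, uses a small made-up data frame instead of MIE.csv, and introduces the local name wanted purely for illustration.

```r
library(dplyr)

make_LD <- function(x) {
  class(x) <- c("LongitudinalData", class(x))
  x
}

subject <- function(x, id) UseMethod("subject")

# Refer to the column through the data mask instead of x$id. Grouping is
# not needed just to pick out one subject, so filter() is called directly.
subject.LongitudinalData <- function(x, id) {
  wanted <- id  # local copy, so the argument is not mistaken for the id column
  dplyr::filter(x, .data$id == wanted)
}

# Illustrative stand-in for the CSV data
toy <- data.frame(id = c(14L, 14L, 20L),
                  visit = c(0L, 1L, 0L),
                  room = c("bedroom", "den", "bedroom"),
                  value = c(6, 2.75, 3.5))

out <- subject(make_LD(toy), 14)
print(as.data.frame(out))
```

The method's arguments are named (x, id) to match the generic, which avoids the mismatch between id and subj in the original definition.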

Answer 1: (score: 0)

I read in the CSV file you added in the comments, which is the source you used:

# Replace this with the path to the CSV file on your computer
df <- read.csv("C:/Users/username/Desktop/_257dbf6be13177cd110e3ef91b34ff67_data/data/MIE.csv", header=TRUE, sep=",", stringsAsFactors=FALSE)

I ran the code:

make_LD <- function(x){
  structure(list(id = c(x$id), visit = c(x$visit),
                 room = c(x$room), value = c(x$value), timepoint = c(x$timepoint)), class = "LongitudinalData")
}

subject <- function(x, id) UseMethod("subject")
subject.LongitudinalData <- function(x, subj){
  subj_exist <- x %>%
    group_by_(x$id) %>%
    filter(x$id == subj)
  return(subj_exist)
}

I did not get any errors.

Answer 2: (score: 0)

I am working on the same assignment, and I am using S4 classes.

Here is my solution:

library(dplyr)

setClass('longitudinalData', 
         representation = representation(
                        id = "numeric", 
                        visit = "numeric",
                        room = "character",
                        value = "numeric",
                        timepoint = 'numeric')
         )


data = data.frame( id = rbinom(1000, 10, .75),
                   visit = sample(1:3, 1000, replace = TRUE),
                   room = sample(letters[1:5], 1000, replace = TRUE),
                   value = rnorm(1000, 50, 10),
                   timepoint = abs(rnorm(1000))
)

make_LD = function(data){
  new("longitudinalData", 
      id = as.numeric(data$id),
      visit = as.numeric(data$visit), 
      room = as.character(data$room),
      value = as.numeric(data$value),
      timepoint =as.numeric(data$timepoint))
}

x = make_LD (data)
print(x)

setGeneric(name = 'subject', def = function(.Object, n=1){standardGeneric('subject')})

setMethod(f='subject', signature = 'longitudinalData',
          definition = function(.Object, n=1) {
            if(n %in% .Object@id){
            x = data.frame(as.factor(.Object@id), as.factor(.Object@visit), .Object@room, .Object@value, .Object@timepoint)
            names(x) = c( 'id', 'visit', 'room', 'value', 'timepoint')
            out = x[which(x$id == n),] %>% group_by(visit)
            return(out)
            } else { stop(paste("Subject", n, "is not available", sep = " "))}
          })


subject(x, n=4) %>% summary

Note that I used a data.frame inside setMethod so that I can use the familiar (working) dplyr functions on it.

The output is as follows:

> subject(x, n=4) %>% summary
       id     visit room      value         timepoint      
 4      :12   1:2   a:2   Min.   :25.04   Min.   :0.02548  
 2      : 0   2:4   b:2   1st Qu.:44.80   1st Qu.:0.20043  
 3      : 0   3:6   c:1   Median :50.42   Median :0.53025  
 5      : 0         d:3   Mean   :47.73   Mean   :0.71829  
 6      : 0         e:4   3rd Qu.:52.44   3rd Qu.:1.13632  
 7      : 0               Max.   :64.83   Max.   :1.88971  
 (Other): 0 

The output for the id field does not look good. I think it can be fixed easily.
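One way the id column might be tidied is to drop the unused factor levels after subsetting. The snippet below is a sketch on a tiny made-up data frame, not a change to the method above.

```r
# Toy data standing in for the data frame built inside the S4 method
df <- data.frame(id = as.factor(c(2, 4, 4, 7)),
                 value = c(1.5, 2.5, 3.5, 4.5))

# Subsetting keeps every original level, including those with no rows left
sub <- df[df$id == 4, ]
levels_before <- levels(sub$id)

# droplevels() removes the levels that no longer occur
sub <- droplevels(sub)
levels_after <- levels(sub$id)

print(levels_before)
print(levels_after)
```

Keeping id numeric instead of converting it with as.factor would avoid the unused levels altogether.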

Feel free to edit the answer in this respect.

Hope it helps!!