Data frame output does not match expected output

Time: 2020-02-11 09:31:25

Tags: r

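I have a function that takes a data frame and returns basic summary statistics. My problem is that the function's output does not match the expected output.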

# create my data frame
x = c(55.3846, 54.5385, 54.1538, 54.8205, 54.7692, 54.7179)
y = c(47.1795, 47.0256, 47.4872, 47.4103, 47.3333, 47.8718)
df = data.frame(x,y)

# create function to create summary statistics
xy_stats <- function(data) {
  x_mean <- mean(data$x)
  y_mean <- mean(data$y)
  x_sd <- sd(data$x)
  y_sd <- sd(data$y)
  corr <- cor(data$x, data$y, method = "pearson")
  xydata <- data.frame(x_mean, y_mean, x_sd, y_sd, corr)
  return(xydata)
}

# test function on data frame
df_results <- xy_stats(df)

This produces the output:

> xy_stats(df)
    x_mean   y_mean      x_sd      y_sd       corr
1 54.73075 47.38462 0.4017586 0.2905615 -0.2230826

Then I create the expected output:

# create test data (expected output)
test_data <- c(
            "x_mean" = 54.26,
            "y_mean" = 47.83,
            "x_sd" = 0.46,
            "y_sd" = 0.29,
            "corr" = -0.265
         )

It looks like this:

> test_data
x_mean y_mean   x_sd   y_sd   corr 
54.260 47.830  0.460  0.290 -0.265

Then I compare the function output with the expected output:

library(testthat)
expect_equal(df_results,test_data,tolerance=1)

The output is:

Error: `df_results` not equal to `test_data`.
Modes: list, numeric
Attributes: < names for target but not for current >
Attributes: < Length mismatch: comparison on first 0 components >

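I cannot change the expected result (test_data), but I can change the function so that its output matches the expected result. I can see that test_data has class numeric while df_results has class data.frame, but I don't know how to make the function return a numeric result. I tried replacing the following in the code, but it doesn't work: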

# Replace:
xydata <- data.frame(x_mean, y_mean, x_sd, y_sd, corr)
# with:
xydata <- data.frame(as.numeric(x_mean, y_mean, x_sd, y_sd, corr))

2 answers:

Answer 0: (score: 1)

You can unlist df_results, which flattens the one-row data frame into a named numeric vector, like this:

expect_equal(unlist(df_results),test_data,tolerance=1)

This runs without an error message.
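Since the question asks for a change to the function rather than to the test, the same idea can be moved inside xy_stats. The following is only a minimal sketch under that assumption: it builds the data frame as before and calls unlist() on it just before returning, so the caller gets a named numeric vector. The variable name res is hypothetical and used only for illustration.

```r
# Same sample data as in the question
x <- c(55.3846, 54.5385, 54.1538, 54.8205, 54.7692, 54.7179)
y <- c(47.1795, 47.0256, 47.4872, 47.4103, 47.3333, 47.8718)
df <- data.frame(x, y)

# Sketch: build the one-row data frame, then flatten it with unlist()
xy_stats <- function(data) {
  xydata <- data.frame(
    x_mean = mean(data$x),
    y_mean = mean(data$y),
    x_sd   = sd(data$x),
    y_sd   = sd(data$y),
    corr   = cor(data$x, data$y, method = "pearson")
  )
  # unlist() on a one-row data frame keeps the column names as element names
  unlist(xydata)
}

res <- xy_stats(df)  # 'res' is a hypothetical name
str(res)             # shows a named numeric vector of length 5
```

With this version, expect_equal(xy_stats(df), test_data, tolerance = 1) compares two named numeric vectors, so the mode and names mismatches from the original error should no longer apply.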

Answer 1: (score: 0)

Your problem is that test_data is a named vector, whereas the output of xy_stats is a data frame.

Why not just have xy_stats return a named vector?

xy_stats <- function(data) 
{
  c("x_mean" = mean(data$x), 
    "y_mean" = mean(data$y), 
    "x_sd"   = sd(data$x), 
    "y_sd"   = sd(data$y), 
    "corr"   = cor(data$x, data$y, method = "pearson"))
}

Now, when you do

df_results <- xy_stats(df)

test_data <- c(
            "x_mean" = 54.26,
            "y_mean" = 47.83,
            "x_sd" = 0.46,
            "y_sd" = 0.29,
            "corr" = -0.265
         )

expect_equal(df_results, test_data, tolerance = 1)

it passes without any problems.
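One thing worth keeping in mind is that tolerance = 1 is a very loose check. The sketch below uses base R's all.equal() as a stand-in, on the assumption that expect_equal() compares numeric vectors in a similar way (the "Modes: list, numeric" message in the question has the same shape as all.equal() output). The names results, expected, loose and strict are hypothetical.

```r
# Same sample data as in the question
x <- c(55.3846, 54.5385, 54.1538, 54.8205, 54.7692, 54.7179)
y <- c(47.1795, 47.0256, 47.4872, 47.4103, 47.3333, 47.8718)

# Named numeric vector, as returned by the rewritten xy_stats
results <- c(
  x_mean = mean(x),
  y_mean = mean(y),
  x_sd   = sd(x),
  y_sd   = sd(y),
  corr   = cor(x, y, method = "pearson")
)

# Expected values from the question
expected <- c(
  x_mean = 54.26,
  y_mean = 47.83,
  x_sd   = 0.46,
  y_sd   = 0.29,
  corr   = -0.265
)

# all.equal() returns TRUE or a character description of the difference,
# so wrap it in isTRUE() to get a plain logical
loose  <- isTRUE(all.equal(results, expected, tolerance = 1))     # TRUE
strict <- isTRUE(all.equal(results, expected, tolerance = 1e-3))  # FALSE

print(c(loose = loose, strict = strict))
```

So the test passes here mainly because the tolerance is wide; the computed means (about 54.73 and 47.38) differ noticeably from the expected 54.26 and 47.83. If test_data cannot be changed, that may be acceptable, but it is worth knowing what the test actually guarantees.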