Question

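I have been using SAS and SQL for a while, and I am now trying to get into R through a course. My instructor has set the following task:

"Using the Iris dataset, write an R function that takes an Iris species and an attribute name as its arguments and returns the minimum and maximum values of that attribute for that species."

It sounded simple at first, but I am struggling to get the function working. Here is what I have so far: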

 #write the function
question_2 <- function(x, y, data){
new_table <- subset(data, Species==x)
themin <-min(new_table$y)
themax <-max(new_table$y)
return(themin)
return(themax)}
#test the function - Species , Attribute, Data
question_2("setosa",Sepal.Width, iris)

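I figured I needed quotes around the species when calling the function, but I get an error along the lines of "no non-missing arguments to min/max", which I am guessing means that my attempt to create 'new_table' has brought back zero observations.

Can anyone see where I am going wrong?

Edit: Thanks to everyone for the quick and insightful responses. I will take this reading on board. Thanks again!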

Answer 1

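In fact, your teacher did not give you the easiest task. You were almost right. You cannot return twice from a function: the function exits at the first return(), so the second one is never reached. The column also has to be looked up with [[y]] rather than $y, and the attribute name has to be passed as a quoted string.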

question_2 <- function(x, y, data) {
  new_table <- subset(data, Species == x)
  themin <- min(new_table[[y]])
  themax <- max(new_table[[y]])
  return(list(themin, themax))
}

question_2("setosa","Sepal.Width", iris)
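To make the point about return() concrete, here is a minimal sketch. The names two_returns and question_2_range are made up for illustration and are not part of the original answer; range() is base R and gives the minimum and maximum in one call.

```r
# Hypothetical helper: only the first return() is ever reached.
two_returns <- function() {
  return("first")
  return("second")  # never executed
}
two_returns()

# Hypothetical variant of question_2 that returns a named numeric vector.
question_2_range <- function(x, y, data) {
  new_table <- subset(data, Species == x)
  rng <- range(new_table[[y]])   # c(minimum, maximum)
  names(rng) <- c("min", "max")
  rng
}

question_2_range("setosa", "Sepal.Width", iris)
```

A named vector is often easier to read than an unnamed list, but either works for the assignment.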

Answer 2

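df$colname cannot be used with a variable on the right-hand side of $, because it searches for a column literally named "colname" (in your case "y"), not for the value that a character variable colname (if it exists) holds.

The syntax df[["colname"]] is useful in this situation because it accepts character input, which can also be a variable holding a character string. This works for objects of type list and data.frame. In fact, a data.frame can be seen as a list of vectors.

Example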

df <- data.frame(col1 = 5:7, col2 = letters[1:3])
a <- "col1"

# $ vs [[
df$col1       # works because "col1" is a column of df
df$a          # does not work because "a" is not a column of df
df[["col1"]]  # works because "col1" is a column of df
df[[a]]       # works because "col1" is a column of df

# dataframes can be seen as list of vectors
ls <- list(col1 = 5:7, col2 = letters[1:3])
ls$col1  # works
ls[[a]]  # works
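This also seems to account for the message in the question. The sketch below uses a hypothetical helper, show_dollar, which is not from the original post: inside a function, data$y looks for a column literally named y and yields NULL for iris, and min(NULL) then warns that there are no non-missing arguments and returns Inf.

```r
# Hypothetical helper contrasting $ and [[ inside a function.
show_dollar <- function(y, data) {
  by_dollar  <- data$y     # looks for a column named "y"; NULL for iris
  by_bracket <- data[[y]]  # uses the string stored in y
  list(dollar = by_dollar, bracket_min = min(by_bracket))
}

res <- show_dollar("Sepal.Width", iris)
is.null(res$dollar)  # TRUE
res$bracket_min      # minimum of Sepal.Width over all species

# min(NULL) returns Inf and warns "no non-missing arguments to min"
suppressWarnings(min(NULL))
```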

Answer 3

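One problem is that Sepal.Width appears to be some object in your workspace. Otherwise R would yell at you: Object "Sepal.Width" not found. Whatever Sepal.Width (the object) is, it is probably not a character string with the value "Sepal.Width". But even if it were, R would not know how to use the $ operator to get the column of that name from new_table, at least not without some unnecessarily advanced programming. @Flo.P's suggestion of using [[ is a good one.

You have to pass y as "Sepal.Width".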
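For the curious, the kind of "advanced programming" that would let the column be passed unquoted might look roughly like the sketch below. The function name question_2_nse is hypothetical; substitute() and deparse() are base R. This is shown only to illustrate why quoting the name is the simpler route.

```r
# Hypothetical variant that accepts the column name unquoted.
question_2_nse <- function(x, y, data) {
  colname <- deparse(substitute(y))  # Sepal.Width becomes "Sepal.Width"
  new_table <- data[data$Species == x, ]
  c(min(new_table[[colname]]), max(new_table[[colname]]))
}

question_2_nse("setosa", Sepal.Width, iris)
```

Note that this version no longer works when the name is passed as a quoted string, which is one reason the plain character argument is usually preferable.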

Another approach: you can take advantage of subset by writing this code:

question_2 <- function(x, y, data) {
  newy <- subset(data, subset = Species == x, select = y)
  themin <- min(newy)
  themax <- max(newy)
  return(c(themin, themax))
}

question_2("setosa","Sepal.Width", iris)
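The R documentation describes subset as a convenience function intended for interactive use and recommends standard subsetting such as [ for programming. A sketch of the same idea with plain indexing follows; the name question_2_idx and the input check are additions for illustration, not part of the original answer.

```r
# Hypothetical variant using standard indexing instead of subset().
question_2_idx <- function(x, y, data) {
  # Fail early if y is not a single existing column name.
  stopifnot(is.character(y), length(y) == 1, y %in% names(data))
  values <- data[data$Species == x, y]  # a plain vector for one column
  c(min = min(values), max = max(values))
}

question_2_idx("setosa", "Sepal.Width", iris)
```

The early check turns a misspelled or unquoted column name into an immediate error instead of a confusing warning from min or max.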

R character variables in functions