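Need help using the input of one function as the input of another function in R

Time: 2014-07-31 01:13:57

Tags: r data.table

I need help figuring out how to use the input of the function below as the input of another R file.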

library(data.table)

Hotel <- function(hotel) {

  dat <- read.csv("demo.csv", header = TRUE)

  # Collapse each date to the first day of its month
  dat$Date <- as.Date(paste0(format(strptime(as.character(dat$Date), 
                                             "%m/%d/%y"), 
                                    "%Y/%m"),"/1"))

  # Monthly totals and mean index for every hotel
  table <- setDT(dat)[, list(Revenue = sum(Revenues),
                             Hours = sum(Hours),
                             Index = mean(Index)), 
                      by = list(Hotel, Date)]

  # Keep only the complete rows of the requested hotel
  answer <- na.omit(table[table$Hotel == hotel, ])

  if (nrow(answer) == 0) {
    stop("invalid hotel")
  }

  return(answer)
}

I would call Hotel("Hotel Name").

Here is another R file that uses the hotel name I entered above.

# Read the data frame returned by the Hotel function
star <- Hotel("Hotel Name")

# Calculate Revpolu and Index
Revpolu <- star$Revenue / star$Hours
Index <- star$Index

png(filename = "~/Desktop/result.png", width = 480, height= 480)
plot(Index, Revpolu, main = "Hotel Name", col = "green", pch = 20)

testing <- cor.test(Index, Revpolu)

write.table(testing[["p.value"]], file = "output.csv", sep = ";", row.names = FALSE, col.names = FALSE)
dev.off()

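I would like this part to be automated, so that I don't have to copy and paste the input from the first file and store it as a variable. Or, if it is easier, all of this could simply be one function.

Also, instead of having to enter one hotel name into the function, would it be possible to have the first file read all the hotel names (given that they are identified as row names in the .csv file) and have the second file read that as its input?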

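1 answer:

Answer 0: (score: 1)

Since your example is not reproducible and your code has some errors (it uses a column "Rooms" that is not produced by your function), I cannot give you a tested answer, but here's how you can structure your code to produce the statistics you want for all hotels without having to copy and paste hotel names: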

library(data.table)

# Use fread instead of read.csv, it's faster
dat <- fread("demo.csv", header = TRUE)

dat[, Date := as.Date(paste0(format(strptime(as.character(Date), "%m/%d/%y"), "%Y/%m"), "/1"))]

table <- dat[, list(
  Revenue = sum(Revenues),
  Hours = sum(Hours),
  Index = mean(Index)
  ), by = list(Hotel, Date)]

# cor.test() has no na.rm argument; it drops incomplete pairs on its own,
# so na.omit() is not strictly needed, but I kept it to keep the result similar.
table <- na.omit(table)

# Calculate Revpolu inside the data.table
table[, Revpolu := Revenue / Hours]

# You can compute a p-value for all hotels using a group by
testing <- table[, list(p.value = cor.test(Index, Revpolu)[["p.value"]]), by=Hotel]
write.table(testing, file = "output.csv", sep = ";", row.names = FALSE, col.names = FALSE)

# You can get individual plots for each hotel with a for loop.
# Each hotel gets its own file so the plots do not overwrite each other.
hotels <- unique(table$Hotel)
for (h in hotels) {
  png(filename = paste0("~/Desktop/result_", h, ".png"), width = 480, height = 480)
  plot(table[Hotel == h, Index], table[Hotel == h, Revpolu], main = h, col = "green", pch = 20)
  dev.off()
}
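
The question also asks whether all of this could be a single function. The steps above could be wrapped roughly as follows. This is only a sketch: the function name hotel_pvalues and the made-up demo data are assumptions for illustration, not part of the original code, and the column names (Hotel, Date, Revenues, Hours, Index) are taken from the question. It also guards against hotels with fewer than three monthly rows, for which cor.test() would otherwise stop with an error.

```r
library(data.table)

# Hypothetical helper: p-value of the Index/Revpolu correlation for every hotel.
# `dat` needs the columns Hotel, Date (text in %m/%d/%y), Revenues, Hours, Index.
hotel_pvalues <- function(dat) {
  dt <- as.data.table(dat)

  # Monthly totals per hotel; each date is collapsed to the first of its month
  monthly <- dt[, .(Revenue = sum(Revenues),
                    Hours = sum(Hours),
                    Index = mean(Index)),
                by = .(Hotel,
                       Month = as.Date(format(as.Date(as.character(Date),
                                                      format = "%m/%d/%y"),
                                              "%Y-%m-01")))]
  monthly <- na.omit(monthly)
  monthly[, Revpolu := Revenue / Hours]

  # cor.test() needs at least 3 complete pairs, so smaller groups get NA
  monthly[, .(p.value = if (.N >= 3) cor.test(Index, Revpolu)$p.value else NA_real_),
          by = Hotel]
}

# Made-up data standing in for demo.csv (assumption, for illustration only)
set.seed(1)
months <- sprintf("%02d/15/14", 1:6)
demo <- data.frame(
  Hotel = c(rep("A", 6), rep("B", 6), rep("C", 2)),
  Date = c(months, months, months[1:2]),
  Revenues = runif(14, 100, 200),
  Hours = runif(14, 10, 20),
  Index = runif(14)
)

res <- hotel_pvalues(demo)
print(res)
```

With the real file, the call would presumably be hotel_pvalues(fread("demo.csv")), and the result can be passed to write.table() as before. No hotel name has to be typed in, because the grouping by Hotel picks up every name present in the data.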