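I have searched through several options, mostly trying various combinations of cbind to accomplish this. Basically, I want to create a data frame that combines several different pivot tables into one data frame, so that it can be exported to csv / excel. Is there a better way to achieve this?
Edit: Basically, I am trying to learn the basics of writing a function that wraps several different pivot tables to create an export-ready data frame, as a template for ad hoc reports. The problem I am running into is that cbind takes object b, which on its own would be a table with the dates as columns, and forces it into a long table in which the dates become rows.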
Data frame:
State  FacilityName  Date
NY     Loew          June 2014
NY     Loew          June 2014
CA     Sunrise       May 2014
CA                   May 2014
Code:
volume <- function() {
  df$missing = ifelse(is.na(df$FacilityName), "Missing", df$FacilityName)
  df = subset(df, df$missing == "Missing")
  x <- function() {
    a <- as.data.frame(table(df$FacilityName))
    b <- table(df$FacilityName, df$Date)
    cbind(a, b[, 1], b[, 2])
  }
}
Answer 0 (score: 1)
When you give the table function a factor, it uses the levels of that factor to build the table. So a good way to get what you want is to add "Missing" to the levels of "FacilityName".
# loading data
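# stringsAsFactors = TRUE is needed on R >= 4.0, where read.csv
# no longer converts strings to factors by default
ec <- read.csv(text=
'State,FacilityName,Date
NY,Loew,June 2014
NY,Loew,June 2014
CA,Sunrise,May 2014
CA,NA,May 2014', stringsAsFactors = TRUE)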
# Adding Missing to the possible levels of FacilityName
# note that we add it in front
new.levels <- c("Missing", levels(ec$FacilityName))
ec$FacilityName <- factor(ec$FacilityName, levels=new.levels)
# And replacing NAs by the new level "Missing"
ec$FacilityName[is.na(ec$FacilityName)] <- "Missing"
# the previous line would not have worked
# if we had not added "Missing" explicitly to the levels
# table() uses the levels to generate the table
# the levels are displayed in order
# now there's a level "Missing" in first position
t <- table(ec$FacilityName, ec$Date)
You get:
> t
          June 2014 May 2014
  Missing         0        1
  Loew            2        0
  Sunrise         0        1
You can add a total row like this (I don't think your code using nrow does what you say it does):
# adding total line
rbind(t, TOTAL=colSums(as.matrix(t)))
        June 2014 May 2014
Missing         0        1
Loew            2        0
Sunrise         0        1
TOTAL           2        2
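As an alternative sketch (not part of the answer's own code), base R's addmargins() can append the column totals as well. It labels the new row "Sum" rather than "TOTAL". The data frame below is rebuilt by hand, which is an assumption made only to keep the example self-contained.

```r
# Rebuild the example data by hand (assumption: same values as above,
# with the NA facility already relabelled as "Missing").
ec <- data.frame(
  FacilityName = factor(c("Loew", "Loew", "Sunrise", "Missing"),
                        levels = c("Missing", "Loew", "Sunrise")),
  Date = c("June 2014", "June 2014", "May 2014", "May 2014")
)
tab <- table(ec$FacilityName, ec$Date)

# margin = 1 appends one extra row holding the sum of each column.
with_total <- addmargins(tab, margin = 1)
print(with_total)
```

The result is still a table, so it can be passed on in the same way as the rbind() version.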
At this point you have a matrix, so you will probably want to pass it to as.data.frame.
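One detail may be worth illustrating here: as.data.frame() behaves differently on a table than on a matrix. Called on a table, it returns the long layout (one row per facility/date pair), which looks like the reshaping described in the question. Called on the matrix returned by rbind(), it keeps the dates as columns. The sketch below assumes the same example data, rebuilt by hand, and writes to a temporary file whose path is purely illustrative.

```r
# Example data rebuilt by hand (assumption: same values as above).
ec <- data.frame(
  FacilityName = factor(c("Loew", "Loew", "Sunrise", "Missing"),
                        levels = c("Missing", "Loew", "Sunrise")),
  Date = c("June 2014", "June 2014", "May 2014", "May 2014")
)
tab <- table(ec$FacilityName, ec$Date)

# On a table: long format, one row per facility/date combination.
long <- as.data.frame(tab)

# On a matrix (the result of rbind): wide format, one column per date.
wide <- as.data.frame(rbind(tab, TOTAL = colSums(tab)))

# Export the wide version; row names carry the facility labels.
out_file <- tempfile(fileext = ".csv")
write.csv(wide, out_file)
```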
If you like, you can easily turn this into a separate function. There is no need to bind several tables after all :)
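A minimal sketch of such a function might look like the following. The name count_by_month and its arguments are hypothetical, chosen only for illustration. The body repeats the steps above: relabel NA as "Missing", tabulate, append a total row, and convert to a data frame.

```r
# Hypothetical helper: counts rows per facility and date, reports NA
# facilities as "Missing" and appends a TOTAL row at the bottom.
count_by_month <- function(data, row_var = "FacilityName", col_var = "Date") {
  rows <- as.character(data[[row_var]])
  rows[is.na(rows)] <- "Missing"
  # Put "Missing" first so that it becomes the first row of the table.
  rows <- factor(rows, levels = union("Missing", sort(unique(rows))))
  counts <- table(rows, data[[col_var]])
  out <- rbind(counts, TOTAL = colSums(counts))
  as.data.frame(out)
}

# Usage with the example data (rebuilt by hand here).
ec <- data.frame(
  State = c("NY", "NY", "CA", "CA"),
  FacilityName = c("Loew", "Loew", "Sunrise", NA),
  Date = c("June 2014", "June 2014", "May 2014", "May 2014")
)
report <- count_by_month(ec)
print(report)
```

Because "Missing" is always added as a level, the row appears with zero counts even when the data contain no NA values, which keeps the layout stable across reports.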
Answer 1 (score: 0)
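OK, so it looks like I was trying to be clever by wrapping everything in a function, hoping it would be a start on learning to write flexible code. In the end I went the long way round and got the result I wanted. I am posting the code below, but I would be very interested in anyone pointing me toward a better way of solving problems like this, so that I can learn to code better.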
# Label the empty cells as Missing
# (as.character() keeps the facility names if FacilityName is a factor)
ec$missing = ifelse(is.na(ec$FacilityName), "Missing", as.character(ec$FacilityName))
# Subset the dataframe to just missing values
df = subset(ec, ec$missing == "Missing")
# Create table that is a row of frequency by month for missing values
a <- table(df$missing, df$Date)
# Reload dataframe to exclude Missing values
df = subset(ec, ec$missing != "Missing")
# Create table that shows frequency of observations for each facility by Month
b <- table(df$FacilityName, df$Date)
# Create a Total row that can go at the bottom of the final data frame
Total <- nrow(ec)
# Bind all three objects
rbind(a,b,Total)
Here is an example of the final product I am looking for:
         May2014  June2014
Missing        2         0
Sunrise        0         0
Loew           1         2
Total          3         2