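I'm new to R, so apologies if this is a basic question. I have a data.frame in which all 14 variables are factors. Eleven of them are measured variables, each taking one of 10 different levels, but not every variable contains every level. I want to create a frequency table covering all 11 measured variables, with the measured variables as columns and the levels as rows.
The structure of my data.frame is as follows: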
Publication: Factor, 2 levels
Year: Factor, 5 levels
Energy Type: Factor, 6 levels
AQ: Factor, 3 levels
CA: Factor, 9 levels
CCM: Factor, 8 levels
FFR: Factor, 5 levels
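(My table did not format correctly in the display window.)
I'd like to end up with a frequency table that has all the measured variables (e.g. AQ, CA, CCM, FFR) as columns and the levels as rows, with NA inserted where a variable does not include a particular level.
I first tried creating multiple tables and then combining them with rbind, but not all measured variables contain all levels, so the resulting table was inaccurate. I then tried rbind.fill, which requires its inputs to be data.frames (not tables), but I ran into difficulties converting the tables to data.frames... I also tried reshaping and casting the data, but I don't think reshaping is the solution to my problem...
I'd appreciate any help on the best way to approach this. Michelle
Here is a sample of my table: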
Publication        Year  AQ  CCM  CA
Bangor Daily News  2006  No  No   No
Bangor Daily News  2006  No  No   R1
Bangor Daily News  2006  No  No   C1
Bangor Daily News  2006  No  No   No
Bangor Daily News  2006  No  No   C1
Answer 0 (score: 0)
Not sure about the expected result, but using this example data:
dat1 <- structure(list(Publication = structure(c(1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("1", "2"), class = "factor"),
Year = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L
), .Label = c("2004", "2005", "2006", "2007", "2008"), class = "factor"),
AQ = structure(c(5L, 5L, 2L, 5L, 4L, 3L, 4L, 1L, 4L, 4L), .Label = c("a",
"b", "c", "d", "e"), class = "factor"), CA = structure(c(3L,
4L, 5L, 2L, 3L, 5L, 5L, 1L, 3L, 3L), .Label = c("b", "c",
"d", "e", "f"), class = "factor"), CCM = structure(c(4L,
1L, 4L, 4L, 1L, 3L, 2L, 4L, 3L, 4L), .Label = c("d", "e",
"f", "h"), class = "factor")), .Names = c("Publication",
"Year", "AQ", "CA", "CCM"), row.names = c(NA, -10L), class = "data.frame")
Get the unique levels from the measured variables:
Un <- unique(unlist(dat1[,3:5]))
Then, after setting each column's levels to those of Un, use table to get the frequency for each column:
res <- sapply(dat1[,3:5], function(x) table(factor(x, levels=levels(Un))))
res[!res] <- NA # change 0's to NA
res
#    AQ CA CCM
# a   1 NA  NA
# b   1  1  NA
# c   1  1  NA
# d   4  4   2
# e   3  1   1
# f  NA  3   2
# h  NA NA   5
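
The same idea can be sketched on a smaller, self-contained example. The data frame `toy` and its values are made up for illustration (they are not the asker's real data), and `all_levels` and `freq` are hypothetical names. This variant builds the shared level set from `levels()` of each column rather than from the values, which should give the same level set whenever every level actually occurs in the data:

```r
# Toy data: hypothetical columns and values, only loosely modelled on the question
toy <- data.frame(
  AQ  = factor(c("No", "No", "R1", "No")),
  CA  = factor(c("No", "C1", "C1", "R1")),
  CCM = factor(c("No", "No", "No", "C2"))
)

# Union of the levels across all measured columns, sorted for a stable row order
all_levels <- sort(unique(unlist(lapply(toy, levels))))

# Tabulate each column against the shared level set;
# levels a column never uses are counted as 0
freq <- sapply(toy, function(x) table(factor(x, levels = all_levels)))

# Replace zero counts with NA, as requested in the question
freq[freq == 0] <- NA
freq
```

Because every column is tabulated against the same levels, each per-column table has the same length and `sapply` can bind them into one matrix, which avoids the mismatch seen with rbind. If a data.frame is needed afterwards, `as.data.frame(freq)` should convert the matrix while keeping the levels as row names.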