Is it possible to return ddply results for only certain values of the splitting variable? For example, take the data frame example:
example <- structure(list(shape = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L), .Label = c("circle", "square", "triangle"
), class = "factor"), property = structure(c(1L, 3L, 2L, 1L,
2L, 3L, 1L, 1L, 1L, 1L, 2L, 3L, 1L, 1L), .Label = c("color",
"intensity", "size"), class = "factor"), value = structure(c(5L,
2L, 1L, 5L, 4L, 1L, 5L, 6L, 6L, 7L, 4L, 3L, 6L, 5L), .Label = c("3",
"5", "6", "7", "blue", "green", "red"), class = "factor")), .Names = c("shape",
"property", "value"), class = "data.frame", row.names = c(NA,
-14L))
which looks like this:
shape property value
1 circle color blue
2 circle size 5
3 circle intensity 3
4 circle color blue
5 square intensity 7
6 square size 3
7 square color blue
8 square color green
9 square color green
10 triangle color red
11 triangle intensity 7
12 triangle size 6
13 triangle color green
14 triangle color blue
I want to return a data frame containing the number of each shape that has a particular color, like this:
shape property blue green red
1 circle color 2 0 0
2 square color 1 2 0
3 triangle color 1 1 1
However, I can't seem to get this to come out right! I have been using something like:
ColorSummary <- ddply(example,.(shape,property="color"), function(example) summary(example$value))
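But this returns a data frame that also contains columns for all the other unique entries in value (from the properties size and intensity, which I don't want):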
shape property 3 5 6 7 blue green red
1 circle color 1 1 0 0 2 0 0
2 square NA 1 0 0 1 1 2 0
3 triangle NA 0 0 1 1 1 1 1
What am I doing wrong? Is there a way to return a data frame like the first result I showed?
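Also, although this is a small, fast example, my "real" data is much larger and takes a long time to compute. Would restricting the computation to property="color" improve the speed of ddply?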
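Edit: Thanks for the answers! Unfortunately, I oversimplified the situation, and I'm not sure the dcast solution will work for me. Let me explain: I am actually working with the data frame example2: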
example2 <- structure(list(factory = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), shape = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L), .Label = c("circle",
"square", "triangle"), class = "factor"), property = structure(c(1L,
3L, 2L, 1L, 2L, 3L, 1L, 1L, 1L, 1L, 2L, 3L, 1L, 1L, 1L, 3L, 2L
), .Label = c("color", "intensity", "size"), class = "factor"),
value = structure(c(5L, 2L, 1L, 5L, 4L, 1L, 5L, 6L, 6L, 7L,
4L, 3L, 6L, 5L, 5L, 2L, 1L), .Label = c("3", "5", "6", "7",
"blue", "green", "red"), class = "factor")), .Names = c("factory",
"shape", "property", "value"), class = "data.frame", row.names = c(NA,
-17L))
I am trying to split by factory and shape. I used ddply:
ColorSummary2 <- ddply(example2,.(factory,shape,property="color"), function(example2) summary(example2$value))
which gives
factory shape property 3 5 6 7 blue green red
1 A circle color 1 1 0 0 2 0 0
2 A square NA 1 0 0 1 1 2 0
3 A triangle NA 0 0 1 1 1 1 1
4 B circle NA 1 1 0 0 1 0 0
But what I would like to return is (apologies for the messy table, I had trouble formatting it here):
factory shape property blue green red
1 A circle color 2 0 0
2 A square NA 1 2 0
3 A triangle NA 1 1 1
4 B circle NA 1 0 0
Is this possible?
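Edit 2: Sorry for all the edits, I oversimplified my situation. Here is a more complex data frame that is closer to my real example. This one has a state column, which I do not want to use for splitting. I can do this with ddply (messily), but can I ignore state using dcast?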
example3 <- structure(list(state = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("CA", "FL"
), class = "factor"), factory = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), shape = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L), .Label = c("circle",
"square", "triangle"), class = "factor"), property = structure(c(1L,
3L, 2L, 1L, 2L, 3L, 1L, 1L, 1L, 1L, 2L, 3L, 1L, 1L, 1L, 3L, 2L
), .Label = c("color", "intensity", "size"), class = "factor"),
value = structure(c(5L, 2L, 1L, 5L, 4L, 1L, 5L, 6L, 6L, 7L,
4L, 3L, 6L, 5L, 5L, 2L, 1L), .Label = c("3", "5", "6", "7",
"blue", "green", "red"), class = "factor")), .Names = c("state",
"factory", "shape", "property", "value"), class = "data.frame", row.names = c(NA,
-17L))
Answer 0 (score: 4)
Use dcast from reshape2:
dcast(...~value,data=subset(example,property=='color'))
Aggregation function missing: defaulting to length
shape property blue green red
1 circle color 2 0 0
2 square color 1 2 0
3 triangle color 1 1 1
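In a reshape2 formula, ... stands for every column not otherwise named in the formula, and the message above appears only because no aggregation function was supplied. A minimal sketch of the same call written out explicitly is shown here. The data frame is rebuilt with data.frame() purely so the snippet runs on its own, and the names color_rows and color_counts are illustrative, not from the original post.

```r
library(reshape2)

# Rebuild the question's `example` data frame (same 14 rows) so the
# snippet is self-contained.
example <- data.frame(
  shape = c(rep("circle", 4), rep("square", 5), rep("triangle", 5)),
  property = c("color", "size", "intensity", "color",
               "intensity", "size", "color", "color", "color",
               "color", "intensity", "size", "color", "color"),
  value = c("blue", "5", "3", "blue",
            "7", "3", "blue", "green", "green",
            "red", "7", "6", "green", "blue"),
  stringsAsFactors = TRUE
)

# Keep only the color rows first, so less data reaches the reshaping step.
color_rows <- subset(example, property == "color")

# Same as `... ~ value`, but with the grouping columns spelled out and
# the aggregation function given explicitly (no "defaulting to length" message).
color_counts <- dcast(color_rows, shape + property ~ value,
                      value.var = "value", fun.aggregate = length)
print(color_counts)
```

Because the rows are filtered before reshaping, the columns for the size and intensity values never appear in the result.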
With the second example dataset:
dcast(...~value,data=subset(example2,property=='color'))
Aggregation function missing: defaulting to length
factory shape property blue green red
1 A circle color 2 0 0
2 A square color 1 2 0
3 A triangle color 1 1 1
4 B circle color 1 0 0
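For the third dataset, ... ~ value would not ignore state, since ... expands to all remaining columns, state included. One possible approach (a sketch that goes beyond what is shown above, not a tested solution for the real data) is to name the grouping columns explicitly so that state never enters the formula. The data frame is rebuilt here with data.frame() so the snippet runs on its own; color_rows3 and color_counts3 are illustrative names.

```r
library(reshape2)

# Rebuild the question's `example3` data frame (same 17 rows).
example3 <- data.frame(
  state = rep(c("CA", "FL"), length.out = 17),
  factory = c(rep("A", 14), rep("B", 3)),
  shape = c(rep("circle", 4), rep("square", 5),
            rep("triangle", 5), rep("circle", 3)),
  property = c("color", "size", "intensity", "color",
               "intensity", "size", "color", "color", "color",
               "color", "intensity", "size", "color", "color",
               "color", "size", "intensity"),
  value = c("blue", "5", "3", "blue",
            "7", "3", "blue", "green", "green",
            "red", "7", "6", "green", "blue",
            "blue", "5", "3"),
  stringsAsFactors = TRUE
)

color_rows3 <- subset(example3, property == "color")

# `state` is left out of the formula, so rows that differ only in
# state are counted together.
color_counts3 <- dcast(color_rows3, factory + shape + property ~ value,
                       value.var = "value", fun.aggregate = length)
print(color_counts3)
```

Since the factory, shape, property and value columns of example3 are the same as in example2, the counts should match the example2 table above.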