Using the spread function with symbols in a data frame

Time: 2018-10-29 09:02:11

Tags: r tidyr

I have a very simple data set with two variables.


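I want to use the spread() function on the following data:

data <- data.frame(
    ID = c("A","A","B","C","D","D"),
    Service = c("Shop","Online","Shop","Online","Online","Shop"))

Rather than showing the Service values, I would like the resulting table to contain a "Y" symbol to indicate that the ID operates that particular service. For example:

ID Shop Online
A  Y    Y
B  Y    -
C  -    Y
D  Y    Y

However, spread() does not work when it is given only a key. Is there a way to do this with the existing spread() function, or do I have to use a different approach?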

2 answers:

Answer 0: (score: 2)

spread() needs a value column as well as a key, so you need to create a new column first:

library(tidyr)
library(dplyr)
data %>% 
  mutate(spread_col = "Y") %>% 
  spread(Service, spread_col, fill = "-")
#  ID Online Shop
#1  A      Y    Y
#2  B      -    Y
#3  C      Y    -
#4  D      Y    Y
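
As of tidyr 1.0.0, spread() is superseded by pivot_wider(), so the same idea can be written with the newer function. This is only a sketch, assuming tidyr 1.0.0 or later is installed; the column name flag is a hypothetical name chosen for illustration and is not from the question.

```r
library(tidyr)

# Same data as in the question, kept as character columns.
data <- data.frame(
  ID = c("A", "A", "B", "C", "D", "D"),
  Service = c("Shop", "Online", "Shop", "Online", "Online", "Shop"),
  stringsAsFactors = FALSE
)

# Helper value column; "flag" is a hypothetical name.
data$flag <- "Y"

# One column per Service, "Y" where the pair exists, "-" where it is missing.
wide <- pivot_wider(
  data,
  names_from = Service,
  values_from = flag,
  values_fill = list(flag = "-")
)
wide <- as.data.frame(wide)
print(wide)
```

Like spread(), this assumes each ID/Service pair appears at most once; duplicated pairs would need to be removed first, for example with unique(data).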

(This may be a duplicate of How to reshape data from long to wide format?)


You can also use dcast from data.table or reshape2:

reshape2::dcast(
  data,
  ID ~ Service,
  fun.aggregate = function(x) replace(x, x == x, "Y"),
  fill = "-"
)
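
The fun.aggregate above may look odd, so here is a small base R sketch of what it does on its own. The vector x is a made-up stand-in for the values that land in one cell; it is not part of the original answer.

```r
# Hypothetical stand-in for the values that fall into one ID/Service cell.
x <- c("Shop", "Online")

# x == x is TRUE for every non-missing element, so it selects all of them.
all_selected <- x == x

# replace() then overwrites every selected element with "Y".
flagged <- replace(x, all_selected, "Y")
print(flagged)
```

In other words, the function turns whatever is in a non-empty cell into "Y", and fill = "-" covers the cells that have no rows at all.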

Answer 1: (score: 0)

You can do this with base R.

Data: (using a factor variable)

data <- data.frame(
    ID = c("A","A","B","C","D","D"),
    # levels must be passed to factor(), not to data.frame()
    Service = factor(c("Shop","Online","Shop","Online","Online","Shop"), levels = c("Online","Shop")))

Code:

ans<-
do.call(
    rbind, tapply(data$Service, data$ID, table)
)

ans[ans == 1] = "Y"
ans[ans == 0] = "-"

Result:

#> ans
#  Online Shop
#A "Y"    "Y" 
#B "-"    "Y" 
#C "Y"    "-" 
#D "Y"    "Y" 
#>
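
A possible variation on the same base R idea is to cross-tabulate both columns in one table() call and then fill a character matrix from the counts. This is a sketch rather than part of the answer above; the names counts and flags are chosen here for illustration.

```r
data <- data.frame(
  ID = c("A", "A", "B", "C", "D", "D"),
  Service = c("Shop", "Online", "Shop", "Online", "Online", "Shop")
)

# Cross-tabulate ID against Service: one row per ID, one column per Service.
counts <- table(data$ID, data$Service)

# Start with "-" everywhere, then mark every cell that has at least one row.
flags <- matrix(
  "-",
  nrow = nrow(counts),
  ncol = ncol(counts),
  dimnames = list(rownames(counts), colnames(counts))
)
flags[counts > 0] <- "Y"
print(flags)
```

Testing counts > 0 instead of counts == 1 means an ID/Service pair that appears more than once is still shown as "Y".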