I have a file in which a clickstream is stored in CSV format. The data looks like this:
Row 1. User1 - Click1
Row 2. User1 - Click2
Row 3. User1 - Click3
Row 4. User2 - Click1
Row 5. User3 - Click1
Row 6. User3 - Click2
and so on.
Is there a function in R that can reshape the data into the following format?
Row 1. User1 - Click1 - Click2 - Click3
Row 2. User2 - Click1
Row 3. User3 - Click1 - Click2
Thanks
Answer 0 (score: 1)
library(reshape2)
df <- data.frame(user = rep(LETTERS[1:3], each = 3), click = rep(1:3, times = 3))
dfmelt <- melt(df, id = "user")
dfcast <- dcast(dfmelt, user ~ variable + value)
Here is the toy data:
> df
user click
1 A 1
2 A 2
3 A 3
4 B 1
5 B 2
6 B 3
7 C 1
8 C 2
9 C 3
The result is as follows:
> dfcast
user click_1 click_2 click_3
1 A 1 2 3
2 B 1 2 3
3 C 1 2 3
You can also do this in one line, but you won't get nice column names:
> dcast(df, user ~ click)
user 1 2 3
1 A 1 2 3
2 B 1 2 3
3 C 1 2 3
Answer 1 (score: 1)
This could be an option:
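library(data.table)       # provides setDT(); recent splitstackshape versions do not attach it
library(splitstackshape)  # provides cSplit()
# Collapse each user's clicks into one comma-separated string (column V1),
# then split that string into separate columns.
cSplit(setDT(df)[, toString(V4), by='V3'], 'V1', ',')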
# V3 V1_1 V1_2 V1_3
#1: User1 -Click1 -Click2 -Click3
#2: User2 -Click1 NA NA
#3: User3 -Click1 -Click2 NA
Data
df = structure(list(V1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Row", class = "factor"),
V2 = c(1, 2, 3, 4, 5, 6), V3 = structure(c(1L, 1L, 1L, 2L,
3L, 3L), .Label = c("User1", "User2", "User3"), class = "factor"),
V4 = structure(c(1L, 2L, 3L, 1L, 1L, 2L), .Label = c("-Click1",
"-Click2", "-Click3"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4"), class = "data.frame", row.names = c(NA, -6L
))
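If installing extra packages is not an option, the grouping step above can also be sketched in base R. This is only a minimal sketch under some assumptions: it rebuilds a small data frame with the same V3 (user) and V4 (click) columns, drops the leading "-" from the click labels for readability, and uses the illustrative names clicks_by_user and one_line, which are not part of the original answer. It produces one " - "-separated string per user, which is close to the format asked for in the question.

```r
# Small stand-in for the example data: users in V3, clicks in V4.
df <- data.frame(
  V3 = c("User1", "User1", "User1", "User2", "User3", "User3"),
  V4 = c("Click1", "Click2", "Click3", "Click1", "Click1", "Click2"),
  stringsAsFactors = FALSE
)

# Split the click column by user; each list element holds one user's
# clicks in their original row order.
clicks_by_user <- split(df$V4, df$V3)

# Paste each user's name and clicks into a single " - "-separated string.
one_line <- vapply(
  names(clicks_by_user),
  function(u) paste(c(u, clicks_by_user[[u]]), collapse = " - "),
  character(1)
)

print(unname(one_line))
```

The result is a character vector rather than a data frame, so it suits writing the lines back to a text file with writeLines(); for separate columns per click, the cSplit approach above is the better fit.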
Answer 2 (score: 0)
Given this data frame, use the reshape function:
user click
1 User1 -Click1
2 User1 -Click2
3 User1 -Click3
4 User2 -Click1
5 User3 -Click1
6 User3 -Click2
df$n <- df$click
reshape(df, idvar="user", timevar="click" ,direction="wide")
Output:
user n.-Click1 n.-Click2 n.-Click3
1 User1 -Click1 -Click2 -Click3
4 User2 -Click1 <NA> <NA>
5 User3 -Click1 -Click2 <NA>
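Note that the call above keys the wide columns by click label, so a real clickstream with many distinct click names would produce one column per name. A possible variation, sketched here under the assumption that the desired columns are "first click, second click, ..." per user, is to number each user's clicks by position first. The names pos and wide are illustrative and not from the original answer.

```r
# Same shape as the data frame shown above.
df <- data.frame(
  user  = c("User1", "User1", "User1", "User2", "User3", "User3"),
  click = c("-Click1", "-Click2", "-Click3", "-Click1", "-Click1", "-Click2"),
  stringsAsFactors = FALSE
)

# Number each user's clicks 1, 2, 3, ... in row order.
df$pos <- ave(seq_along(df$user), df$user, FUN = seq_along)

# One row per user, one column per click position (click.1, click.2, ...).
# Users with fewer clicks get NA in the remaining columns.
wide <- reshape(df, idvar = "user", timevar = "pos", direction = "wide")
print(wide)
```

With this layout the number of columns depends on the longest session rather than on the number of distinct click names.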