Reshape - How to restructure wide data by two columns

Time: 2014-08-20 18:28:13

Tags: r

I am trying to reshape a dataset whose rows are identified by two variables (Date and State).

Here is a link to the data: https://www.dropbox.com/s/e1nzv76kl6nvrnv/State%20Recast.csv?dl=0

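The value in each cell is a number of clicks. After the Date and State columns, the columns hold total clicks, desktop clicks, mobile clicks and tablet clicks for each of paid, organic, direct and referral web traffic (so 16 columns).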

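I would like the data structured so that each date (e.g. Jan-12) and each state (Alabama, Alaska, etc.) has rows for Total, Desktop, Mobile and Tablet within each of the categories Paid, Organic, Direct and Referral (16 rows per date and state).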
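To make the layout concrete, here is a minimal sketch with made-up numbers. It builds a two-row stand-in for the wide data using the same column naming and then lists the combinations the long form should contain. The names `sources`, `media`, `cols` and `target` are illustrative assumptions, not part of the original data.

```r
# Column names follow the pattern <source>...<medium>, as in the real file.
sources <- c("Paid.Clicks", "Organic.Clicks", "Direct.Traffic", "Referral.Traffic")
media   <- c("Total", "Desktop", "Mobile", "Tablet")
cols    <- as.vector(t(outer(sources, media, paste, sep = "...")))

# Toy wide data: one row per Date/State pair, 16 click columns (values are made up).
clicks <- as.data.frame(matrix(seq_len(2 * length(cols)), nrow = 2))
names(clicks) <- cols
state <- cbind(
  data.frame(Date = c("Jan-12", "Jan-12"),
             State = c("Alabama", "Alaska"),
             stringsAsFactors = FALSE),
  clicks
)

# Skeleton of the desired long form: every source/medium combination
# for each Date/State pair, i.e. 16 rows per pair.
target <- expand.grid(
  Medium = media,
  Source = c("Paid", "Organic", "Direct", "Referral"),
  State  = state$State,
  Date   = unique(state$Date),
  stringsAsFactors = FALSE
)
```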

Here is my rough attempt, but I have not gotten far because I am not sure how to specify two timevar values:

l <- reshape(state, 
         varying = c("Paid.Clicks...Total","Paid.Clicks...Desktop", "Paid.Clicks...Mobile","Paid.Clicks...Tablet", "Organic.Clicks...Total" , "Organic.Clicks...Desktop" , "Organic.Clicks...Mobile"  ,  "Organic.Clicks...Tablet", "Direct.Traffic...Total"  ,   "Direct.Traffic...Desktop"  ,"Direct.Traffic...Mobile"  ,  "Direct.Traffic...Tablet","Referral.Traffic...Total"  , "Referral.Traffic...Desktop","Referral.Traffic...Mobile"  ,"Referral.Traffic...Tablet" ), 
         v.names = "Clicks",
         timevar = c("Date", "State"),
         times = c("Paid.Clicks...Total","Paid.Clicks...Desktop", "Paid.Clicks...Mobile","Paid.Clicks...Tablet", "Organic.Clicks...Total" , "Organic.Clicks...Desktop" , "Organic.Clicks...Mobile"  ,  "Organic.Clicks...Tablet", "Direct.Traffic...Total"  ,   "Direct.Traffic...Desktop"  ,"Direct.Traffic...Mobile"  ,  "Direct.Traffic...Tablet","Referral.Traffic...Total"  , "Referral.Traffic...Desktop","Referral.Traffic...Mobile"  ,"Referral.Traffic...Tablet"  ), 
         direction = "long")

Thanks in advance, everyone!

2 answers:

Answer 0: (score: 0)

This seems to work:

state$id <- with(state, interaction(Date, State))
v <- c("Paid.Clicks...Total","Paid.Clicks...Desktop", "Paid.Clicks...Mobile","Paid.Clicks...Tablet", "Organic.Clicks...Total" , "Organic.Clicks...Desktop" , "Organic.Clicks...Mobile"  ,  "Organic.Clicks...Tablet", "Direct.Traffic...Total"  ,   "Direct.Traffic...Desktop"  ,"Direct.Traffic...Mobile"  ,  "Direct.Traffic...Tablet","Referral.Traffic...Total"  , "Referral.Traffic...Desktop","Referral.Traffic...Mobile"  ,"Referral.Traffic...Tablet" )
l <- reshape(state, 
             varying = v,
             v.names = "Clicks",
             direction = "long")
l$time <- v[l$time]

Result:

> head(l, 10)
                                Date                State                          id                time Clicks
Jan-12.Alabama.1              Jan-12              Alabama              Jan-12.Alabama Paid.Clicks...Total    421
Jan-12.Alaska.1               Jan-12               Alaska               Jan-12.Alaska Paid.Clicks...Total     61
Jan-12.Arizona.1              Jan-12              Arizona              Jan-12.Arizona Paid.Clicks...Total    768
Jan-12.Arkansas.1             Jan-12             Arkansas             Jan-12.Arkansas Paid.Clicks...Total    205
Jan-12.California.1           Jan-12           California           Jan-12.California Paid.Clicks...Total   2471
Jan-12.Colorado.1             Jan-12             Colorado             Jan-12.Colorado Paid.Clicks...Total    463
Jan-12.Connecticut.1          Jan-12          Connecticut          Jan-12.Connecticut Paid.Clicks...Total    647
Jan-12.Delaware.1             Jan-12             Delaware             Jan-12.Delaware Paid.Clicks...Total    114
Jan-12.District of Columbia.1 Jan-12 District of Columbia Jan-12.District of Columbia Paid.Clicks...Total    202
Jan-12.Florida.1              Jan-12              Florida              Jan-12.Florida Paid.Clicks...Total   3839
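In the result above, the `time` column still holds the full original column name, so source and medium are not yet separate variables. The sketch below is one possible follow-up, not part of the answer itself: it passes both identifier columns to `idvar` (which accepts a vector, whereas `timevar` only names the new column), then splits the column name on the literal `...`. The toy data and the column name `Category` are assumptions made for illustration.

```r
# Toy stand-in for the real data: 2 rows, 16 click columns (values are made up).
sources <- c("Paid.Clicks", "Organic.Clicks", "Direct.Traffic", "Referral.Traffic")
media   <- c("Total", "Desktop", "Mobile", "Tablet")
cols    <- as.vector(t(outer(sources, media, paste, sep = "...")))

clicks <- as.data.frame(matrix(seq_len(2 * length(cols)), nrow = 2))
names(clicks) <- cols
state <- cbind(
  data.frame(Date = c("Jan-12", "Jan-12"),
             State = c("Alabama", "Alaska"),
             stringsAsFactors = FALSE),
  clicks
)

# idvar takes both identifier columns; timevar is just the name of the
# new column that records which wide column each value came from.
long <- reshape(state,
                varying   = cols,
                v.names   = "Clicks",
                timevar   = "Category",
                times     = cols,
                idvar     = c("Date", "State"),
                direction = "long")
rownames(long) <- NULL

# Split "Paid.Clicks...Total" into a source part and a medium part.
parts <- strsplit(as.character(long$Category), "...", fixed = TRUE)
long$Source <- sub("\\..*$", "", vapply(parts, `[`, character(1), 1))  # "Paid.Clicks" -> "Paid"
long$Medium <- vapply(parts, `[`, character(1), 2)
long$Category <- NULL
```

With the real data, `state` would be the data frame read from the CSV and the toy construction at the top would be dropped.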

Answer 1: (score: 0)

I ended up getting what I wanted with the following:

Total <- state[, c("Date", "State", "Paid.Clicks...Total", "Organic.Clicks...Total", "Direct.Traffic...Total", "Referral.Traffic...Total") ]
Desktop <- state[, c("Date", "State", "Paid.Clicks...Desktop", "Organic.Clicks...Desktop", "Direct.Traffic...Desktop", "Referral.Traffic...Desktop") ]
Mobile <- state[, c("Date", "State", "Paid.Clicks...Mobile", "Organic.Clicks...Mobile", "Direct.Traffic...Mobile", "Referral.Traffic...Mobile") ]
Tablet <- state[, c("Date", "State", "Paid.Clicks...Tablet", "Organic.Clicks...Tablet", "Direct.Traffic...Tablet", "Referral.Traffic...Tablet") ]

library(reshape2)  # provides melt()

# Melt each subset: Date and State stay as identifier columns,
# the four traffic columns are stacked into Source/Clicks.
Total.r <- melt(Total, id.vars = c("Date", "State"), variable.name = "Source", value.name = "Clicks")
Total.r$Source <- factor(Total.r$Source, labels = c("Paid", "Organic", "Direct", "Referral"))
Total.r$Medium <- "Total"

Desktop.r <- melt(Desktop, id.vars = c("Date", "State"), variable.name = "Source", value.name = "Clicks")
Desktop.r$Source <- factor(Desktop.r$Source, labels = c("Paid", "Organic", "Direct", "Referral"))
Desktop.r$Medium <- "Desktop"

Mobile.r <- melt(Mobile, id.vars = c("Date", "State"), variable.name = "Source", value.name = "Clicks")
Mobile.r$Source <- factor(Mobile.r$Source, labels = c("Paid", "Organic", "Direct", "Referral"))
Mobile.r$Medium <- "Mobile"

Tablet.r <- melt(Tablet, id.vars = c("Date", "State"), variable.name = "Source", value.name = "Clicks")
Tablet.r$Source <- factor(Tablet.r$Source, labels = c("Paid", "Organic", "Direct", "Referral"))
Tablet.r$Medium <- "Tablet"

# Stack the four long tables into one


recast <- rbind(Total.r, Desktop.r, Mobile.r, Tablet.r)

Not the most concise solution, but for anyone wondering how I ended up structuring the data, there you go.
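The four near-identical blocks above could probably be folded into one helper applied once per medium. The following is a minimal base R sketch of that idea, assuming the same `<source>...<medium>` column naming; the toy data and the helper name `stack_medium` are hypothetical and only there to keep the example self-contained.

```r
# Toy stand-in for the real data: 2 rows, 16 click columns (values are made up).
sources <- c("Paid.Clicks", "Organic.Clicks", "Direct.Traffic", "Referral.Traffic")
labels  <- c("Paid", "Organic", "Direct", "Referral")
media   <- c("Total", "Desktop", "Mobile", "Tablet")
cols    <- as.vector(t(outer(sources, media, paste, sep = "...")))

clicks <- as.data.frame(matrix(seq_len(2 * length(cols)), nrow = 2))
names(clicks) <- cols
state <- cbind(
  data.frame(Date = c("Jan-12", "Jan-12"),
             State = c("Alabama", "Alaska"),
             stringsAsFactors = FALSE),
  clicks
)

# Hypothetical helper: stack the four source columns of one medium.
stack_medium <- function(medium) {
  medium_cols <- paste(sources, medium, sep = "...")
  data.frame(
    Date   = rep(state$Date,  times = length(medium_cols)),
    State  = rep(state$State, times = length(medium_cols)),
    Source = rep(labels, each = nrow(state)),
    # unlist() concatenates the columns one after another, which matches
    # the each = nrow(state) repetition used for Source above.
    Clicks = unlist(state[medium_cols], use.names = FALSE),
    Medium = medium,
    stringsAsFactors = FALSE
  )
}

recast <- do.call(rbind, lapply(media, stack_medium))
```

Each Date/State pair ends up with 16 rows in `recast`, one per source and medium combination.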