Order a list depending on a vector

Time: 2016-02-10 18:11:21

Tags: r

To create some plots, I have summarised my data with the approach below, which includes all the information I need.

# Load Data
RawDataSet <- read.csv("http://pastebin.com/raw/VP6cF31A", sep=";")
# Load packages
library(plyr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(reshape2)

# summarising the data
new.df <- RawDataSet %>% 
  group_by(UserEmail,location,context) %>% 
  tally() %>%
  mutate(n2 = n * c(1,-1)[(location=="NOT_WITHIN")+1L]) %>%
  group_by(UserEmail,location) %>%
  mutate(p = c(1,-1)[(location=="NOT_WITHIN")+1L] * n/sum(n))

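Through some other analyses I have identified different user groups. Since I want to plot my data, it would be great to have a plot that visualises it in the correct order. The order is based on UserEmail and is defined as follows: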

order <- c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6")

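Asking for the type of new.df with typeof(new.df) reports that it is a list. I have tried a few things, such as order_by or with_order, but so far I have not managed to sort new.df according to my order vector. Of course, the ordering could also be done in the summarising step. Is there a way to do this?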

1 answer:

Answer 0 (score: 2)

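I could not bring myself to create a vector named order, out of respect for the R function of that name. Use match to construct an index as the basis for ordering with order (the function):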

sorted.df <- new.df[order(match(
  new.df$UserEmail,
  as.integer(c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6"))
)), ]
head(sorted.df)
#---------------
Source: local data frame [6 x 6]
Groups: UserEmail, location [4]

  UserEmail   location   context     n    n2          p
      (int)     (fctr)    (fctr) (int) (dbl)      (dbl)
1        28 NOT_WITHIN Clicked A    16   -16 -0.8421053
2        28 NOT_WITHIN Clicked B     3    -3 -0.1578947
3        28     WITHIN Clicked A     2     2  1.0000000
4        27 NOT_WITHIN Clicked A     4    -4 -0.8000000
5        27 NOT_WITHIN Clicked B     1    -1 -0.2000000
6        27     WITHIN Clicked A     1     1  1.0000000
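
To make the match/order idiom easier to follow, here is a minimal sketch on a toy data frame. The names toy and user_order and all of the values are made up for illustration and are not taken from the pastebin data.

```r
# Toy data frame standing in for new.df (values are invented for illustration)
toy <- data.frame(
  UserEmail = c(1L, 1L, 2L, 3L, 3L),
  n         = c(5L, 2L, 7L, 1L, 4L)
)

# Hypothetical desired ordering of the UserEmail values
user_order <- c(3L, 1L, 2L)

# match() returns each row's position within user_order;
# order() turns those positions into a row permutation
idx <- order(match(toy$UserEmail, user_order))

# Index the rows with that permutation; rows sharing a UserEmail
# keep their original relative order
sorted_toy <- toy[idx, ]
print(sorted_toy)
```

Any UserEmail value missing from the ordering vector would get NA from match() and, with the default na.last = TRUE of order(), would end up at the bottom.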

(I did not load plyr or reshape2, since at least one of those packages has a habit of interacting badly with dplyr functions.)
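
Since the goal in the question is a plot in a given order, a related option (not part of the answer above, just a sketch under the same made-up toy data) is to turn the column into a factor whose levels follow the desired order. Sorting then follows the levels, and ggplot2 generally lays out a discrete axis in level order too.

```r
# Toy data and ordering vector, invented for illustration
toy <- data.frame(
  UserEmail = c(1L, 1L, 2L, 3L, 3L),
  n         = c(5L, 2L, 7L, 1L, 4L)
)
user_order <- c(3L, 1L, 2L)

# Convert to a factor whose levels are in the desired order
toy$UserEmail <- factor(toy$UserEmail, levels = user_order)

# order() on a factor sorts by the underlying level codes,
# so the rows come out in the order given by user_order
sorted_toy <- toy[order(toy$UserEmail), ]
print(levels(sorted_toy$UserEmail))
print(sorted_toy)
```

One trade-off is that UserEmail is no longer an integer column afterwards, so this suits the plotting step better than any further numeric work on that column.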