R: Select distinct rows and separate them into new ranked sets

Time: 2015-05-15 03:45:57

Tags: r rank

I have a dataset that looks like this:

Set         Interest    Age     Gender  Scored.Probabilities    rank
1           AL008       18-24   male    0.211                    1
1           AL008       35-44   female  0.102                    2
1           AL008       25-34   female  0.002                    3
2           AL024       13-17   male    0.102                    1
2           AL024       35-44   female  0.051                    2
2           AL024       55-64   male    0.025                    3
2           AL024       25-34   male    0.022                    4
2           AL024       35-44   male    0.016                    5
3           AL034       45-54   male    0.021                    1
4           AL035       35-44   female  0.027                    1
5           AL036       35-44   male    0.082                    1

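I want to select the rows that share the same value in the 'Interest' column and create a new dataset in which they are ranked by 'Scored.Probabilities'.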

2 Answers:

Answer 0 (score: 2)

You can try

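 library(data.table)
 # Rank rows within each Interest by descending Scored.Probabilities,
 # number the Interest groups as Set, then sort the result for display.
 setDT(df1)[order(-Scored.Probabilities), rank := seq_len(.N), Interest][
           order(Interest), Set := .GRP, Interest][order(Interest, rank)]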
 #     Interest   Age Gender Scored.Probabilities rank Set
 #1:    AL008 18-24   male                0.211    1   1
 #2:    AL008 35-44 female                0.102    2   1
 #3:    AL008 25-34 female                0.002    3   1
 #4:    AL024 13-17   male                0.102    1   2
 #5:    AL024 35-44 female                0.051    2   2
 #6:    AL024 55-64   male                0.025    3   2
 #7:    AL024 25-34   male                0.022    4   2
 #8:    AL024 35-44   male                0.016    5   2
 #9:    AL034 45-54   male                0.021    1   3
#10:    AL035 35-44 female                0.027    1   4
#11:    AL036 35-44   male                0.082    1   5
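
The chained call above does three things at once, which can be hard to follow. Below is a minimal sketch that unpacks it into separate steps. It assumes `df1` holds the question's rows without the `Set` and `rank` columns (the construction of `df1` here is an assumption, since the original input was not shown), and the names `dt` and `result` are only illustrative.

```r
library(data.table)

# Assumed input: the question's rows, without Set and rank
df1 <- data.frame(
  Interest = c("AL008", "AL024", "AL008", "AL008", "AL024", "AL035",
               "AL024", "AL024", "AL024", "AL034", "AL036"),
  Age = c("18-24", "25-34", "35-44", "25-34", "13-17", "35-44",
          "35-44", "55-64", "35-44", "45-54", "35-44"),
  Gender = c("male", "male", "female", "female", "male", "female",
             "female", "male", "male", "male", "male"),
  Scored.Probabilities = c(0.211, 0.022, 0.102, 0.002, 0.102, 0.027,
                           0.051, 0.025, 0.016, 0.021, 0.082),
  stringsAsFactors = FALSE
)

dt <- as.data.table(df1)  # works on a copy, so df1 itself is not modified

# Step 1: rank within each Interest, highest probability first
dt[order(-Scored.Probabilities), rank := seq_len(.N), by = Interest]

# Step 2: number the Interest groups in sorted order
dt[order(Interest), Set := .GRP, by = Interest]

# Step 3: sort by group and rank for display
result <- dt[order(Interest, rank)]
print(result)
```

Unlike `setDT(df1)`, which converts `df1` in place, `as.data.table()` makes a copy; which one is preferable depends on whether the original data frame is still needed.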

Answer 1 (score: 0)

Try using dplyr

library("dplyr")
df <- read.table(text = "Interest    Age     Gender  Scored.Probabilities
AL008       18-24   male    0.211
AL024       25-34   male    0.022
AL008       35-44   female  0.102
AL008       25-34   female  0.002
AL024       13-17   male    0.102
AL035       35-44   female  0.027
AL024       35-44   female  0.051
AL024       55-64   male    0.025
AL024       35-44   male    0.016
AL034       45-54   male    0.021
AL036       35-44   male    0.082", header = TRUE)

df %>%
  arrange(Interest , desc(Scored.Probabilities)) %>%
  group_by(Interest) %>%
  mutate(rank = row_number())
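
If the goal is literally one separate dataset per 'Interest' value, as the title suggests, the same ranking can also be sketched in base R and followed by `split()`. This is only an illustration, not part of the answer above: it uses a few rows copied from the question, and the name `ranked_sets` is hypothetical.

```r
# A few rows copied from the question's data
df <- data.frame(
  Interest = c("AL008", "AL024", "AL008", "AL008", "AL024", "AL035"),
  Scored.Probabilities = c(0.211, 0.022, 0.102, 0.002, 0.102, 0.027),
  stringsAsFactors = FALSE
)

# Rank within each Interest, highest probability first;
# ties.method = "first" breaks ties by original row order
df$rank <- ave(-df$Scored.Probabilities, df$Interest,
               FUN = function(x) rank(x, ties.method = "first"))

# Sort by group and rank, then split into one data frame per Interest
df <- df[order(df$Interest, df$rank), ]
ranked_sets <- split(df, df$Interest)

print(ranked_sets$AL008)
```

`ranked_sets` is a named list, so each group can be pulled out by its 'Interest' value, for example `ranked_sets[["AL024"]]`.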