Sorting by term in R

Time: 2016-09-07 18:39:17

Tags: r sorting

I have a dataset like this:

term                    occ      value
Less Than 1 year        Yale     1
Less Than 1 year        MIT      3
1 Year                  Yale     2
2 Years                 Yale     3
2 Years                 Yale     8
2 years                 CMU      2
3 Years                 Yale     5
3 years                 NYU      2
Greater than 3 Years    NYU      5
Greater Than 3 Years    CALTEC   4
No Fixed Term           Yale     2
Other                   Bu       9

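I want a table showing the number of records for each term. The table should be ordered by term.

Note the differences in case: "years" vs. "Years" and "than" vs. "Than".

The output should look like this: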

term                count
Less Than 1 year      2
1 Year                1
2 Years               3
3 Years               2
Greater than 3 Years  2
No Fixed Term         1
Other                 1

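2 answers:

Answer 0: (score: 2)

If you need a particular order, you have to specify the order of the levels in a factor. You also need to compare the values without regard to case. This should work: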

# reproducible data
dd <- read.table(text="term,occ,value
Less Than 1 year,Yale,1
Less Than 1 year,MIT,3
1 Year,Yale,2
2 Years,Yale,3
2 Years,Yale,8
2 years,CMU,2
3 Years,Yale,5
3 years,NYU,2
Greater than 3 Years,NYU,5
Greater Than 3 Years,CALTEC,4
No Fixed Term,Yale,2
Other,Bu,9", header=TRUE, sep=",")

# specify custom order

termorder<-c("Less Than 1 year","1 Year","2 Years","3 Years",
    "Greater than 3 Years","No Fixed Term","Other")

#tabulate
tt <- table(factor(tolower(dd$term), levels=tolower(termorder), labels=termorder))

This returns a table of counts (essentially a named vector). If you want a data.frame, you can do

as.data.frame(tt)
#                 Var1 Freq
# 1     Less Than 1 year    2
# 2               1 Year    1
# 3              2 Years    3
# 4              3 Years    2
# 5 Greater than 3 Years    2
# 6        No Fixed Term    1
# 7                Other    1
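
The question asked for columns named term and count, while as.data.frame() on a table gives Var1 and Freq. A minimal sketch of one way to rename them, assuming the same data and level order as above (the name res is only illustrative):

```r
# Rebuild the term column from the question (only this column is needed)
dd <- data.frame(
  term = c("Less Than 1 year", "Less Than 1 year", "1 Year", "2 Years",
           "2 Years", "2 years", "3 Years", "3 years",
           "Greater than 3 Years", "Greater Than 3 Years",
           "No Fixed Term", "Other"),
  stringsAsFactors = FALSE
)

# Custom display order, as in the answer above
termorder <- c("Less Than 1 year", "1 Year", "2 Years", "3 Years",
               "Greater than 3 Years", "No Fixed Term", "Other")

# Case-insensitive tabulation in the custom order
tt <- table(factor(tolower(dd$term), levels = tolower(termorder), labels = termorder))

# Replace the default Var1/Freq column names with the requested ones
res <- setNames(as.data.frame(tt), c("term", "count"))
res
```

Here res$term is still a factor whose levels follow termorder, so later sorting or plotting should keep the custom order.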

Answer 1: (score: 1)

We can use table after converting 'term' to all lower (or upper) case

as.data.frame(table(tolower(df1$term)))

If we need a custom order, then we need to convert it to a factor and specify the levels before doing the table
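
A rough sketch of that factor-then-table step, assuming df1 holds the question's data and that the levels are written in lower case to match the tolower() output (the names lvls, f and out are only illustrative):

```r
# df1 is assumed to be the question's data; only the term column is needed here
df1 <- data.frame(
  term = c("Less Than 1 year", "Less Than 1 year", "1 Year", "2 Years",
           "2 Years", "2 years", "3 Years", "3 years",
           "Greater than 3 Years", "Greater Than 3 Years",
           "No Fixed Term", "Other"),
  stringsAsFactors = FALSE
)

# Custom order, lower-cased so it matches the converted values
lvls <- c("less than 1 year", "1 year", "2 years", "3 years",
          "greater than 3 years", "no fixed term", "other")

# Convert to a factor with explicit levels, then tabulate
f <- factor(tolower(df1$term), levels = lvls)
out <- as.data.frame(table(f))
out
```

Values that do not appear in lvls become NA and are silently dropped by table(), so it may be worth checking anyNA(f) first.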

Or we can also replace the words with sub, instead of using tolower
v1 <- sub("Than", "than", sub("years", "Years", df1$term))
as.data.frame(table(factor(v1, levels = unique(v1))))
#                  Var1 Freq
#1     Less than 1 year    2
#2               1 Year    1
#3              2 Years    3
#4              3 Years    2
#5 Greater than 3 Years    2
#6        No Fixed Term    1
#7                Other    1