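r - Calculate the frequency of one category based on a selected variable

Time: 2016-04-15 17:04:48

Tags: r frequency percentage

I have a data frame named stats with two columns, Gender and Transportation.used, as shown below: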

Gender    Transportation.used
Male      Bus
Male      Car
Female    Car
Male      Car
Male      Motorcycle
Female    Bus

and the list goes on (or see here: http://i.stack.imgur.com/GROIi.jpg).

data_stats <- read.table(text="Gender   Transportation.used
Male    Bus
Male    Car
Female  Car
Male    Car
Male    Motorcycle
Female  Bus
Female  Bus
Female  Bus
Female  Bus
Male    Car
Female  Car
",header=TRUE)

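What I want to do is count the frequency of each gender for a selected mode of transportation. Later I will need this data to draw a percentage bar chart. The expected output looks like this: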

          Male    Female
   Bus    1        4

So how do I compute this? I am still a beginner with R, so please help me. Thanks in advance!

2 Answers:

Answer 0: (score: 2)

For frequencies, try

table(stats)

or, for relative frequencies,

prop.table(table(stats))

or, even better (for example),

xtabs(~ Gender + Transportation.used, data = stats)

I have added a few examples:

dt  <- data.frame(gender = rep(c("Male", "Female"), c(4, 2) ), trans = rep(c("Car", "Bus", "Bike"), c(3, 2, 1) ))

table(dt)
        trans
gender   Bike Bus Car
Female    1   1   0
Male      0   1   3

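In any case, with the data as written in your question, we are dealing with factors. If you want more options for building tables, you will need to do some class conversions.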
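As a rough sketch of how these calls apply to the data in the question: the block below rebuilds the question's read.table() snippet, with stringsAsFactors = TRUE added as an assumption so that the columns are factors under R >= 4.0. The names counts and row_share are only illustrative.

```r
# Data copied from the question (called data_stats there)
data_stats <- read.table(text = "Gender Transportation.used
Male    Bus
Male    Car
Female  Car
Male    Car
Male    Motorcycle
Female  Bus
Female  Bus
Female  Bus
Female  Bus
Male    Car
Female  Car
", header = TRUE, stringsAsFactors = TRUE)

# Counts: transport modes as rows, genders as columns
counts <- xtabs(~ Transportation.used + Gender, data = data_stats)
print(counts)

# Share of each gender within a transport mode (margin = 1: each row sums to 1)
row_share <- prop.table(counts, margin = 1)
print(round(row_share, 2))
```

Without the margin argument, prop.table() divides by the grand total instead, which is usually not what a per-category percentage chart needs.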

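Edit

Here is the answer to the question you posted in the comments. By adjusting the condition on dt$colname you get finer control over the final output.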

table(dt$gender[dt$trans=="Car"])

Female   Male 
     0      3 
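Note that the zero for Female only shows up when gender is a factor. Since R 4.0.0, data.frame() keeps strings as character by default, so a small sketch of the difference (passing stringsAsFactors = TRUE explicitly is my assumption about how to reproduce the output above on a current R):

```r
dt <- data.frame(
  gender = rep(c("Male", "Female"), c(4, 2)),
  trans  = rep(c("Car", "Bus", "Bike"), c(3, 2, 1)),
  stringsAsFactors = TRUE  # keep both columns as factors
)

# Factor: unused levels survive subsetting, so Female is reported with count 0
car_by_gender <- table(dt$gender[dt$trans == "Car"])
print(car_by_gender)

# Character: only the values actually present are counted
car_chr <- table(as.character(dt$gender)[dt$trans == "Car"])
print(car_chr)
```

Keeping the factor is usually the better choice when the result feeds a plot, because every category then gets a bar even when its count is zero.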

Answer 1: (score: 0)

You can use table.

First we recreate your data.frame. Note that it is best to provide a reproducible example.

df <- read.table(text="
Gender    Transportation.used
Male      Bus
Male      Car
Female    Car
Male      Car
Male      Motorcycle
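Female    Bus", header=TRUE, stringsAsFactors=TRUE) # factors are needed for levels() further down; R >= 4.0 reads strings as character by default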

Then you can use table:

table(df$Transportation.used, df$Gender) # here we type `df` twice
with(df, table(Transportation.used, Gender)) # `with` avoids that

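In this particular case, with only two columns, table(df) also works and produces the desired output (although transposed).

If you really want Male to be the first column of the table, you can change the order of the levels of the factor Gender (alphabetical by default):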

levels(df$Gender) # Female comes (alphabetically) before Male
df$Gender <- factor(df$Gender, levels=rev(levels(df$Gender))) # we rearrange Gender levels order

Now with(df, table(Transportation.used, Gender)) gives the output you want.

                   Gender
Transportation.used Male Female
         Bus           1      1
         Car           2      1
         Motorcycle    1      0

From this you can get the most basic plot (but see ?barplot):

tab <- with(df, table(Transportation.used, Gender))
barplot(tab)
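Since the question mentions a percentage bar chart, here is one possible sketch using the same six rows. The variable name pct and the axis labels are my own choices, not something from the question.

```r
# The six example rows used in this answer
df <- data.frame(
  Gender = c("Male", "Male", "Female", "Male", "Male", "Female"),
  Transportation.used = c("Bus", "Car", "Car", "Car", "Motorcycle", "Bus")
)

tab <- with(df, table(Transportation.used, Gender))

# margin = 1: percentages are computed within each transport mode (row)
pct <- 100 * prop.table(tab, margin = 1)
print(round(pct, 1))

# barplot() draws one group of bars per column, so transpose the table
# to get one group per transport mode with one bar per gender
barplot(t(pct), beside = TRUE, legend.text = TRUE,
        xlab = "Transportation used", ylab = "Percent")
```

Dropping beside = TRUE gives stacked bars instead, where each bar reaches 100.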

(Edit)

Then, if you want a table for a single transport mode, you can do:

with(df, table(Transportation.used, Gender))["Bus",, drop=FALSE ]
                    Gender
Transportation.used Female Male
                Bus      1    1
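Putting the pieces together for the full data in the question, a sketch that should reproduce the requested one-row layout with Male first. The data frame is typed in by hand from the question's eleven rows, and the names bus and bus_pct are only illustrative.

```r
# The eleven rows from the question; levels are set so that Male comes first
stats <- data.frame(
  Gender = factor(
    c("Male", "Male", "Female", "Male", "Male", "Female",
      "Female", "Female", "Female", "Male", "Female"),
    levels = c("Male", "Female")
  ),
  Transportation.used = c("Bus", "Car", "Car", "Car", "Motorcycle", "Bus",
                          "Bus", "Bus", "Bus", "Car", "Car")
)

tab <- with(stats, table(Transportation.used, Gender))

# drop = FALSE keeps the result two-dimensional (1 row x 2 columns)
bus <- tab["Bus", , drop = FALSE]
print(bus)        # Male 1, Female 4

# Percentages within the selected transport mode
bus_pct <- 100 * prop.table(bus)
print(bus_pct)    # Male 20, Female 80
```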