Question

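My table is full of characters and numbers, and I only want to keep the top 3 frequencies together with their corresponding variables. Based on the table below, I would like a result that contains only AZ 520, then AE 488, then AU 399.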

   Var1 Freq
1    AE  488
2    AR   12
3    AU  399
4    AW   56
5    AZ  520
6    BA    2
7    BB   84
8    BG  246
9    BH   85
10   BI    6




as.data.frame(table(training.data.raw$destinationcountry))

Answer 1

Recreate the data as follows, assuming the columns are named name and value:

library(dplyr)  # provides tibble() and top_n(); data_frame() is deprecated

training.data.raw <- tibble(name  = c("IN", "IS", "IT", "JO", "JP", "KZ", "MA", "MZ", "NG", "NO", "NZ", "PE", "PH", "PR", "RO", "RU", "SA", "SE", "SY", "TM", "TN", "TR", "UK", "US", "WS"),
                            value = c(999, 1, 1885, 1098, 2, 584, 858, 11, 10, 522, 193, 29, 2, 1, 1603, 353, 6, 2, 4, 33, 228, 3201, 852, 1363, 1))

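You can easily get the result you want with the top_n function from the dplyr package (see the help file ?top_n for details):

library(dplyr)
# With no wt argument, top_n() ranks by the last column (here: value)
top_3 <- top_n(x = training.data.raw, n = 3)
top_3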
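
In dplyr 1.0.0 and later, top_n() is documented as superseded by slice_max(). The following is a minimal sketch of the same idea, assuming the Var1/Freq column names shown in the question's table; the object name freq_table is made up for illustration.

```r
library(dplyr)

# Frequency table as shown in the question
# (freq_table is a hypothetical name; Var1/Freq are the columns from the question)
freq_table <- data.frame(
  Var1 = c("AE", "AR", "AU", "AW", "AZ", "BA", "BB", "BG", "BH", "BI"),
  Freq = c(488, 12, 399, 56, 520, 2, 84, 246, 85, 6)
)

# Keep the 3 rows with the largest Freq, largest first
top_3 <- slice_max(freq_table, order_by = Freq, n = 3)
top_3
```

Note that slice_max() keeps ties by default (with_ties = TRUE), so it can return more than 3 rows if several rows share the third-largest value.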

Edit based on comments: if the name column is a factor rather than a regular character vector, you can first convert it to character with mutate:

training.data.characters <- mutate(training.data.raw, name = as.character(name))

# Now top_n() will take it
# You can also explicitly pass the wt argument to tell it to rank by value
top_3 <- top_n(x = training.data.characters, n = 3, wt = value)
top_3
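
If you would rather not depend on dplyr, the same result can be sketched in base R by sorting on the frequency column and taking the first three rows. This assumes the Var1/Freq columns from the question, with Var1 stored as a factor as as.data.frame(table(...)) would produce; freq_table is a hypothetical name.

```r
# Frequency table as shown in the question, with Var1 as a factor
freq_table <- data.frame(
  Var1 = factor(c("AE", "AR", "AU", "AW", "AZ", "BA", "BB", "BG", "BH", "BI")),
  Freq = c(488, 12, 399, 56, 520, 2, 84, 246, 85, 6)
)

# order() returns row indices sorted by Freq, largest first
sorted <- freq_table[order(freq_table$Freq, decreasing = TRUE), ]

# Keep only the first three rows of the sorted table
top_3 <- head(sorted, 3)
top_3
#   Var1 Freq
# 5   AZ  520
# 1   AE  488
# 3   AU  399
```

Because the ordering only looks at Freq, no factor-to-character conversion is needed here.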
