来自Reshape2的R - dcast - 按值变量排序而不是变量名

时间:2017-08-23 08:16:45

标签: r reshape2 dcast

我有以下data.frame名为countries_tools。它由3列(日期时间列(过去13个月),名称列(包含国家/地区)和访问列(访问这些特定国家/地区的人)组成):

datetime            name           Visits
2016-07-01 00:00:00 China          5237
2016-07-01 00:00:00 Germany        1434
2016-07-01 00:00:00 United States  1530
2016-07-01 00:00:00 India          696
2016-07-01 00:00:00 Japan          569
...
2017-07-01 00:00:00 China          4484
2017-07-01 00:00:00 Germany        1593
2017-07-01 00:00:00 United States  1438
2017-07-01 00:00:00 India          1204
2017-07-01 00:00:00 Japan          538

请注意我在中间删除了其他11个月。另请注意,该名称始终是相同5个国家/地区的列表,这些国家/地区对应于上个分析月份(2017年7月)中访问次数较多的五个国家/地区。

在此消息的末尾,我有一个dput数据。

为了更好地了解几个月内访问的数据和发展情况,我从我的data.frame中做了dcast

countries_tools <- dcast(countries_tools, datetime ~ name, value.var="Visits")

但是,结果数据框按国家/地区名称(按字母顺序)对列进行排序:

> names(countries_tools)
[1] "datetime"      "China"         "Germany"       "India"         "Japan"         "United States"

Order is alphabetical, not by variable value

但是,我希望订单由价值变量(访问)完成,因此最佳订单应为:

  

datetime,中国,德国,美国,印度,日本

是否可以完成(最好不需要额外的步骤)?使用其他功能也是可能的。

数据

dput(countries_tools)
structure(list(datetime = structure(c(1467320400, 1467320400, 
1467320400, 1467320400, 1467320400, 1469998800, 1469998800, 1469998800, 
1469998800, 1469998800, 1472677200, 1472677200, 1472677200, 1472677200, 
1472677200, 1475269200, 1475269200, 1475269200, 1475269200, 1475269200, 
1477951200, 1477951200, 1477951200, 1477951200, 1477951200, 1480543200, 
1480543200, 1480543200, 1480543200, 1480543200, 1483221600, 1483221600, 
1483221600, 1483221600, 1483221600, 1485900000, 1485900000, 1485900000, 
1485900000, 1485900000, 1488319200, 1488319200, 1488319200, 1488319200, 
1488319200, 1490994000, 1490994000, 1490994000, 1490994000, 1490994000, 
1493586000, 1493586000, 1493586000, 1493586000, 1493586000, 1496264400, 
1496264400, 1496264400, 1496264400, 1496264400, 1498856400, 1498856400, 
1498856400, 1498856400, 1498856400), class = c("POSIXct", "POSIXt"
), tzone = "Europe/Moscow"), name = c("China", "Germany", "United States", 
"India", "Japan", "China", "Germany", "United States", "India", 
"Japan", "China", "Germany", "United States", "India", "Japan", 
"China", "Germany", "United States", "India", "Japan", "China", 
"Germany", "United States", "India", "Japan", "China", "Germany", 
"United States", "India", "Japan", "China", "Germany", "United States", 
"India", "Japan", "China", "Germany", "United States", "India", 
"Japan", "China", "Germany", "United States", "India", "Japan", 
"China", "Germany", "United States", "India", "Japan", "China", 
"Germany", "United States", "India", "Japan", "China", "Germany", 
"United States", "India", "Japan", "China", "Germany", "United States", 
"India", "Japan"), Visits = c(5237, 1434, 1530, 696, 569, 4422, 
1508, 1971, 672, 461, 3993, 1521, 1901, 2027, 517, 3656, 1764, 
1716, 993, 509, 5483, 3117, 2762, 1298, 594, 5548, 2804, 2365, 
1222, 551, 3747, 3083, 1917, 999, 496, 3903, 2136, 1751, 1229, 
611, 5638, 2721, 2074, 1569, 533, 4326, 1618, 1511, 1254, 458, 
4364, 2021, 1690, 1162, 462, 4462, 1572, 1517, 1068, 574, 4484, 
1593, 1438, 1204, 538)), .Names = c("datetime", "name", "Visits"
), row.names = c(NA, -65L), class = "data.frame")

1 个答案:

答案 0 :(得分:1)

您可以将“名称”转换为有序因子,说明您想要的级别顺序:

countries_tools$name <- ordered(countries_tools$name, levels = unique(countries_tools$name))

现在可行:

dcast(countries_tools, datetime ~ name, value.var="Visits")