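The csv below is from a longer data table; call it temp. I would like to convert it to temp.wide, with the values of region_code as the columns, in the vertical order in which region_code appears (SAS, SSA, EUR, ...).
         scenario region_code region_name value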
1: 2010 SAS South Asia 61.17716
2: 2010 SSA Africa south of the Sahara 62.08588
3: 2010 EUR Europe 63.76123
4: 2010 LAC Latin America and Caribbean 68.84806
5: 2010 FSU Former Soviet Union 59.04499
6: 2010 EAP East Asia and Pacific 64.00579
7: 2010 NAM North America 66.18235
8: 2010 MEN Middle East and North Africa 58.03167
9: SSP2-NoCC-REF SAS South Asia 57.29973
10: SSP2-NoCC-REF SSA Africa south of the Sahara 65.14987
11: SSP2-NoCC-REF EUR Europe 63.99204
12: SSP2-NoCC-REF LAC Latin America and Caribbean 68.21118
13: SSP2-NoCC-REF FSU Former Soviet Union 60.10807
14: SSP2-NoCC-REF EAP East Asia and Pacific 63.86103
15: SSP2-NoCC-REF NAM North America 65.97859
16: SSP2-NoCC-REF MEN Middle East and North Africa 58.98356
library(data.table)  # provides setDT() and dcast()
temp = setDT(structure(list(scenario = c("2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "SSP2-NoCC-REF", "SSP2-NoCC-REF", "SSP2-NoCC-REF",
"SSP2-NoCC-REF", "SSP2-NoCC-REF", "SSP2-NoCC-REF", "SSP2-NoCC-REF",
"SSP2-NoCC-REF"), region_code = c("SAS", "SSA", "EUR", "LAC",
"FSU", "EAP", "NAM", "MEN", "SAS", "SSA", "EUR", "LAC", "FSU",
"EAP", "NAM", "MEN"), region_name = c("South Asia", "Africa south of the Sahara",
"Europe", "Latin America and Caribbean", "Former Soviet Union",
"East Asia and Pacific", "North America", "Middle East and North Africa",
"South Asia", "Africa south of the Sahara", "Europe", "Latin America and Caribbean",
"Former Soviet Union", "East Asia and Pacific", "North America",
"Middle East and North Africa"), value = c(61.1771623260257,
62.0858809906661, 63.7612306428217, 68.84805628195, 59.0449875464304,
64.0057851485101, 66.182351351389, 58.0316719859857, 57.299725759211,
65.1498720847705, 63.9920412193261, 68.2111842947542, 60.1080745513644,
63.86103368494, 65.9785850777114, 58.9835574681585)), .Names = c("scenario",
"region_code", "region_name", "value"), row.names = c(NA, -16L
), class = "data.frame"))
On the column order: I just noticed that dcast arranges the new columns alphabetically.
formula.wide <- "scenario ~ region_code"
temp.wide <- data.table::dcast(
data = temp,
formula = formula.wide,
value.var = "value")
scenario EAP EUR FSU LAC MEN NAM SAS SSA
1: 2010 64.00579 63.76123 59.04499 68.84806 58.03167 66.18235 61.17716 62.08588
2: SSP2-NoCC-REF 63.86103 63.99204 60.10807 68.21118 58.98356 65.97859 57.29973 65.14987
The code above is what I use. The new column names are scenario, EAP, EUR, FSU, LAC, MEN, NAM, SAS, SSA.
I can get the correct order from temp and then use setcolorder to give temp.wide the correct column order. But I would like to know whether there is some way to keep the new columns from being ordered alphabetically in the first place.
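For reference, the workaround described above could look roughly like the sketch below. It uses a small stand-in table, temp_demo, with rounded values (an assumption made only to keep the example short); the names wide_demo and region_order are likewise illustrative.

```r
library(data.table)

# Small stand-in for temp (three regions, values rounded from the table above)
temp_demo <- data.table(
  scenario    = rep(c("2010", "SSP2-NoCC-REF"), each = 3),
  region_code = rep(c("SAS", "SSA", "EUR"), times = 2),
  value       = c(61.18, 62.09, 63.76, 57.30, 65.15, 63.99)
)

# dcast sorts the new columns alphabetically: scenario, EUR, SAS, SSA
wide_demo <- dcast(temp_demo, scenario ~ region_code, value.var = "value")

# Order of first appearance in the long table: SAS, SSA, EUR
region_order <- unique(temp_demo$region_code)

# setcolorder() reorders the columns by reference (no copy is made)
setcolorder(wide_demo, c("scenario", region_order))
names(wide_demo)
```

This works, but it needs a second step after every cast, which is what I would like to avoid.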
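Also, the help text for dcast says:
"Names for columns that are being cast are generated in the same order (separated by an underscore, _) from the (unique) values in each of the columns mentioned in the formula RHS."
If I understand it correctly, I don't think it describes what dcast actually does. But I don't understand what the phrase in parentheses (separated by an underscore, _) means.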
Answer 0 (score: 3)
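Using the vertical order of region_code (SAS, SSA, EUR, ...) as the column order:
Just pass a factor whose levels are in the order you want; unique(region_code) gives the order of first appearance: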
dcast(temp, scenario ~ factor(region_code, levels=unique(region_code)))
scenario SAS SSA EUR LAC FSU EAP NAM MEN
1: 2010 61.17716 62.08588 63.76123 68.84806 59.04499 64.00579 66.18235 58.03167
2: SSP2-NoCC-REF 57.29973 65.14987 63.99204 68.21118 60.10807 63.86103 65.97859 58.98356
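The documentation quoted in the OP doesn't sound right to me; in z ~ x + y, the unique values of x come before the unique values of y in the resulting column names.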
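A small sketch may make the underscore part concrete. The table dt_demo and its columns z, x, y, v are hypothetical toy data, not taken from the question; with two variables on the right-hand side, each new column name is an x value and a y value joined by an underscore.

```r
library(data.table)

# Hypothetical toy data: two RHS variables, x and y, deliberately unsorted
dt_demo <- data.table(
  z = rep(c("r1", "r2"), each = 4),
  x = rep(c("b", "a"), each = 2, times = 2),
  y = rep(c("2", "1"), times = 4),
  v = 1:8
)

# Each new column is named "<x value>_<y value>"
wide_xy <- dcast(dt_demo, z ~ x + y, value.var = "v")
names(wide_xy)
# "z" "a_1" "a_2" "b_1" "b_2"
```

The x part comes first in every name and the y part second, and for character columns the values are sorted, which is why a_1 precedes b_2 here even though "b" and "2" appear first in the data.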