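I have produced the following xtable, which has 11 factor levels for Period at archaeological sites in Mesoamerica. I would like to merge some of the periods: "CL" = "CL" + "EC" + "LC", and "F" = "MF" + "LF".
I would also like to recode Distance to represent kilometres: "0" = "0km", "1" = "1-2km", "2" = "3-4km", "3" = "3-5km".
So far I have only managed to change the names; I cannot keep the data associated with them in the original table.
Ideally the result would look like this, but with the periods and distances displayed correctly. The periods are in chronological order, although only a few of them are shown here.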
dput(bomxtab3)
structure(list(Period = structure(c(10L, 5L, 6L, 1L, NA, 8L,
7L, 3L, 9L, 2L, 4L, 10L, 5L, 6L, 1L, NA, 8L, 7L, 3L, 9L, 2L,
4L, 10L, 5L, 6L, 1L, NA, 8L, 7L, 3L, 9L, 2L, 4L, 10L, 5L, 6L,
1L, NA, 8L, 7L, 3L, 9L, 2L, 4L), .Label = c("EF", "MF", "LF",
"TF", "CL", "EC", "LC", "ET", "LT", "AZ"), class = "factor"),
Distance = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L), .Label = c("0", "1", "2", "3"), class = "factor"),
Population = c(242391L, 1774L, 1980L, 315L, 0L, 9898L, 1430L,
5355L, 5010L, 903L, 2420L, 83725L, 7953L, 3320L, 175L, 200L,
13514L, 2370L, 8943L, 15018L, 4909L, 17107L, 55994L, 5546L,
1105L, 0L, 0L, 16110L, 1405L, 4105L, 5905L, 335L, 21563L,
19636L, 4815L, 1670L, 0L, 0L, 12811L, 525L, 3950L, 8563L,
3845L, 8562L), `bomxtab2$Population/x` = c(0.603343903859653,
0.0883114297092792, 0.245201238390093, 0.642857142857143,
0, 0.189134962643074, 0.24956369982548, 0.239565159039055,
0.145234230055659, 0.0903722978382706, 0.0487392250060421,
0.208402821683352, 0.395908004778973, 0.411145510835913,
0.357142857142857, 1, 0.258230944146141, 0.413612565445026,
0.400080526103879, 0.435354823747681, 0.491293034427542,
0.344537984371224, 0.139376621049121, 0.276085225009956,
0.136842105263158, 0, 0, 0.307836355645577, 0.245200698080279,
0.183644253567754, 0.17117926716141, 0.0335268214571657,
0.434282606944333, 0.0488766534078746, 0.239695340501792,
0.206811145510836, 0, 0, 0.244797737565207, 0.0916230366492147,
0.176710061289312, 0.24823167903525, 0.384807846277022, 0.172440183678402
)), row.names = c(NA, -44L), class = "data.frame")
Answer 0 (score: 1)
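If I understand correctly, you want to recode the variables you have. There is plenty of information about this, for example here and here.
I was not entirely sure I had the Period variable right, which is why I created a new variable rather than overwriting it. I use the tidyverse package.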
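library(tidyverse)
# Assumption: df is the data frame from the question (bomxtab3)
df <- bomxtab3
df <- df %>% mutate(
  # Merge CL/EC/LC into "CL" and MF/LF into "F"; leave other periods unchanged
  Period1 = case_when(
    Period %in% c("CL", "EC", "LC") ~ "CL",
    Period %in% c("MF", "LF") ~ "F",
    TRUE ~ as.character(Period)),
  # Relabel the distance codes as kilometre ranges
  Distance1 = recode(Distance,
    `0` = "0km",
    `1` = "1-2km",
    `2` = "2-3km",
    `3` = "3-5km")
)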
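%>% creates a pipe (re-exported by dplyr, originally from magrittr). case_when works like a chain of if statements: if X is true, return Y. Here I check whether the value in the Period column is in the given vector and, if it is, recode it; the final TRUE branch keeps every other value as it was. Note that Period1 is a character column, not a factor.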
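Since the question mentions that the periods are in chronological order, here is a minimal base R sketch of an alternative that keeps Period as a factor and preserves the level order. It relies on the fact that assigning repeated labels through `levels<-` merges the corresponding levels. The names period, new_labels and period_merged are made up for illustration, and the short vector only stands in for the real column.

```r
# Small illustrative factor using the same level order as the question's Period column
period <- factor(c("AZ", "CL", "EC", "EF", "MF", "LF", "LC"),
                 levels = c("EF", "MF", "LF", "TF", "CL", "EC", "LC", "ET", "LT", "AZ"))

# Map each old level to its new label; repeated labels are merged by `levels<-`
new_labels <- levels(period)
new_labels[new_labels %in% c("EC", "LC")] <- "CL"
new_labels[new_labels %in% c("MF", "LF")] <- "F"

period_merged <- period
levels(period_merged) <- new_labels

levels(period_merged)
# "EF" "F" "TF" "CL" "ET" "LT" "AZ"
as.character(period_merged)
# "AZ" "CL" "CL" "EF" "F" "F" "CL"
```

Each merged level takes the position of its first original level, so "F" sits where "MF" was and "CL" stays where it was.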
A random sample of the result:
   Period Distance Population bomxtab2$Population/x Period1 Distance1
1      ET        0       9898            0.18913496      ET       0km
2      EF        3          0            0.00000000      EF     3-5km
3      ET        2      16110            0.30783636      ET     2-3km
4      MF        0        903            0.09037230       F       0km
5      ET        3      12811            0.24479774      ET     3-5km
6      MF        2        335            0.03352682       F     2-3km
7      LC        2       1405            0.24520070      CL     2-3km
8      TF        3       8562            0.17244018      TF     3-5km
9      CL        0       1774            0.08831143      CL       0km
10     ET        1      13514            0.25823094      ET     1-2km
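Recoding only relabels the rows, so the merged periods still occupy several rows each. If the goal is one row per merged period and distance, the populations could be summed afterwards. The following is a hedged sketch, assuming that summing Population is what is wanted; recoded and totals are hypothetical names, and the six rows are the Distance 0 values for CL, EC, LC, MF, LF and ET from the question's data after recoding.

```r
library(dplyr)

# A few Distance = 0 rows from the question's data, already recoded
recoded <- data.frame(
  Period1 = c("CL", "CL", "CL", "F", "F", "ET"),
  Distance1 = "0km",
  Population = c(1774L, 1980L, 1430L, 903L, 5355L, 9898L)
)

# One row per merged period and distance, with the populations added up
totals <- recoded %>%
  group_by(Period1, Distance1) %>%
  summarise(Population = sum(Population), .groups = "drop")

totals
```

Here CL sums to 5184 and F to 6258. The proportion column from the original table would need to be recomputed from the new totals rather than summed.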