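I have five data frames: t1, t2, t3, t4, t5. All of them have the same structure (only some values differ) and share a "country" variable with the same number of levels.
Basically, I want to end up with N variables: one for each combination of country and dataset.
My code currently looks like this, but it is very tedious and long: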
t1.COUNTRY1 <- subset(t1, SA0100 == "COUNTRY1")
t2.COUNTRY1 <- subset(t2, SA0100 == "COUNTRY1")
t3.COUNTRY1 <- subset(t3, SA0100 == "COUNTRY1")
t4.COUNTRY1 <- subset(t4, SA0100 == "COUNTRY1")
t5.COUNTRY1 <- subset(t5, SA0100 == "COUNTRY1")
t1.COUNTRY2 <- subset(t1, SA0100 == "COUNTRY2")
t2.COUNTRY2 <- subset(t2, SA0100 == "COUNTRY2")
...
Dataset t1 (the others look the same):
SA0100 DA1000 DA2100 RA0300
1 COUNTRY1 40000 45666 45
2 COUNTRY1 25456 78888 36
3 COUNTRY1 45666 12547 18
4 COUNTRY1 41255 58796 23
5 COUNTRY1 78992 32589 28
6 COUNTRY2 12558 25556 22
7 COUNTRY2 96542 65478 78
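I tried using a loop but did not manage to get anything working, and I don't see how to use the lapply() function in this particular case.
Can you help me?
Answer 0 (score: 0)
This script uses a loop and stores the per-country subsets in a list. It assumes that country 1 appears in t1, country 2 in t2, and so on. If a country also appears in other datasets (for example, country 1 in dataset 2), the last line of the script has to be changed (replace tcp with t1, t2, etc.).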
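a <- 5  # number of iterations, datasets t1:t5
tch <- paste0("t", seq_len(a))        # dataset names: "t1", ..., "t5"
cch <- paste0("COUNTRY", seq_len(a))  # country codes as they appear in SA0100
country <- list()
for (i in seq_len(a)) {
  tcp <- get(tch[i])  # fetch the data frame by its name
  country[[i]] <- subset(tcp, SA0100 == cch[i])  # rows of dataset i for country i
}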
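When countries appear in several datasets, a nested loop over datasets and countries is one way to cover every combination. The following is only a minimal sketch, not part of the script above: the toy data frames and the names `datasets` and `result` are assumptions made for illustration.

```r
# Toy data standing in for t1 and t2 (illustrative values only)
t1 <- data.frame(SA0100 = c("COUNTRY1", "COUNTRY1", "COUNTRY2"),
                 DA1000 = c(40000L, 25456L, 12558L),
                 stringsAsFactors = FALSE)
t2 <- data.frame(SA0100 = c("COUNTRY2", "COUNTRY1"),
                 DA1000 = c(96542L, 45666L),
                 stringsAsFactors = FALSE)

# Hypothetical named list holding the data frames to process
datasets <- list(t1 = t1, t2 = t2)
result <- list()

for (dname in names(datasets)) {
  d <- datasets[[dname]]
  for (cname in unique(d$SA0100)) {
    # Name each element "<dataset>.<country>", e.g. "t1.COUNTRY1"
    key <- paste(dname, cname, sep = ".")
    result[[key]] <- d[d$SA0100 == cname, , drop = FALSE]
  }
}

names(result)
```

Each element of `result` can then be accessed by name, for instance `result[["t2.COUNTRY1"]]`, without creating one global object per combination.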
Answer 1 (score: 0)
Assuming you want to create objects (I would prefer to keep everything in a list rather than having a lot of objects), you could do:
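# "^t\\d+$" matches only t1, t2, ... so that objects created by an
# earlier run (e.g. t1.COUNTRY1) are not picked up again
list2env(unlist(lapply(mget(ls(pattern = "^t\\d+$")),
                       function(x) split(x, x$SA0100)), recursive = FALSE),
         envir = .GlobalEnv)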
t1.COUNTRY1
# SA0100 DA1000 DA2100 RA0300
#1 COUNTRY1 40000 45666 45
#2 COUNTRY1 25456 78888 36
#3 COUNTRY1 45666 12547 18
#4 COUNTRY1 41255 58796 23
#5 COUNTRY1 78992 32589 28
t3.COUNTRY2
# SA0100 DA1000 DA2100 RA0300
#1 COUNTRY2 12558 25556 22
#4 COUNTRY2 12558 25556 22
Data:
t1 <- structure(list(SA0100 = c("COUNTRY1", "COUNTRY1", "COUNTRY1",
"COUNTRY1", "COUNTRY1", "COUNTRY2", "COUNTRY2"), DA1000 = c(40000L,
25456L, 45666L, 41255L, 78992L, 12558L, 96542L), DA2100 = c(45666L,
78888L, 12547L, 58796L, 32589L, 25556L, 65478L), RA0300 = c(45L,
36L, 18L, 23L, 28L, 22L, 78L)), .Names = c("SA0100", "DA1000",
"DA2100", "RA0300"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7"))
t2 <- structure(list(SA0100 = c("COUNTRY2", "COUNTRY2", "COUNTRY1"),
DA1000 = c(96542L, 96542L, 45666L), DA2100 = c(65478L, 65478L,
12547L), RA0300 = c(78L, 78L, 18L)), .Names = c("SA0100",
"DA1000", "DA2100", "RA0300"), row.names = c(NA, 3L), class = "data.frame")
t3 <- structure(list(SA0100 = c("COUNTRY2", "COUNTRY1", "COUNTRY1",
"COUNTRY2", "COUNTRY1"), DA1000 = c(12558L, 78992L, 41255L, 12558L,
40000L), DA2100 = c(25556L, 32589L, 58796L, 25556L, 45666L),
RA0300 = c(22L, 28L, 23L, 22L, 45L)), .Names = c("SA0100",
"DA1000", "DA2100", "RA0300"), row.names = c(NA, 5L), class = "data.frame")
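For the list-based alternative that this answer says it would prefer, one possible sketch is to stop before `list2env()` and keep the result of `unlist()` as a named list. The toy data frames and the name `pieces` below are assumptions for illustration, not part of the original answer.

```r
# Toy data standing in for t1 and t2 (illustrative values only)
t1 <- data.frame(SA0100 = c("COUNTRY1", "COUNTRY1", "COUNTRY2"),
                 DA1000 = c(40000L, 25456L, 12558L),
                 stringsAsFactors = FALSE)
t2 <- data.frame(SA0100 = c("COUNTRY2", "COUNTRY1"),
                 DA1000 = c(96542L, 45666L),
                 stringsAsFactors = FALSE)

# Split every data frame by country, then flatten one level so that
# the element names become "<dataset>.<country>"
pieces <- unlist(
  lapply(list(t1 = t1, t2 = t2), function(x) split(x, x$SA0100)),
  recursive = FALSE
)

names(pieces)
pieces[["t2.COUNTRY1"]]
```

Keeping the subsets in `pieces` makes it straightforward to apply the same operation to all of them with `lapply()`, which is harder once they are separate objects in the global environment.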