R function/loop to add a column and values to multiple data frames

Asked: 2014-08-07 13:20:07

Tags: r function loops

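I have 8 data frames. I want to add a column named "park" to each one and fill it with the last four characters of that data frame's name. Here are two of my eight data frames: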

water_land_by_ownname_apis <- structure(list(OWNERNAME = c("Forest Service (USFS)", "Fish and Wildlife Service (FWS)", 
"State Department of Natural Resources", "Private Landowner", 
"National Park Service (NPS)", "Unknown", "Private Institution", 
"Native American Land"), WATER = c(696600, 9900, 1758600, 26100, 
112636800, 1586688300, 0, 11354400), LAND = c(258642900, 997200, 
41905800, 2536200, 165591900, 1075917600, 461700, 314052300)), class = "data.frame", .Names = c("OWNERNAME", 
"WATER", "LAND"), data_types = c("C", "F", "F"), row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8"))

water_land_by_ownname_indu <- structure(list(OWNERNAME = c("The Nature Conservancy (TNC)", 
"Other State Land", "Private Institution", "State Department of Transportation", 
"State Department of Natural Resources", "Unknown", "National Park Service (NPS)", 
"Private Landowner", "Joint Ownership", "Private Non-profit", 
"Land Trust"), WATER = c(24300, 1018800, 5282100, 0, 12600, 19192500, 
802800, 139500, 0, 0, 0), LAND = c(719100, 10045800, 12556800, 
900, 2018700, 1446426000, 42484500, 5769900, 38700, 852300, 70200
)), class = "data.frame", .Names = c("OWNERNAME", "WATER", "LAND"
), data_types = c("C", "F", "F"), row.names = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11"))

They look like this:

> water_land_by_ownname_apis
                              OWNERNAME      WATER       LAND
1                 Forest Service (USFS)     696600  258642900
2       Fish and Wildlife Service (FWS)       9900     997200
3 State Department of Natural Resources    1758600   41905800
4                     Private Landowner      26100    2536200
5           National Park Service (NPS)  112636800  165591900
6                               Unknown 1586688300 1075917600
7                   Private Institution          0     461700
8                  Native American Land   11354400  314052300
> water_land_by_ownname_indu
                               OWNERNAME    WATER       LAND
1           The Nature Conservancy (TNC)    24300     719100
2                       Other State Land  1018800   10045800
3                    Private Institution  5282100   12556800
4     State Department of Transportation        0        900
5  State Department of Natural Resources    12600    2018700
6                                Unknown 19192500 1446426000
7            National Park Service (NPS)   802800   42484500
8                      Private Landowner   139500    5769900
9                        Joint Ownership        0      38700
10                    Private Non-profit        0     852300
11                            Land Trust        0      70200

For each data frame, I want to add a column ('park') and fill it with the last four characters of the data frame's name. For example:

water_land_by_ownname_apis$park <- 'apis'
water_land_by_ownname_indu$park <- 'indu'

which results in this:

> water_land_by_ownname_apis
                              OWNERNAME      WATER       LAND park
1                 Forest Service (USFS)     696600  258642900 apis
2       Fish and Wildlife Service (FWS)       9900     997200 apis
3 State Department of Natural Resources    1758600   41905800 apis
4                     Private Landowner      26100    2536200 apis
5           National Park Service (NPS)  112636800  165591900 apis
6                               Unknown 1586688300 1075917600 apis
7                   Private Institution          0     461700 apis
8                  Native American Land   11354400  314052300 apis
> water_land_by_ownname_indu
                               OWNERNAME    WATER       LAND park
1           The Nature Conservancy (TNC)    24300     719100 indu
2                       Other State Land  1018800   10045800 indu
3                    Private Institution  5282100   12556800 indu
4     State Department of Transportation        0        900 indu
5  State Department of Natural Resources    12600    2018700 indu
6                                Unknown 19192500 1446426000 indu
7            National Park Service (NPS)   802800   42484500 indu
8                      Private Landowner   139500    5769900 indu
9                        Joint Ownership        0      38700 indu
10                    Private Non-profit        0     852300 indu
11                            Land Trust        0      70200 indu

Then, bind them together:

water_land_by_ownname <- rbind (water_land_by_ownname_apis, water_land_by_ownname_indu)

Then, remove the original data frames from memory:

rm (water_land_by_ownname_apis,water_land_by_ownname_indu)

3 answers:

Answer 0 (score: 6)

You can do something like this:

do.call(rbind,lapply(ls(pattern='water.*'),
       function(x) {
         dat=get(x)
         dat$park = sub('.*_(.*)$','\\1',x)
         dat
       }))
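  1. ls extracts the names of all objects matching a given pattern; here I assume your data.frame names start with the word water (note that the pattern is unanchored, so it matches any object whose name contains water). The names are returned as a character vector, which is convenient to pass to lapply.
  2. sub extracts the last part of the name, i.e. everything after the final underscore.
  3. do.call + rbind is applied to the resulting list to produce a single large data.frame.
  4. Using your 2 data.frames, you get: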

                                  OWNERNAME      WATER       LAND park
    1                  Forest Service (USFS)     696600  258642900 apis
    2        Fish and Wildlife Service (FWS)       9900     997200 apis
    3  State Department of Natural Resources    1758600   41905800 apis
    4                      Private Landowner      26100    2536200 apis
    5            National Park Service (NPS)  112636800  165591900 apis
    6                                Unknown 1586688300 1075917600 apis
    7                    Private Institution          0     461700 apis
    8                   Native American Land   11354400  314052300 apis
    12          The Nature Conservancy (TNC)      24300     719100 indu
    21                      Other State Land    1018800   10045800 indu
    31                   Private Institution    5282100   12556800 indu
    41    State Department of Transportation          0        900 indu
    51 State Department of Natural Resources      12600    2018700 indu
    61                               Unknown   19192500 1446426000 indu
    71           National Park Service (NPS)     802800   42484500 indu
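
To see what the sub call does in isolation, here is a minimal sketch run on the two data frame names from the question. The vector nms and the result names park_regex and park_fixed are illustrative only, and the substring line is an alternative that assumes the suffix is always exactly four characters.

```r
# Names of the two data frames from the question (illustrative vector)
nms <- c("water_land_by_ownname_apis", "water_land_by_ownname_indu")

# The greedy '.*_' consumes everything up to the last underscore;
# the capture group keeps whatever follows it
park_regex <- sub(".*_(.*)$", "\\1", nms)

# Alternative, assuming the suffix is always exactly four characters
park_fixed <- substring(nms, nchar(nms) - 3)

print(park_regex)
print(park_fixed)
```

The regex version keeps working if a park code is not four characters long, whereas the substring version matches the question's wording ("last four characters") literally.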
    

Answer 1 (score: 2)

I would put the data.frames in a named list and do the task there:

rslt <- list(water_land_by_ownname_apis = water_land_by_ownname_apis, 
             water_land_by_ownname_indu = water_land_by_ownname_indu)

for (i in names(rslt)) {
  col <- unlist(strsplit(i, "_"))[5]
  rslt[[i]]$park <- col
}

do.call("rbind", rslt)
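
The loop above hard-codes position 5 of the split name, which fits names with exactly four underscores. As a hedged variation, the sketch below takes the last piece instead, whatever its position. The two small data frames are trimmed copies of the question's data (first two rows each), and the names dfs, parts and combined are illustrative.

```r
# Trimmed copies of the question's data frames (first two rows each)
water_land_by_ownname_apis <- data.frame(
  OWNERNAME = c("Forest Service (USFS)", "Fish and Wildlife Service (FWS)"),
  WATER = c(696600, 9900),
  LAND = c(258642900, 997200)
)
water_land_by_ownname_indu <- data.frame(
  OWNERNAME = c("The Nature Conservancy (TNC)", "Other State Land"),
  WATER = c(24300, 1018800),
  LAND = c(719100, 10045800)
)

dfs <- list(water_land_by_ownname_apis = water_land_by_ownname_apis,
            water_land_by_ownname_indu = water_land_by_ownname_indu)

for (nm in names(dfs)) {
  parts <- strsplit(nm, "_", fixed = TRUE)[[1]]
  # Take the last piece rather than hard-coding position 5
  dfs[[nm]]$park <- parts[length(parts)]
}

combined <- do.call(rbind, dfs)
# Drop the list-name prefixes that rbind adds to the row names
rownames(combined) <- NULL
print(combined)
```

Resetting the row names is optional; without it, rbind prefixes each row name with the name of the list element it came from.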

Answer 2 (score: 0)

A variation using Map and some "[<-" trickery:

vars <- ls(pattern="water_.")
l <- mget(vars)
names(l) <- substr(vars,nchar(vars)-3,nchar(vars))
do.call(rbind,Map("[<-",l,TRUE,"park",names(l)))
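
For readers unfamiliar with calling "[<-" as a function, the sketch below spells out what each Map step does by comparing it with ordinary assignment syntax. The helper add_park is a hypothetical name introduced for illustration, and the two one-row data frames are trimmed copies of the question's data.

```r
# Trimmed copies of the question's data (first row of each)
apis <- data.frame(OWNERNAME = "Forest Service (USFS)",
                   WATER = 696600, LAND = 258642900)
indu <- data.frame(OWNERNAME = "The Nature Conservancy (TNC)",
                   WATER = 24300, LAND = 719100)
l <- list(apis = apis, indu = indu)

# `[<-`(df, TRUE, "park", value) is the functional form of
# df[TRUE, "park"] <- value; Map applies it to each data frame,
# pairing it with the matching list name
via_map <- Map(`[<-`, l, TRUE, "park", names(l))

# Hypothetical helper doing the same with ordinary assignment syntax
add_park <- function(df, park) {
  df[TRUE, "park"] <- park
  df
}
via_helper <- Map(add_park, l, names(l))

print(all.equal(via_map, via_helper))
print(do.call(rbind, via_map))
```

The helper version is longer but may be easier to read and debug; the one-liner in the answer trades readability for brevity.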