Changing a column in multiple data frames based on each data frame's column values

Time: 2016-11-09 10:28:38

Tags: r dataframe sapply

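Suppose I have about 180 different data frames, each with the columns ID, x, it and st. I want to change the st column so that, in every data frame, the new value is st - 1. For the data frame below, the output should therefore still be NA in the first 9 rows and become 11.0 from row 10 onward.

For example, this data frame is called control37868, so the column to change is control37868$st: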

ID    x           it         st
2    1.464462e+12 20.17831   NA
3    1.464462e+12 20.15657   NA
4    1.464463e+12 20.13484   NA
5    1.464463e+12 20.11310   NA
6    1.464464e+12 20.09136   NA
7    1.464465e+12 20.06963   NA
8    1.464465e+12 20.04789   NA
9    1.464466e+12 20.02615   NA
10   1.464466e+12 20.00442 12.0
11   1.464467e+12 19.98268 12.0
12   1.464468e+12 19.96094 12.0
13   1.464468e+12 19.93921 12.0
14   1.464469e+12 19.95700 12.0
15   1.464469e+12 20.01383 12.0
16   1.464470e+12 19.96272 12.0
17   1.464471e+12 19.96149 12.0
18   1.464471e+12 20.01166 12.0
19   1.464472e+12 19.92711 12.0
20   1.464472e+12 19.90119 12.0
21   1.464473e+12 19.88064 12.0
22   1.464474e+12 19.86010 12.0

I have already created a list containing all of the data frames:

#get list with all dataframes
dflist <- list(incident11951, incident12720, incident13643, incident1379, incident14248, incident14968, incident15634, incident16439, incident17383, incident17734, incident17850, incident18009, incident18337, incident21888, incident22666, incident23269, incident23682, incident23870, incident24493, incident25116, incident25669, incident26222, incident26931, incident28226, incident28290, incident29070, incident29180, incident29484, incident29726, incident29969, incident30244, incident30691, incident30967, incident31376, incident31434, incident32608, incident33041, incident33668, incident35112, incident35254, incident35577, incident36125, incident36267, incident36592, incident36671, incident37244, incident37412, incident37724, incident37868, incident38161, incident39453, incident39786, incident40482, incident40487, incident40975, incident41013, incident41381, incident41701, incident41772, incident42226, incident42358, incident42613, incident43395, incident43476, incident44827, incident45053, incident45454, incident45605, incident45703, incident46637, incident47832, incident50133, incident52105, incident5585, incident56003, incident56862, incident58213, incident58960, incident617, incident6361, incident7122, incident8144, incident9027, incident9245, incident9262, incident9534, incident9875, control11951, control12720, control13643, control1379, control14248, control14968, control15634, control16439, control17383, control17734, control17850, control18009, control18337, control21888, control22666, control23269, control23682, control23870, control24493, control25116, control25669, control26222, control26931, control28226, control28290, control29070, control29180, control29484, control29726, control29969, control30244, control30691, control30967, control31376, control31434, control32608, control33041, control33668, control35112, control35254, control35577, control36125, control36267, control36592, control36671, control37244, control37412, control37724, control37868, 
control38161, control39453, control39786, control40482, control40487, control40975, control41013, control41381, control41701, control41772, control42226, control42358, control42613, control43395, control43476, control44827, control45053, control45454, control45605, control45703, control46637, control47832, control50133, control52105, control5585, control56003, control56862, control58213, control58960, control617, control6361, control7122, control8144, control9027, control9245, control9262, control9534, control9875)
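As a side note, a list like this does not have to be typed out by hand. The sketch below assumes the data frames live in the current environment and that their names always follow the pattern "incident" or "control" plus digits; the two toy data frames and the names `df_names` and `dflist_named` are made up for illustration. Unlike `list(...)`, `mget()` keeps the object names as list names, which helps when tracing a result back to its data frame.

```r
# Two toy data frames standing in for the real incident*/control* objects
# (hypothetical values, same columns as in the question).
incident617 <- data.frame(ID = 1:3, x = c(1, 2, 3),
                          it = c(20.1, 20.0, 19.9), st = c(NA, 12, 12))
control617  <- data.frame(ID = 1:3, x = c(1, 2, 3),
                          it = c(20.2, 20.1, 20.0), st = c(NA, NA, 12))

# Find every object whose name is "incident" or "control" followed by digits.
df_names <- ls(pattern = "^(incident|control)[0-9]+$")

# Fetch those objects into a named list.
dflist_named <- mget(df_names)

names(dflist_named)
```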

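This should be possible with sapply() over dflist, but I am not sure how to define a function that changes the column values based on their current values.

Simply adding a new column named threshold would also be fine, so I also tried: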

 Map(cbind, dflist, threshold = st-1) 

but a bare reference to st - 1 is evidently not enough.
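The likely reason is that `st` in `threshold = st-1` is evaluated once in the calling environment, where no object named `st` exists, rather than inside each data frame. A minimal sketch of the threshold-column variant, in which the function refers to each data frame's own column, might look like this (the toy list `threshold_demo` and the result name `with_threshold` are assumptions for illustration):

```r
# Toy list standing in for dflist (hypothetical values).
threshold_demo <- list(
  a = data.frame(ID = 1:3, st = c(NA, 12, 12)),
  b = data.frame(ID = 1:2, st = c(NA, 10))
)

# Add a threshold column computed from each data frame's own st column.
with_threshold <- lapply(threshold_demo, function(d) {
  d$threshold <- d$st - 1   # NA - 1 is NA, so missing values stay missing
  d                         # return the extended data frame
})

with_threshold$a
```

lapply() is used here rather than sapply() because sapply() tries to simplify its result, whereas lapply() always returns a list of data frames.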

1 Answer:

Answer 0: (score: 0)

One way:

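dflist <- lapply(dflist, function(x) {
  keep <- !is.na(x$st)   # rows where st has a value
  y <- x[keep, ]
  y$st <- y$st - 1       # subtract 1 from the non-NA values
  x[keep, ] <- y         # write the adjusted rows back
  x                      # return the modified data frame
})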
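Since arithmetic on NA yields NA in R, the NA rows arguably do not need to be split off at all. A shorter sketch under that assumption, using a made-up list `shift_demo` in place of dflist:

```r
# Toy list standing in for dflist (hypothetical values).
demo <- data.frame(ID = 8:11, st = c(NA, NA, 12, 12))
shift_demo <- list(first = demo, second = demo)

# Subtract 1 from st in every data frame; NA entries remain NA.
shifted <- lapply(shift_demo, function(d) {
  d$st <- d$st - 1
  d
})

shifted$first$st
```

Either version returns a new list. The original standalone data frames (control37868 and so on) are not modified unless the results are assigned back to them.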

If you want to change a single data frame, try:

idx <- which(!is.na(df$st))                        # positions of the non-NA values
if (length(idx) > 0) df$st[idx] <- df$st[idx] - 1  # subtract 1 only where st is set