Iterating over all variables in a data frame

Time: 2015-11-15 15:04:06

Tags: r dataframe missing-data

I found a useful mean imputation technique here.

More specifically:

variable[is.na(variable)] <- rowMeans(cbind(variable[which(is.na(variable))-1], 
                                          variable[which(is.na(variable))+1]))

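This takes the values immediately before and after each missing value and imputes their mean.

However, since I have a large data frame with many variables, I would like to know whether there is a way to apply this function to every variable (column) in the df.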

dput:

dput(head(politbar_timeseries,10))
structure(list(Month = structure(c(8401, 8432, 8460, 8491, 8521, 
8552, 8582, 8613, 8644, 8674), class = "Date"), Intention_CDU = c(246L, 
223L, 222L, 232L, 261L, 240L, 241L, NA, 234L, 211L), Intention_SPD = c(304L, 
323L, 276L, 274L, 238L, 290L, 291L, NA, 284L, 296L), Intention_FDP = c(47L, 
44L, 46L, 36L, 35L, 50L, 31L, NA, 33L, 40L), Intention_Green = c(112L, 
90L, 108L, 97L, 92L, 93L, 80L, NA, 131L, 97L), Intention_PDS = c(1L, 
2L, 1L, 4L, 2L, 4L, 6L, NA, 3L, 1L), Intention_Right = c(40L, 
45L, 51L, 44L, 48L, 26L, 30L, NA, 33L, 39L), CDU_CSU_Scale = c(5.53364976051333, 
5.41668954145634, 5.41361737597252, 5.53237142973321, 5.90556125077522, 
5.65325991093138, 5.66581907651607, NA, 5.7568395653053, 5.56722081960557
), SPD_Scale = c(6.68501038883942, 7.0740019675866, 6.31415136355633, 
6.52447895467401, 6.29176231355408, 6.52870415235848, 6.73302006301497, 
NA, 7.12547563426403, 7.17833309669175), FDP_Scale = c(5.34570000100596, 
5.73343004031828, 5.52174547729524, 5.39618098094715, 5.81980921102384, 
5.64326882828348, 5.70136552543044, NA, 5.3836387964029, 5.73726720856055
), Grüne_Scale = c(5.73191750379599, 6.03715643205545, 6.19893648691653, 
5.96106479727169, 5.78436018957346, 5.54482751153172, 5.6213169156508, 
NA, 6.42776109093573, 6.33016932291559), Republikaner_Scale = c(2.33415238404679, 
2.40200426439232, 2.50591428720572, 2.45599753445912, 2.61170073660812, 
2.26120872300811, 2.24409536048212, NA, 2.29699201198203, 2.25876734042663
), PDS_Scale = c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NA, NaN, 
NaN)), .Names = c("Month", "Intention_CDU", "Intention_SPD", 
"Intention_FDP", "Intention_Green", "Intention_PDS", "Intention_Right", 
"CDU_CSU_Scale", "SPD_Scale", "FDP_Scale", "Grüne_Scale", "Republikaner_Scale", 
"PDS_Scale"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 249L, 
8L, 9L), class = "data.frame")
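
One possible way to apply the neighbour-mean idea column by column is sketched below. The helper name `impute_neighbour_mean` is hypothetical, and the small data frame is just the last four rows of three columns from the dput above. The sketch assumes that only numeric columns should be imputed, and that a missing value in the first or last row, which has no neighbour on one side, is left as it is.

```r
# Hypothetical helper: replace each interior NA with the mean of the
# values directly before and after it.
impute_neighbour_mean <- function(x) {
  idx <- which(is.na(x))
  # Drop the first and last positions: they lack a neighbour on one side.
  idx <- idx[idx > 1 & idx < length(x)]
  if (length(idx) == 0) return(x)
  # If a neighbour is itself missing, the mean is missing and the gap stays.
  x[idx] <- rowMeans(cbind(x[idx - 1], x[idx + 1]))
  x
}

# Subset of the dput above (last four rows, three of the columns).
df <- data.frame(
  Month = structure(c(8582, 8613, 8644, 8674), class = "Date"),
  Intention_CDU = c(241L, NA, 234L, 211L),
  Intention_SPD = c(291L, NA, 284L, 296L),
  Intention_FDP = c(31L, NA, 33L, 40L)
)

# Apply the helper to the numeric columns only; Month (a Date) is skipped.
num_cols <- vapply(df, is.numeric, logical(1))
df[num_cols] <- lapply(df[num_cols], impute_neighbour_mean)

print(df)
```

With this subset, the missing Intention_CDU value becomes 237.5, the mean of 241 and 234. Integer columns that receive an imputed value are converted to double. A column such as PDS_Scale, which contains only NaN and NA, would remain missing, because `is.na()` is also TRUE for NaN and there are no observed neighbours to average.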

0 answers:

No answers