My data looks like this:
FOO,yyy,Unigene126925_All,Unigene137063_All,0.238087
,,Unigene126925_All,Unigene24551_All,0.374231
,,Unigene126925_All,Unigene31835_All,0.367897
BAR,xxx,Unigene126925_All,Unigene165366_All,0.247844
,,Unigene126925_All,Unigene111784_All,0.344493
After reading it with the following code:
dt <- read.csv("http://dpaste.com/1612639/plain/",header=FALSE,fill=FALSE,na.strings = "")
# The <NA> coercion here is intentional.
it produced this result:
> dt
V1 V2 V3 V4 V5
1 FOO yyy Unigene126925_All Unigene137063_All 0.238087
2 <NA> <NA> Unigene126925_All Unigene24551_All 0.374231
3 <NA> <NA> Unigene126925_All Unigene31835_All 0.367897
4 BAR xxx Unigene126925_All Unigene165366_All 0.247844
5 <NA> <NA> Unigene126925_All Unigene111784_All 0.344493
What I want to do is replace the <NA> cells with the preceding value, producing this:
FOO yyy Unigene126925_All Unigene137063_All 0.238087
FOO yyy Unigene126925_All Unigene24551_All 0.374231
FOO yyy Unigene126925_All Unigene31835_All 0.367897
BAR xxx Unigene126925_All Unigene165366_All 0.247844
BAR xxx Unigene126925_All Unigene111784_All 0.344493
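In the example above, the second row contains NA, so it must take its V1 and V2 values from the closest preceding row that has them.
How can I achieve this in R?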
Answer 0 (score: 4)
You can use the na.locf function (package zoo), which carries the last observation forward:
library(zoo)
dt$V1 <- na.locf(dt$V1)
dt$V2 <- na.locf(dt$V2)
Or all at once:
dt <- na.locf(dt)
which gives:
> dt
V1 V2 V3 V4 V5
1 FOO yyy Unigene126925_All Unigene137063_All 0.238087
2 FOO yyy Unigene126925_All Unigene24551_All 0.374231
3 FOO yyy Unigene126925_All Unigene31835_All 0.367897
4 BAR xxx Unigene126925_All Unigene165366_All 0.247844
5 BAR xxx Unigene126925_All Unigene111784_All 0.344493
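If installing zoo is not an option, the same fill can be sketched in base R. This is only an illustration, not part of the original answer: fill_forward is a hypothetical helper name, and the small data frame below is a hand-typed stand-in for the first two columns of dt.

```r
# Hypothetical helper (not from zoo): carry the last non-NA value forward.
fill_forward <- function(x) {
  # Position of each non-NA element, 0 where the element is NA
  idx <- ifelse(is.na(x), 0L, seq_along(x))
  # Running maximum = position of the most recent non-NA element so far
  idx <- cummax(idx)
  # Positions before the first non-NA value have nothing to copy from
  idx[idx == 0] <- NA
  x[idx]
}

# Stand-in for columns V1 and V2 of the question's data (assumed values)
dt <- data.frame(
  V1 = c("FOO", NA, NA, "BAR", NA),
  V2 = c("yyy", NA, NA, "xxx", NA),
  stringsAsFactors = FALSE
)

# Fill only the columns that need it
dt[c("V1", "V2")] <- lapply(dt[c("V1", "V2")], fill_forward)
print(dt)
```

Unlike na.locf with its default na.rm = TRUE, this sketch leaves leading NA values in place, so the vector length never changes. With zoo, passing na.rm = FALSE gives the same length-preserving behavior.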