My data
Type1 Type2 Type3 Expected_Output
Red Orange Pink Pink
Green abc na abc
Blue na na Blue
white na Green Green
na Brown purple purple
na black na black
grey na na grey
How can I produce the expected output shown in the Expected_Output column?
Answer 0 (score: 1)
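A data.frame made up of several columns of the same data type suggests that the data may be better handled in long format than in wide format. Therefore, melt() is used to reshape the data and drop the NA values (na.rm = TRUE), followed by a join that adds the new column to the original data.frame: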
library(data.table)
DT[melt(DT[, rn := .I], id.vars = "rn", na.rm = TRUE)[
order(variable), .(New = last(value)), by = rn], on = .(rn)][, rn := NULL][]
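   Type1  Type2  Type3 Expected_Output    New
1:   Red Orange   Pink            Pink   Pink
2: Green    abc     NA             abc    abc
3:  Blue     NA     NA            Blue   Blue
4: white     NA  Green           Green  Green
5:  grey     NA     NA            grey   grey
6:    NA  Brown purple          purple purple
7:    NA  black     NA           black  black
Note that the rows of this result follow the group order of the aggregated table (rn 1, 2, 3, 4, 7, 5, 6), not the original row order of DT.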
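To make the intermediate steps visible, the chain can be split into separate statements. The following is only a sketch on a small made-up table (not the question's data). Naming the columns explicitly through measure.vars is an assumption added here so that only the Type columns are melted.

```r
library(data.table)

# Hypothetical three-row table, for illustration only
DT <- data.table(
  Type1 = c("Red", NA, "grey"),
  Type2 = c("Orange", "Brown", NA),
  Type3 = c("Pink", "purple", NA)
)
DT[, rn := .I]  # row number, used as the join key later

# Long format: one row per non-NA cell (columns rn, variable, value)
long <- melt(DT, id.vars = "rn",
             measure.vars = c("Type1", "Type2", "Type3"),
             na.rm = TRUE)

# Sort by column position, then keep the last remaining value per row
res <- long[order(variable), .(New = last(value)), by = rn]

# Join back so that every row of DT is kept, in DT's order
out <- res[DT, on = .(rn)]
print(out)
```

Printing long before aggregating is a quick way to check that the NA cells were dropped as intended.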
fread() is used to read the sample dataset. The na.strings parameter tells fread() to convert "na" strings to NA:
library(data.table)
DT <- fread(
"Type1 Type2 Type3 Expected_Output
Red Orange Pink Pink
Green abc na abc
Blue na na Blue
white na Green Green
na Brown purple purple
na black na black
grey na na grey ",
na.strings = "na")
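The effect of na.strings can be checked by reading the same text twice. This is a minimal sketch on a shortened, made-up excerpt; the explicit sep and header arguments are assumptions added here to avoid relying on auto-detection for such a tiny input.

```r
library(data.table)

txt <- "Type1 Type2 Type3
Green abc na
Blue na na"

# Without na.strings = "na", the lower-case "na" stays a literal string
raw <- fread(text = txt, sep = " ", header = TRUE)

# With na.strings = "na", the same fields are read as missing values
clean <- fread(text = txt, sep = " ", header = TRUE, na.strings = "na")

print(raw$Type2)
print(clean$Type2)
```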
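The OP has requested that rows consisting entirely of NA should also appear in the output. This can be achieved by changing the order of the data.table objects in the right join. In data.table syntax, X[Y] is a right join that takes all rows of Y. If all rows of X are required, the right join Y[X] has to be used instead: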
library(data.table)
# new data with 8th row
DT <- fread(
"Type1 Type2 Type3 Expected_Output
Red Orange Pink Pink
Green abc na abc
Blue na na Blue
white na Green Green
na Brown purple purple
na black na black
grey na na grey
na na na na",
na.strings = "na")
melt(DT[, rn := .I], id.vars = "rn", na.rm = TRUE)[
order(variable), .(New = last(value)), by = rn][DT, on = .(rn)][, rn := NULL][]
      New Type1  Type2  Type3 Expected_Output
1:   Pink   Red Orange   Pink            Pink
2:    abc Green    abc     NA             abc
3:   Blue  Blue     NA     NA            Blue
4:  Green white     NA  Green           Green
5: purple    NA  Brown purple          purple
6:  black    NA  black     NA           black
7:   grey  grey     NA     NA            grey
8:     NA    NA     NA     NA              NA
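The difference between X[Y] and Y[X] is easier to see on two tiny tables. The tables X and Y below are made up for illustration and are not part of the original answer.

```r
library(data.table)

# Hypothetical tables: X has three keys, Y only two, in a different order
X <- data.table(rn = 1:3, x = c("a", "b", "c"))
Y <- data.table(rn = c(3L, 1L), y = c("C", "A"))

# X[Y]: one result row per row of Y, in Y's order; rn = 2 is dropped
xy <- X[Y, on = .(rn)]

# Y[X]: one result row per row of X, in X's order; y is NA for rn = 2
yx <- Y[X, on = .(rn)]

print(xy)
print(yx)
```

In the answer's code, DT plays the role of X, which is why the all-NA row only survives in the Y[X] form.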
Answer 1 (score: 0)
You can do this with last() from dplyr inside apply(). Make sure that the "na" strings are real NA values so that na.omit() drops them.
library(dplyr)
df[df=="na"] <- NA #change "na" to NA
df$expected2 <-apply(df[,1:3],1,function(x) last(na.omit(x)))
Type1 Type2 Type3 Expected_Output expected2
1 Red Orange Pink Pink Pink
2 Green abc <NA> abc abc
3 Blue <NA> <NA> Blue Blue
4 white <NA> Green Green Green
5 <NA> Brown purple purple purple
6 <NA> black <NA> black black
7 grey <NA> <NA> grey grey
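To see what the anonymous function does for a single row, it can be run on a plain vector. The vectors below are made up for illustration. The coalesce() variant is an alternative that is not part of this answer: it returns the first non-NA value per position, so passing the columns from right to left should give the last non-NA value per row.

```r
library(dplyr)

# One hypothetical row, as apply() would hand it to the function
x <- c("white", NA, "Green")
last_value <- last(na.omit(x))  # the NA is dropped, then the last element is kept

# Alternative, not from the original answer: columns listed right to left
type1 <- c("Red", NA)
type2 <- c("Orange", "Brown")
type3 <- c(NA, "purple")
via_coalesce <- coalesce(type3, type2, type1)

print(last_value)
print(via_coalesce)
```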
Data
df <- read.table(text="Type1 Type2 Type3 Expected_Output
Red Orange Pink Pink
Green abc na abc
Blue na na Blue
white na Green Green
na Brown purple purple
na black na black
grey na na grey ",header=TRUE,stringsAsFactors=FALSE)
Answer 2 (score: 0)