Concatenating columns into a multi-line variable while skipping NAs

Time: 2018-05-16 13:24:29

Tags: r dataframe concatenation missing-data

Dummy data

Fruit <- c("Tomato", "Banana", "Kiwi", "Pear")
Colour <- c("Red", "Yellow", NA, "Green")
Weight <- c(10, 12, 6, 8)

dd <- data.frame(Fruit, Colour, Weight)

Failed attempt

library(dplyr)  # provides %>% and mutate()

dd <- dd %>%
  mutate(Description = sprintf("%s: %s \n%s: %s \n%s: %s",
                               names(dd)[1], Fruit,
                               names(dd)[2], Colour,
                               names(dd)[3], Weight))

dd$Description[1]

Desired output: a multi-line "Description" variable that ignores NAs.

"Description" variable for Tomato:

Fruit: Tomato
Colour: Red
Weight: 10

"Description" variable for Kiwi; the NA is ignored!

Fruit: Kiwi
Weight: 6

2 answers:

Answer 0: (score: 1)

This feels a bit hackish, but for a base R solution we can use ifelse with is.na to conditionally render either the attribute or just an empty string:

sprintf("%s\n%s\n%s",
    ifelse(is.na(Fruit), "", paste0(names(dd)[1], ": ", Fruit)),
    ifelse(is.na(Colour), "", paste0(names(dd)[2], ": ", Colour)),
    ifelse(is.na(Weight), "", paste0(names(dd)[3], ": ", Weight)))

[1] "Fruit: Tomato\nColour: Red\nWeight: 10"
[2] "Fruit: Banana\nColour: Yellow\nWeight: 12"
[3] "Fruit: Kiwi\n\nWeight: 6"                   <-- Kiwi has no colour
[4] "Fruit: Pear\nColour: Green\nWeight: 8"
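Note that the Kiwi entry above still contains an empty line ("\n\n") where the colour would have been. As a minimal sketch, assuming the same dummy data as in the question, one possible way to remove it is to collapse repeated newlines afterwards. The variable names pieces and description are illustrative only and are not part of the original answer.

```r
# Same dummy data as in the question
Fruit  <- c("Tomato", "Banana", "Kiwi", "Pear")
Colour <- c("Red", "Yellow", NA, "Green")
Weight <- c(10, 12, 6, 8)
dd <- data.frame(Fruit, Colour, Weight)

# Build one "name: value" piece per column; NA values become ""
pieces <- sprintf("%s\n%s\n%s",
    ifelse(is.na(Fruit),  "", paste0(names(dd)[1], ": ", Fruit)),
    ifelse(is.na(Colour), "", paste0(names(dd)[2], ": ", Colour)),
    ifelse(is.na(Weight), "", paste0(names(dd)[3], ": ", Weight)))

# Collapse runs of newlines left behind by the empty pieces,
# then strip a newline at the very start or end of each string
description <- gsub("\n+", "\n", pieces)
description <- gsub("^\n|\n$", "", description)

cat(description[3])
```

With this post-processing step, the Kiwi entry should contain only the Fruit and Weight lines, while rows without NAs are left unchanged.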


Answer 1: (score: 1)

Loop over the rows, drop the NAs, and paste:

dd$Description <- unlist(
  apply(dd, 1, function(i) {
    x <- na.omit(i)
    paste(paste0(names(x),":", x), collapse = "\n")
  }))

dd
#    Fruit Colour Weight                            Description
# 1 Tomato    Red     10    Fruit:Tomato\nColour:Red\nWeight:10
# 2 Banana Yellow     12 Fruit:Banana\nColour:Yellow\nWeight:12
# 3   Kiwi   <NA>      6                  Fruit:Kiwi\nWeight: 6
# 4   Pear  Green      8    Fruit:Pear\nColour:Green\nWeight: 8
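In the output above, the single-digit weights appear as "Weight: 6" and "Weight: 8" with a padding space. This happens because apply() first coerces the data frame to a character matrix, which formats numeric columns to a common width. The following is a sketch of one possible variant, not the answerer's method, that labels each column separately before pasting row-wise so no padding is introduced. The names parts and label_col are hypothetical, and the ": " separator follows the format requested in the question.

```r
dd <- data.frame(
  Fruit  = c("Tomato", "Banana", "Kiwi", "Pear"),
  Colour = c("Red", "Yellow", NA, "Green"),
  Weight = c(10, 12, 6, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn one column into "name: value" strings,
# keeping NA where the original value is missing
label_col <- function(name, col) {
  ifelse(is.na(col), NA_character_, paste0(name, ": ", col))
}

# One labelled character vector per column, bound into a matrix
parts <- do.call(cbind, Map(label_col, names(dd), dd))

# For each row, drop the NA pieces and join the rest with newlines
dd$Description <- apply(parts, 1, function(x) {
  paste(na.omit(x), collapse = "\n")
})

cat(dd$Description[3])
```

Because each value is converted with paste0() on its own column, numbers keep their natural width, and the Kiwi row ends up with just the Fruit and Weight lines.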