Question

I have a data table with missing values. They should be replaced using a second data frame. The data table looks like this:

library(data.table)
Data = data.table(
  "H1" = c(NaN,4,NaN),
  "H2" = c(5,NaN,NaN),
  "H3" = c(7,NaN,NaN),
  "Group" = c(1,2,1),
  "Factor" = c(2,3,4)
)

    H1  H2  H3 Group Factor
1: NaN   5   7     1      2
2:   4 NaN NaN     2      3
3: NaN NaN NaN     1      4

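I want to use a second data frame for this. The "Group" column in the data frame "Groups" may or may not be useful, since it is basically the row number here.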

Groups = data.table(
  "H1" = c(1,2,3),
  "H2" = c(4,5,6),
  "H3" = c(7,8,9),
  "Group" = c(1,2,3)
)
   H1 H2 H3 Group
1:  1  4  7     1
2:  2  5  8     2
3:  3  6  9     3

I thought about writing something close to this:

Data%>%
  mutate_at(vars(matches("^H\\d+$")), ~ifelse(is.na(.),
                                              Groups[Group, Hour]*Factor, .))

Obviously, "Hour" is undefined there, but I was hoping it could be written as something like:

as.numeric(substr(columnName, 2, nchar(columnName)))

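How do I get that columnName? The result I am after is Data with every NaN in H1 to H3 replaced by the matching Groups value (the row for that Group, the column of the same name) multiplied by Factor.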

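Additional question: when I replace Hour with 2 in this command for testing purposes, the whole Group column is used rather than just the Group value of the current row, and I don't understand why.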
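On the additional question, one likely explanation is that mutate evaluates its expressions on whole columns at once, so Group inside the expression is the full vector c(1, 2, 1) and never a single row's value. A minimal sketch in base R, assuming a plain data.frame in place of the data.table and using illustrative variable names (lookup, replacement) that are not in the original post:

```r
Groups <- data.frame(
  H1 = c(1, 2, 3), H2 = c(4, 5, 6), H3 = c(7, 8, 9), Group = c(1, 2, 3)
)

# What an expression inside mutate() sees: entire columns, not one row
Group  <- c(1, 2, 1)
Factor <- c(2, 3, 4)

# Indexing with the whole vector returns one Groups value per row of Data.
# This relies on Group being the row number in Groups, as in the sample data.
lookup <- Groups[Group, "H2"]

# Element-wise multiplication then lines up row by row
replacement <- lookup * Factor
```

Since the lookup already yields one value per row, ifelse(is.na(x), replacement, x) can pick the right element for each row without an explicit row-wise loop.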

Any solution that gets the job done without repeating the mutate command for each of my columns is appreciated!

This question may be related to mine, but I could not get the "deparse(substitute(.))" command to work.
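For reference, one way to get at the current column name without deparse(substitute(.)) might be cur_column() inside across(). This is only a sketch: it assumes dplyr 1.0.0 or later, swaps mutate_at for across() (not what the original post used), and uses plain data.frames and the illustrative name filled.

```r
library(dplyr)

Data <- data.frame(
  H1 = c(NaN, 4, NaN),
  H2 = c(5, NaN, NaN),
  H3 = c(7, NaN, NaN),
  Group = c(1, 2, 1),
  Factor = c(2, 3, 4)
)
Groups <- data.frame(
  H1 = c(1, 2, 3), H2 = c(4, 5, 6), H3 = c(7, 8, 9), Group = c(1, 2, 3)
)

filled <- Data %>%
  mutate(across(
    matches("^H\\d+$"),
    # cur_column() is the name of the column being processed, e.g. "H2";
    # match() finds the Groups row for each row's Group value
    ~ ifelse(is.na(.x),
             Groups[[cur_column()]][match(Group, Groups$Group)] * Factor,
             .x)
  ))
```

With the sample data this should leave the existing values untouched and fill the gaps so that H1 is 2, 4, 4, H2 is 5, 15, 16 and H3 is 7, 24, 28.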

Answer 1

Here is a solution using mutate_at and map2.

library(purrr)
library(dplyr)

# Define columns for use later
cols_x <- paste0("H", 1:3, ".x")
cols_y <- paste0("H", 1:3, ".y")

# Multiply hours by factor
df <- left_join(Data, Groups, by = "Group") %>%
  mutate_at(cols_y, ~ . * Factor) 

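# Replace values if missing, then strip the ".x" suffix from the column names
# (the dot is escaped so it matches a literal "." and not any character)
df <- as.data.frame(map2(cols_x, cols_y, ~ ifelse(is.na(df[[.x]]), df[[.y]], df[[.x]]))) %>%
  setNames(sub("\\.x$", "", cols_x))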
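As an alternative to the join, the same fill could probably be done by staying in data.table and looping over the H columns with set(), which updates the table in place. This is a hedged sketch and not part of the original answer; hour_cols, rows, fill and miss are illustrative names.

```r
library(data.table)

Data <- data.table(
  H1 = c(NaN, 4, NaN),
  H2 = c(5, NaN, NaN),
  H3 = c(7, NaN, NaN),
  Group = c(1, 2, 1),
  Factor = c(2, 3, 4)
)
Groups <- data.table(
  H1 = c(1, 2, 3), H2 = c(4, 5, 6), H3 = c(7, 8, 9), Group = c(1, 2, 3)
)

# Columns to fill, and the Groups row that belongs to each row of Data
hour_cols <- grep("^H\\d+$", names(Data), value = TRUE)
rows <- match(Data$Group, Groups$Group)

for (col in hour_cols) {
  # Candidate replacement for every row: Groups value times Factor
  fill <- Groups[[col]][rows] * Data$Factor
  # Only overwrite the rows where the current column is missing
  miss <- which(is.na(Data[[col]]))
  set(Data, i = miss, j = col, value = fill[miss])
}
```

The loop keeps the Group and Factor columns, which the map2 version drops, and the column name is simply the loop variable col.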

How do I use the name of the column I am currently working on in mutate_at?