Question

I am analyzing questions from a survey in which each response has been assigned to one of 3 clusters. Sample data:

library(tidyverse)

Do.You.Live.in.the.USA <- as.factor(c("Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "Yes"))
Whats.your.favorite.color <- as.factor(c("Red", "Blue", "Green", "Red", "Blue", "Green", "Red", "Green", "Blue"))
Cluster <- c(1,2,3,3,2,1,3,2,1)

survey_data <- data.frame(Do.You.Live.in.the.USA, Whats.your.favorite.color, Cluster)
survey_data[] <- lapply(survey_data, factor)

The survey responses have been subset into three data frames, one per cluster:

cluster_1_df <- survey_data %>%
  filter(Cluster=="1") %>% 
  select(-Cluster)
cluster_2_df <- survey_data %>%
  filter(Cluster=="2") %>% 
  select(-Cluster)
cluster_3_df <- survey_data %>%
  filter(Cluster=="3") %>% 
  select(-Cluster)

I want to create a summary for each cluster and then combine the summaries into one matrix for later visualization. Something like:

cluster_1  <- summary(cluster_1_df$Do.You.Live.in.the.USA)
cluster_2  <- summary(cluster_2_df$Do.You.Live.in.the.USA)
cluster_3  <- summary(cluster_3_df$Do.You.Live.in.the.USA)
US_live_summary <- cbind(cluster_1, cluster_2, cluster_3)

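Doing this by hand becomes laborious with many survey questions, so I would like to wrap it in a function that I can reuse for each question. However, I have run into a problem: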

clust_sum_fun <- function(x){
  cbind(summary(cluster_1_df$x), summary(cluster_2_df$x),  summary(cluster_3_df$x))
}

US_live_summary <- clust_sum_fun(Do.You.Live.in.the.USA)

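...returns empty values.

I suspect the cause is how the column name is passed into the function and used as a variable. Can anyone suggest a solution?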

Answer 1

Here is the direct approach:

clust_sum_fun <- function(x)
  cbind(summary(cluster_1_df[, x]), summary(cluster_2_df[, x]), summary(cluster_3_df[, x]))  
(US_live_summary <- clust_sum_fun("Do.You.Live.in.the.USA"))
#     [,1] [,2] [,3]
# No     1    2    1
# Yes    2    1    2

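One problem is that by writing Do.You.Live.in.the.USA without quotes you are not passing a name; you are passing the variable Do.You.Live.in.the.USA itself (which is indeed defined in the workspace, so no error is raised). The other problem is the use of $x: the $ operator does not evaluate x but looks for a column literally named "x", which does not exist, so it returns NULL. Subsetting with [, x] fixes this, where x is now a character string holding the column name.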
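To make the difference between $x and [, x] concrete, here is a minimal sketch. It uses a small toy data frame (df, a hypothetical stand-in for one of the cluster data frames, not the survey data above) and also shows [[x]], which is an alternative to [, x] that I am adding for illustration. The last step, looping over column names with lapply, is one possible way to cover many questions at once, not the only one.

```r
# Toy data frame standing in for one cluster (hypothetical example data)
df <- data.frame(
  Do.You.Live.in.the.USA    = factor(c("Yes", "No", "Yes")),
  Whats.your.favorite.color = factor(c("Red", "Blue", "Red"))
)

# The column name is stored as a character string
x <- "Do.You.Live.in.the.USA"

# `$` does not evaluate x; it looks for a column literally named "x"
is.null(df$x)                 # TRUE, because no column is called "x"

# `[[` and `[, ]` evaluate x and use the string it holds
summary(df[[x]])              # counts per level: No 1, Yes 2
identical(df[[x]], df[, x])   # TRUE for a plain data.frame

# The same idea extends to many questions: loop over the column names
summaries <- lapply(names(df), function(col) summary(df[[col]]))
names(summaries) <- names(df)
```

Note that [, x] returns a vector only for a plain data.frame; on a tibble it returns a one-column tibble, so [[x]] is the safer choice if the data may have been converted.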

How to use a string as a variable name in a function?
