Mapping variables of different lengths into one data frame

Time: 2018-05-25 14:05:38

Tags: r dataframe join

Suppose I have the following data:

specialty <- c("Primary Care", "Internal Medicine Subspecialties" , 
 "Pediatric subspecialties","Surgical subspecialties", "Emergency 
  Medicine","All other specialties", "No Medical specialty")


test <- c(23, 43, 67, 77, 54)

dfTEST <- data.frame(test)
dfTEST <- t(dfTEST)
colnames(dfTEST) <- c(1, 2, 4, 5, 7)

> dfTEST
      1  2  4  5  7
 test 23 43 67 77 54

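Note that my dfTEST has 5 variables whose column names skip numbers. I need to create a data frame that maps these column-name numbers (1, 2, 4, 5, 7) to specialty. specialty holds 7 strings whose positions correspond to the dfTEST column names: dfTEST 2 = "Internal Medicine Subspecialties", dfTEST 4 = "Surgical subspecialties", and so on. Below is a snippet of what I want to achieve, but I am having a hard time figuring out how to do it. I need it to be flexible, so that the code still works no matter which numbers appear in the column names. Any ideas? Thanks!!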

> dfTEST
          1                2           4  5  7
 test     23              43           67 77 54
added "primary care"   "internal" ... 

2 Answers:

Answer 0 (score: 2)

This should solve your problem.

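library(dplyr)

# Lookup table mapping each code (test) to its specialty
specialty_lookup <- data.frame(specialty = c("Primary Care",
                         "Internal Medicine Subspecialties", 
                         "Pediatric subspecialties",
                         "Surgical subspecialties",
                         "Emergency Medicine",
                         "All other specialties",
                         "No Medical specialty"),
           test = 1:7, 
           stringsAsFactors = FALSE)

# code holds the measured values; test holds the specialty codes
data  <-  data.frame(code = c(23,43,67,77,54),
                  test = c(1,2,4,5,7))

# Attach the specialty to each row by its code
data <- data %>% 
  left_join(specialty_lookup, by = "test")

# Drop the code column, transpose, and use the codes as column names
data_wide <- data %>% 
  select(-test) %>%
  t() %>% 
  data.frame()

colnames(data_wide) <- data$test
data_wide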

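However, you should ask yourself whether this is really the format you want your data in. From what I can tell from the question, the following long format would be sufficient: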

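library(dplyr)

# Lookup table mapping each code (test) to its specialty
specialty_lookup <- data.frame(specialty = c("Primary Care",
                         "Internal Medicine Subspecialties", 
                         "Pediatric subspecialties",
                         "Surgical subspecialties",
                         "Emergency Medicine",
                         "All other specialties",
                         "No Medical specialty"),
           test = 1:7, stringsAsFactors = FALSE)

# code holds the measured values; test holds the specialty codes
data  <-  data.frame(code = c(23,43,67,77,54),
                  test = c(1,2,4,5,7))

# One row per code, with the matching specialty attached
data <- data %>% 
  left_join(specialty_lookup, by = "test")

data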
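The question asks for something that keeps working whatever numbers appear. As a rough base R sketch of the same lookup idea (no dplyr), `match()` can stand in for the join; the input below is hypothetical, and code 9 is deliberately missing from the lookup to show what happens to unknown codes.

```r
# Lookup table: one row per specialty code (same values as in the answer above)
specialty_lookup <- data.frame(
  test = 1:7,
  specialty = c("Primary Care", "Internal Medicine Subspecialties",
                "Pediatric subspecialties", "Surgical subspecialties",
                "Emergency Medicine", "All other specialties",
                "No Medical specialty"),
  stringsAsFactors = FALSE
)

# Hypothetical input: code 9 is deliberately absent from the lookup
data <- data.frame(code = c(23, 43, 67), test = c(2, 4, 9))

# match() gives the lookup row for each code, or NA when the code is unknown
rows <- match(data$test, specialty_lookup$test)
data$specialty <- specialty_lookup$specialty[rows]
data
```

Codes that are not in the lookup end up with `NA` as their specialty rather than raising an error, which is also how a left join treats unmatched rows.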

Answer 1 (score: 1)

Hope this helps:

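# Get the indexes of the corresponding specialties from the column names
ids <- as.integer(colnames(dfTEST))
# Transpose to a data frame so a new column can be added, then transpose back
dfTEST <- as.data.frame(t(dfTEST))
dfTEST$added <- specialty[ids]
dfTEST <- t(dfTEST)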

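Output (the \n inside "Emergency Medicine" comes from the line break within that string in the question's definition of specialty):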

> dfTEST
      1              2                                  4                        
test  "23"           "43"                               "67"                     
added "Primary Care" "Internal Medicine Subspecialties" "Surgical subspecialties"
      5                                     7                     
test  "77"                                  "54"                  
added "Emergency \n               Medicine" "No Medical specialty"
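In the output above the test values are quoted, because a matrix can hold only one type and transposing back turns the numbers into strings. If the numbers need to stay numeric, one possible variation (a sketch, not part of the original answer) is to keep the result in long form; the names `long` and `id` are made up for this illustration, and `specialty` is written with "Emergency Medicine" on one line.

```r
specialty <- c("Primary Care", "Internal Medicine Subspecialties",
               "Pediatric subspecialties", "Surgical subspecialties",
               "Emergency Medicine", "All other specialties",
               "No Medical specialty")

# Rebuild the question's one-row matrix with skipped column numbers
test <- c(23, 43, 67, 77, 54)
dfTEST <- t(data.frame(test))
colnames(dfTEST) <- c(1, 2, 4, 5, 7)

# The column names are the positions to pick from specialty
ids <- as.integer(colnames(dfTEST))

# Long format: one row per column of dfTEST, so each column keeps its own type
long <- data.frame(id = ids,
                   test = as.numeric(dfTEST["test", ]),
                   added = specialty[ids],
                   stringsAsFactors = FALSE)
long
```

Because the specialties are picked by position from the column names, the same code should work for any set of column numbers between 1 and 7.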