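I would like to reshape my dataset from wide to long format.
The dataset contains 300 variables, each named according to the pattern ModelID_Emotion_ModelGender. Here is some sample data: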
df <- structure(list(X71_Anger_Male = structure(c(3L, 1L, 2L), .Label = c("Anger",
"Disgust", "Fear"), class = "factor"), X71_Disgus_Male = structure(c(2L,
1L, 1L), .Label = c("Disgust", "Fear"), class = "factor")), class = "data.frame", row.names = c(NA,
-3L))
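I would like to reshape the data so that the information in the column names is extracted and put into new variables. For example, there should be a new variable ModelGender, a new variable ModelID, and a new variable Emotion. The dataset should then look like this: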
structure(list(Gender = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Male", class = "factor"),
ModelNumber = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "X71", class = "factor"),
Emotion = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("Anger",
"Disgust"), class = "factor"), Response = structure(c(3L,
2L, 2L, 3L, 1L, 2L), .Label = c("Anger", "Disgust", "Fear"
), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
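When I use reshape, gather/spread, or melt/cast, I do not get the desired result. Does anyone know how to do this?
Thank you for your time!
Answer 0 (score: 1)
You can simply convert to long format and then split the column that holds the original names. One tidyverse approach would be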
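library(dplyr)
library(tidyr)

# Stack every column into name/Response pairs, then split each
# original column name on "_" into its three components.
df %>%
  pivot_longer(everything(), values_to = 'Response') %>%
  separate(name, into = c('ModelNumber', 'Emotion', 'Gender'), sep = '_')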
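If the tidyverse is not available, the same "stack, then split the names" idea can be sketched in base R. This is only an illustration, not part of the answer above: the intermediate names `long` and `parts` are made up for this example, the sample data is rebuilt with `factor()`, and it assumes every column name has exactly three parts separated by underscores.

```r
# Sample data from the question, rebuilt with factor() for readability
df <- data.frame(
  X71_Anger_Male  = factor(c("Fear", "Anger", "Disgust")),
  X71_Disgus_Male = factor(c("Fear", "Disgust", "Disgust"))
)

# Stack the columns: repeat each column name once per row and
# concatenate the responses column by column
long <- data.frame(
  name     = rep(names(df), each = nrow(df)),
  Response = unlist(lapply(df, as.character), use.names = FALSE),
  stringsAsFactors = FALSE
)

# Split e.g. "X71_Anger_Male" into its three parts (one matrix row per name)
parts <- do.call(rbind, strsplit(long$name, "_", fixed = TRUE))
long$ModelNumber <- parts[, 1]
long$Emotion     <- parts[, 2]
long$Gender      <- parts[, 3]
long$name <- NULL  # the combined name is no longer needed

long
```

Note that this version stacks the data column by column, so all rows for X71_Anger_Male come first, whereas pivot_longer goes through the data row by row. Response is a character vector here; wrap it in factor() if a factor is needed.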
Answer 1 (score: 1)
In pivot_longer, you can pass three names to names_to and set names_sep to "_", which splits each column name into 3 columns.
tidyr::pivot_longer(df, cols = everything(),
names_to = c('ModelNumber', 'Emotion', 'Gender'),
values_to = 'Response',
names_sep = '_')
# A tibble: 6 x 4
# ModelNumber Emotion Gender Response
# <chr> <chr> <chr> <fct>
#1 X71 Anger Male Fear
#2 X71 Disgus Male Fear
#3 X71 Anger Male Anger
#4 X71 Disgus Male Disgust
#5 X71 Anger Male Disgust
#6 X71 Disgus Male Disgust
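The result above has character columns for ModelNumber, Emotion and Gender, while the desired output in the question is a plain data.frame with factor columns in the order Gender, ModelNumber, Emotion, Response. A minimal sketch of that last step follows, assuming dplyr 1.0.0 or later for `across()`; the names `long` and `result` are chosen for this example only, and the sample data is rebuilt with `factor()`.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt with factor() for readability
df <- data.frame(
  X71_Anger_Male  = factor(c("Fear", "Anger", "Disgust")),
  X71_Disgus_Male = factor(c("Fear", "Disgust", "Disgust"))
)

# Reshape to long format, splitting the column names on "_"
long <- pivot_longer(df, cols = everything(),
                     names_to = c("ModelNumber", "Emotion", "Gender"),
                     names_sep = "_",
                     values_to = "Response")

# Turn the name-derived columns into factors, reorder the columns,
# and drop the tibble class to get a plain data.frame
result <- long %>%
  mutate(across(c(ModelNumber, Emotion, Gender), as.factor)) %>%
  select(Gender, ModelNumber, Emotion, Response) %>%
  as.data.frame()

str(result)
```

Because the sample column is spelled X71_Disgus_Male, the Emotion column contains "Disgus" rather than "Disgust"; fixing that would require renaming the column in the source data or recoding the value afterwards.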