Question

The data structure I am looking at is:

> head(data)
  ID Gender      Location   Generation                           Question Response
1  2   Male South America Generation X Q0. Vote in the upcoming election?    0: No
2  2   Male South America Generation X                     Q1. Pulse Rate    0: No
3  2   Male South America Generation X                     Q2. Metabolism    0: No
4  2   Male South America Generation X                  Q3.Blood Pressure   1: Yes
5  2   Male South America Generation X                    Q4. Temperature    0: No
6  2   Male South America Generation X         Q5. Galvanic Skin Response   1: Yes

The column headers of this data frame are:

> colnames(data)
[1] "ID"         "Gender"     "Location"   "Generation" "Question"   "Response"

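The Question column contains the questions that were asked, and the Response column contains the corresponding answers. I would like the data to look like this: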

> colnames(final_data)
 [1] "ID"                                 "Gender"                            
 [3] "Location"                           "Generation"                        
 [5] "Q0. Vote in the upcoming election?" "Q1. Pulse Rate"                    
 [7] "Q134. Good Job Skills"              "Q135. Sense of Humor"              
 [9] "Q136. Intelligence"                 "Q137.Can Play Jazz"                
[11] "Q138.Likes the Beatles"             "Q139. Snobbiness"                  
[13] "Q140.Ability to lift heavy objects" "Q141.Grace under pressure"         
[15] "Q142.Grace on the dance floor"      "Q143.Likes animals"                
[17] "Q144.Makes good coffee"             "Q145.Eats all his/her vegetables"  
[19] "Q2. Metabolism"                     "Q3.Blood Pressure"                 
[21] "Q4. Temperature"                    "Q5. Galvanic Skin Response"        
[23] "Q6. Breathing"                      "Q7. Perspiration"                  
[25] "Q8.Pupil Dilation"                  "Q9. Adrenaline Production"

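At the moment, I have data that records the attributes of each ID in a single row. Essentially, this means that each row holds the attributes of only one unique ID.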

I saw another question here but could not understand it. Can someone help?

Answer 1

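I did not quite understand how you want the final data to look. My guess is that you want ID, Gender, Location, and Generation as the first four columns, with each question turned into a column name and its response as the value under that column. To do this, you can use the dcast function from the reshape2 package. Because your data is already in long format (one row per ID and question), a melt step is not needed first.

library(reshape2)
# data is already long (one row per ID and question), so melt() is not needed here;
# melting would replace Question and Response with variable/value columns and break the dcast call.
# The four columns left of ~ act as the key; each distinct Question becomes a column.
final_data <- dcast(data, ID + Gender + Location + Generation ~ Question, value.var = "Response")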

I think this should solve the problem.
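If installing reshape2 is not an option, the same long-to-wide conversion can be sketched with base R's reshape(). This is an alternative technique, not the approach given above, and the toy data frame below is made up for illustration (the second respondent and all of its values are hypothetical). It assumes each ID has at most one response per question.

```r
# Toy long-format data: two respondents, three questions each.
# The second respondent is hypothetical and exists only for illustration.
data <- data.frame(
  ID         = rep(c(2, 3), each = 3),
  Gender     = rep(c("Male", "Female"), each = 3),
  Location   = rep(c("South America", "Europe"), each = 3),
  Generation = rep(c("Generation X", "Millennial"), each = 3),
  Question   = rep(c("Q0. Vote in the upcoming election?",
                     "Q1. Pulse Rate",
                     "Q2. Metabolism"), times = 2),
  Response   = c("0: No", "0: No", "1: Yes", "1: Yes", "0: No", "0: No"),
  stringsAsFactors = FALSE
)

# Spread Question into columns: one row per unique combination of the id columns.
final_data <- reshape(
  data,
  idvar     = c("ID", "Gender", "Location", "Generation"),
  timevar   = "Question",
  v.names   = "Response",
  direction = "wide"
)

# reshape() names the new columns "Response.<question>"; strip that prefix
# so the column names match the original question text.
names(final_data) <- sub("^Response\\.", "", names(final_data))
rownames(final_data) <- NULL
```

Calling colnames(final_data) afterwards should list the four identifying columns followed by one column per question. Unlike dcast, which sorts the new columns alphabetically, reshape() keeps the questions in their order of first appearance.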

Converting a data structure in R
