The data structure I'm looking at is:
head(data)
ID Gender Location Generation Question Response
1 2 Male South America Generation X Q0. Vote in the upcoming election? 0: No
2 2 Male South America Generation X Q1. Pulse Rate 0: No
3 2 Male South America Generation X Q2. Metabolism 0: No
4 2 Male South America Generation X Q3.Blood Pressure 1: Yes
5 2 Male South America Generation X Q4. Temperature 0: No
6 2 Male South America Generation X Q5. Galvanic Skin Response 1: Yes
The column headers in this data frame are as follows:
> colnames(data)
[1] "ID" "Gender" "Location" "Generation" "Question" "Response"
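The Question column contains the questions that were asked, and the Response column contains the corresponding answers. I would like the data to look like this instead: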
> colnames(final_data)
[1] "ID" "Gender"
[3] "Location" "Generation"
[5] "Q0. Vote in the upcoming election?" "Q1. Pulse Rate"
[7] "Q134. Good Job Skills" "Q135. Sense of Humor"
[9] "Q136. Intelligence" "Q137.Can Play Jazz"
[11] "Q138.Likes the Beatles" "Q139. Snobbiness"
[13] "Q140.Ability to lift heavy objects" "Q141.Grace under pressure"
[15] "Q142.Grace on the dance floor" "Q143.Likes animals"
[17] "Q144.Makes good coffee" "Q145.Eats all his/her vegetables"
[19] "Q2. Metabolism" "Q3.Blood Pressure"
[21] "Q4. Temperature" "Q5. Galvanic Skin Response"
[23] "Q6. Breathing" "Q7. Perspiration"
[25] "Q8.Pupil Dilation" "Q9. Adrenaline Production"
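Currently, my data records one attribute of an ID per row. In other words, each row holds only a single attribute for one ID, and I want one row per unique ID with every question as its own column.
I saw another question here but couldn't follow it. Can someone help?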
Answer 0: (score: 0)
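I'm not entirely sure how you want the data to look in the end.
My guess is that you want ID, Gender, Location and Generation as the first 4 columns, with each question turned into a column name and its response as the value under that column.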
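To do this, you can use the dcast function from the reshape2 package. The data is already in long form (one row per ID and question), so no melt step is needed first; melting with these four id variables would collapse Question and Response into generic variable and value columns, and the dcast call would then fail to find Question.
library(reshape2)
# ID, Gender, Location and Generation act as the key columns (left of ~);
# each distinct Question becomes a column filled with the matching Response
final_data <- dcast(data, ID + Gender + Location + Generation ~ Question, value.var = "Response")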
I think this should solve the problem.
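For anyone who wants to try the reshaping before running it on the real survey, here is a minimal sketch on made-up data. The column names match the question, but the rows and values are invented purely for illustration, and it assumes the reshape2 package is installed.

```r
library(reshape2)

# Toy long-format data (hypothetical values): one row per (ID, Question) pair
data <- data.frame(
  ID         = c(1, 1, 2, 2),
  Gender     = c("Female", "Female", "Male", "Male"),
  Location   = c("Europe", "Europe", "South America", "South America"),
  Generation = c("Millennial", "Millennial", "Generation X", "Generation X"),
  Question   = c("Q1. Pulse Rate", "Q2. Metabolism",
                 "Q1. Pulse Rate", "Q2. Metabolism"),
  Response   = c("1: Yes", "0: No", "0: No", "0: No"),
  stringsAsFactors = FALSE
)

# Key columns go left of ~; each distinct Question becomes its own column,
# filled with the Response recorded for that ID
final_data <- dcast(data, ID + Gender + Location + Generation ~ Question,
                    value.var = "Response")

# Inspect the wide result: one row per ID, one column per question
print(dim(final_data))
print(colnames(final_data))
```

One thing to watch for: if the same ID answers the same question more than once, dcast has no single value to place in the cell and falls back to counting rows (it prints an "Aggregation function missing" message), so it may be worth checking for duplicates first.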