data1 is a data set available on the web:
http://stats.math.uni-augsburg.de/Mondrian/Data/Titanic.txt
Once I have data1, how can I get a table (call it data2) like the one below?
, , Age = Child, Survived = No
Sex
Class Male Female
1st 0 0
2nd 0 0
3rd 35 17
Crew 0 0
And when I have data2, as follows:
, , Age = Child, Survived = No
Sex
Class Male Female
1st 0 0
2nd 0 0
3rd 35 17
Crew 0 0
, , Age = Adult, Survived = No
Sex
Class Male Female
1st 118 4
2nd 154 13
3rd 387 89
Crew 670 3
, , Age = Child, Survived = Yes
Sex
Class Male Female
1st 5 1
2nd 11 13
3rd 13 14
Crew 0 0
, , Age = Adult, Survived = Yes
Sex
Class Male Female
1st 57 140
2nd 14 80
3rd 75 76
Crew 192 20
how can I convert data2 back to data1?
1. Converting data1 to data2
I can do part of this.
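url <- 'http://stats.math.uni-augsburg.de/Mondrian/Data/Titanic.txt'
data <- read.table(url, header = TRUE)
# Keep the rows for children who did not survive, then columns 1 and 3 (Class and Sex)
data[data$Age == "Child" & data$Survived == "No", c(1, 3)]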
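2. Converting data2 to data1
I don't know how to do this.
I don't want to extract a subset of Titanic from Titanic. How do I get the Titanic table from the csv file? How do I get the csv file from the Titanic table?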
When I write Titanic to a file, the result does not have the same form as the data on the web:
http://stats.math.uni-augsburg.de/Mondrian/Data/Titanic.txt
Here is what was written to the file:
"","Class","Sex","Age","Survived","Freq"
"1","1st","Male","Child","No",0
"2","2nd","Male","Child","No",0
"3","3rd","Male","Child","No",35
"4","Crew","Male","Child","No",0
"5","1st","Female","Child","No",0
"6","2nd","Female","Child","No",0
"7","3rd","Female","Child","No",17
"8","Crew","Female","Child","No",0
"9","1st","Male","Adult","No",118
"10","2nd","Male","Adult","No",154
"11","3rd","Male","Adult","No",387
"12","Crew","Male","Adult","No",670
"13","1st","Female","Adult","No",4
"14","2nd","Female","Adult","No",13
"15","3rd","Female","Adult","No",89
"16","Crew","Female","Adult","No",3
"17","1st","Male","Child","Yes",5
"18","2nd","Male","Child","Yes",11
"19","3rd","Male","Child","Yes",13
"20","Crew","Male","Child","Yes",0
"21","1st","Female","Child","Yes",1
"22","2nd","Female","Child","Yes",13
"23","3rd","Female","Child","Yes",14
"24","Crew","Female","Child","Yes",0
"25","1st","Male","Adult","Yes",57
"26","2nd","Male","Adult","Yes",14
"27","3rd","Male","Adult","Yes",75
"28","Crew","Male","Adult","Yes",192
"29","1st","Female","Adult","Yes",140
"30","2nd","Female","Adult","Yes",80
"31","3rd","Female","Adult","Yes",76
"32","Crew","Female","Adult","Yes",20
This is not the data I want.
Answer 0 (score: 1)
Titanic is a "table" object, so you need to explore it a little to understand what you are looking at:
> str(Titanic)
table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
- attr(*, "dimnames")=List of 4
..$ Class : chr [1:4] "1st" "2nd" "3rd" "Crew"
..$ Sex : chr [1:2] "Male" "Female"
..$ Age : chr [1:2] "Child" "Adult"
..$ Survived: chr [1:2] "No" "Yes"
> dim(Titanic)
[1] 4 2 2 2
> dimnames(Titanic)
$Class
[1] "1st" "2nd" "3rd" "Crew"
$Sex
[1] "Male" "Female"
$Age
[1] "Child" "Adult"
$Survived
[1] "No" "Yes"
Use the dim and dimnames to extract the part of the table you want:
> Titanic[,,'Child','No']
Sex
Class Male Female
1st 0 0
2nd 0 0
3rd 35 17
Crew 0 0
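As a small sketch of the same idea, the slice can be taken by dimension name or by position, and `apply` can collapse the dimensions that are not needed. This uses only the built-in `Titanic` table; the variable names are illustrative.

```r
# Slice by dimension names: Age = "Child", Survived = "No".
child_no <- Titanic[, , "Child", "No"]

# The same slice by position, since "Child" and "No" are the first
# levels of the third and fourth dimensions.
same_by_position <- Titanic[, , 1, 1]

# Collapse Age and Survived by summing over them, leaving Class x Sex.
class_by_sex <- apply(Titanic, c(1, 2), sum)

child_no["3rd", "Male"]   # 35
```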
For the data you loaded from the web, you just need to wrap your last line of code in table:
table(data[data$Age=="Child" & data$Survived =="No",][,c(1,3)])
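For the round trip between the `Titanic` table and a csv file, a minimal sketch might look like the following. It assumes the built-in `Titanic` data set and one row per cell with a `Freq` column (not the one-row-per-person layout of the web file); the temporary file and the names `flat`, `back`, and `rebuilt` are illustrative only.

```r
# Flatten the 4-way table: one row per cell, counts in a Freq column.
flat <- as.data.frame(Titanic)

# Write without row names so the file has no leading "" column.
path <- tempfile(fileext = ".csv")
write.csv(flat, path, row.names = FALSE)

# Read the file back and restore the original level order;
# otherwise the levels would be sorted alphabetically.
back <- read.csv(path, stringsAsFactors = FALSE)
for (v in names(dimnames(Titanic))) {
  back[[v]] <- factor(back[[v]], levels = dimnames(Titanic)[[v]])
}

# Rebuild the 4-way table from the counts.
rebuilt <- xtabs(Freq ~ Class + Sex + Age + Survived, data = back)
rebuilt[, , "Child", "No"]
```

Setting the factor levels explicitly is what keeps `Male` before `Female` and `Child` before `Adult`, as in the original table.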
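Answer 1 (score: 0)
Perhaps I have misunderstood your question, but it seems you want to know how to specify the order in which things are listed in a multidimensional table.
If that is the case, try this (rows first, then columns, then the third dimension (Age), then the fourth dimension (Survived)):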
data2 <- table(data[c("Class", "Sex", "Age", "Survived")])
## table(data[c(1, 3, 2, 4)])
data2
# , , Age = Adult, Survived = No
#
# Sex
# Class Female Male
# Crew 3 670
# First 4 118
# Second 13 154
# Third 89 387
#
# <<SNIP>>
#
#
# , , Age = Child, Survived = Yes
#
# Sex
# Class Female Male
# Crew 0 0
# First 1 5
# Second 13 11
# Third 14 13
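The order of the dimensions comes from the column order passed to `table`, while the order of the rows and columns within each dimension comes from the factor levels. A minimal sketch, using a hypothetical toy data frame (`toy`) in place of the downloaded file:

```r
# Hypothetical toy data standing in for the downloaded file.
toy <- data.frame(
  Class = c("Third", "Crew", "First", "Third"),
  Sex   = c("Male", "Female", "Male", "Female"),
  stringsAsFactors = FALSE
)

# Fix the level order explicitly instead of relying on alphabetical sorting.
toy$Class <- factor(toy$Class, levels = c("First", "Second", "Third", "Crew"))
toy$Sex   <- factor(toy$Sex, levels = c("Male", "Female"))

# Rows follow the Class levels, columns follow the Sex levels;
# levels with no observations ("Second") still appear, with zero counts.
tab <- table(toy[c("Class", "Sex")])
tab
```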
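As for the second part of your question, it sounds like you are asking "how do I recreate a flat/rectangular data.frame from tabulated data?" For this particular example, you can try something like: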
X <- data.frame(data2)
X <- X[rep(rownames(X), X$Freq), -length(X)]
Compare the summary of the recreated data with the summary of the original data:
summary(X)
# Class Sex Age Survived
# Crew :885 Female: 470 Adult:2092 No :1490
# First :325 Male :1731 Child: 109 Yes: 711
# Second:285
# Third :706
summary(data)
# Class Age Sex Survived
# Crew :885 Adult:2092 Female: 470 No :1490
# First :325 Child: 109 Male :1731 Yes: 711
# Second:285
# Third :706
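The same expansion can be checked without downloading anything, by starting from the built-in `Titanic` table. This is only a sketch; `counts` and `cases` are illustrative names, and the check at the end tabulates the expanded rows again to confirm they reproduce the original counts.

```r
# One row per cell of the table, with the count in Freq.
counts <- as.data.frame(Titanic)

# Repeat each row Freq times and drop the Freq column:
# this gives one row per person.
cases <- counts[rep(seq_len(nrow(counts)), counts$Freq), names(counts) != "Freq"]
rownames(cases) <- NULL

nrow(cases)   # 2201

# Tabulating the person-level rows gives back the original counts.
all(as.vector(table(cases)) == as.vector(Titanic))
```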
Then again, I am shooting in the dark, because your question is not very clear. Sorry!