我的admission_table
包含ADMIT
,GRE
,GPA
和RANK
。
> head(admission_table)
ADMIT GRE GPA RANK
1 0 380 3.61 3
2 1 660 3.67 3
3 1 800 4.00 1
4 1 640 3.19 4
5 0 520 2.93 4
6 1 760 3.00 2
我正在尝试将此表的摘要转换为data.frame
。我想将ADMIT
,GRE
,GPA
和RANK
作为我的列标题。
> summary(admission_table)
ADMIT GRE GPA RANK
Min. :0.0000 Min. :220.0 Min. :2.260 Min. :1.000
1st Qu.:0.0000 1st Qu.:520.0 1st Qu.:3.130 1st Qu.:2.000
Median :0.0000 Median :580.0 Median :3.395 Median :2.000
Mean :0.3175 Mean :587.7 Mean :3.390 Mean :2.485
3rd Qu.:1.0000 3rd Qu.:660.0 3rd Qu.:3.670 3rd Qu.:3.000
Max. :1.0000 Max. :800.0 Max. :4.000 Max. :4.000
> as.data.frame(summary(admission_table))
Var1 Var2 Freq
1 ADMIT Min. :0.0000
2 ADMIT 1st Qu.:0.0000
3 ADMIT Median :0.0000
4 ADMIT Mean :0.3175
5 ADMIT 3rd Qu.:1.0000
6 ADMIT Max. :1.0000
7 GRE Min. :220.0
8 GRE 1st Qu.:520.0
9 GRE Median :580.0
10 GRE Mean :587.7
11 GRE 3rd Qu.:660.0
12 GRE Max. :800.0
13 GPA Min. :2.260
14 GPA 1st Qu.:3.130
15 GPA Median :3.395
16 GPA Mean :3.390
17 GPA 3rd Qu.:3.670
18 GPA Max. :4.000
19 RANK Min. :1.000
20 RANK 1st Qu.:2.000
21 RANK Median :2.000
22 RANK Mean :2.485
23 RANK 3rd Qu.:3.000
24 RANK Max. :4.000
当我试图转换为data.frame
时,这是我得到的唯一结果。我希望数据框具有与摘要表一样的确切输出,因为在此之后我想使用以下代码行将其插入Oracle数据库:
dbWriteTable(connection,name="SUM_ADMISSION_TABLE",value=as.data.frame(summary(admission_table)),row.names = FALSE, overwrite = TRUE ,append = FALSE)
有没有办法这样做?
答案 0 :(得分:36)
我想你可以考虑unclass
:
data.frame(unclass(summary(mydf)), check.names = FALSE, stringsAsFactors = FALSE)
# ADMIT GRE GPA RANK
# 1 Min. :0.0000 Min. :380.0 Min. :2.930 Min. :1.000
# 2 1st Qu.:0.2500 1st Qu.:550.0 1st Qu.:3.047 1st Qu.:2.250
# 3 Median :1.0000 Median :650.0 Median :3.400 Median :3.000
# 4 Mean :0.6667 Mean :626.7 Mean :3.400 Mean :2.833
# 5 3rd Qu.:1.0000 3rd Qu.:735.0 3rd Qu.:3.655 3rd Qu.:3.750
# 6 Max. :1.0000 Max. :800.0 Max. :4.000 Max. :4.000
str(.Last.value)
# 'data.frame': 6 obs. of 4 variables:
# $ ADMIT: chr "Min. :0.0000 " "1st Qu.:0.2500 " "Median :1.0000 " "Mean :0.6667 " ...
# $ GRE : chr "Min. :380.0 " "1st Qu.:550.0 " "Median :650.0 " "Mean :626.7 " ...
# $ GPA : chr "Min. :2.930 " "1st Qu.:3.047 " "Median :3.400 " "Mean :3.400 " ...
# $ RANK: chr "Min. :1.000 " "1st Qu.:2.250 " "Median :3.000 " "Mean :2.833 " ...
请注意,名称和值都存在大量过多的空格。
但是,执行以下操作可能就足够了:
do.call(cbind, lapply(mydf, summary))
# ADMIT GRE GPA RANK
# Min. 0.0000 380.0 2.930 1.000
# 1st Qu. 0.2500 550.0 3.048 2.250
# Median 1.0000 650.0 3.400 3.000
# Mean 0.6667 626.7 3.400 2.833
# 3rd Qu. 1.0000 735.0 3.655 3.750
# Max. 1.0000 800.0 4.000 4.000
答案 1 :(得分:0)
输出数据帧的另一种方法是:
as.data.frame(apply(mydf, 2, summary))
仅在选择数字列时有效。
如果存在带有NA的列,则可能会抛出Error in dimnames(x)
。值得首先检查一下as.data.frame()
函数的作用。