Here is the setup:
mydf<-structure(list(weight = c(1.34288799762726, 1.18884372711182,
1.15979790687561, 1.34288799762726, 1.08285343647003, 1.07932889461517,
1.28913342952728, 1.211909532547, 1.03438591957092, 1.22719633579254
), RespID = c(3182, 3183, 3184, 3185, 3186, 3187, 3188, 3189,
3190, 3191), b1 = structure(c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L,
2L, 2L), .Label = c("Mand", "Kvinde"), class = "factor")), .Names = c("weight",
"RespID", "b1"), row.names = c(NA, 10L), class = "data.frame")
Now, a call to summary produces the following output:
summary(mydf)
# weight RespID b1
# Min. :1.034 Min. :3182 Mand :4
# 1st Qu.:1.102 1st Qu.:3184 Kvinde:6
# Median :1.200 Median :3186
# Mean :1.196 Mean :3186
# 3rd Qu.:1.274 3rd Qu.:3189
# Max. :1.343 Max. :3191
Meanwhile, apply gives a different result:
apply(mydf, 2, class)
# weight RespID b1
#"character" "character" "character"
So according to apply, every column in my data.frame is of class "character", which I know is wrong. summary, on the other hand, gets it right.
Answer 0 (score: 5)
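This happens because apply expects a matrix, so it first coerces the data frame with as.matrix. A matrix can hold only one type, and with a factor column present that type is character: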
> as.matrix(mydf)
weight RespID b1
1 "1.342888" "3182" "Mand"
2 "1.188844" "3183" "Kvinde"
3 "1.159798" "3184" "Mand"
4 "1.342888" "3185" "Mand"
5 "1.082853" "3186" "Kvinde"
6 "1.079329" "3187" "Kvinde"
7 "1.289133" "3188" "Mand"
8 "1.211910" "3189" "Kvinde"
9 "1.034386" "3190" "Kvinde"
10 "1.227196" "3191" "Kvinde"
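The coercion can be checked directly. The following is a minimal sketch on a hypothetical toy data frame (the name toy and its columns are illustrative and not from the question); it compares the matrix type with and without the factor column:

```r
# Hypothetical toy data frame, not the one from the question.
toy <- data.frame(w = c(1.5, 2.5), id = c(1, 2), g = factor(c("a", "b")))

# With a factor column present, as.matrix() falls back to character.
mixed_type <- typeof(as.matrix(toy))

# With only numeric columns, the matrix stays numeric.
numeric_type <- typeof(as.matrix(toy[, c("w", "id")]))

# apply() reports the class of the coerced matrix columns,
# so its answer depends on the mix of column types.
mixed_classes   <- apply(toy, 2, class)
numeric_classes <- apply(toy[, c("w", "id")], 2, class)

print(mixed_classes)
print(numeric_classes)
```

So apply would report "numeric" for a data frame that has only numeric columns; it is the factor column that pushes everything to character.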
What you want to use is sapply, which works on the data frame column by column without converting it to a matrix:
> sapply(mydf,class)
weight RespID b1
"numeric" "numeric" "factor"
Answer 1 (score: 1)
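apply coerces its input to a matrix, and a matrix cannot contain a factor, so factor columns end up as character, just like character variables would: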
df <- data.frame( x = as.factor(letters[1:3]) , y = as.factor(LETTERS[1:3]) )
str(df)
'data.frame': 3 obs. of 2 variables:
$ x: Factor w/ 3 levels "a","b","c": 1 2 3
$ y: Factor w/ 3 levels "A","B","C": 1 2 3
apply(df,2,class)
x y
"character" "character"
sapply(df,class)
x y
"factor" "factor"
Answer 2 (score: 0)
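I found the reason.
It seems that apply coerces the data.frame to a matrix, which causes every column to be stored as character. apply then reports the class after that conversion. The trick is to realize that a data.frame is a glorified list of columns, so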
lapply(mydf, class)
# $weight
# [1] "numeric"
#
# $RespID
# [1] "numeric"
#
# $b1
# [1] "factor"
gives the correct answer.
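The "glorified list" point can be verified with a short sketch. The data frame toy below is a hypothetical stand-in, not the data from the question:

```r
# Hypothetical toy data frame, not the one from the question.
toy <- data.frame(w = c(1.5, 2.5), g = factor(c("a", "b")))

# A data frame is a list whose elements are its columns.
is_list <- is.list(toy)
n_cols  <- length(toy)   # list length equals the number of columns

# lapply() visits each column as-is, so each one keeps its own class.
col_classes <- lapply(toy, class)

print(is_list)
print(n_cols)
print(col_classes)
```

Because lapply iterates over the list elements without any matrix conversion, the numeric column is reported as "numeric" and the factor column as "factor".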