I created two subsets (data.frames) as follows:
sms_raw_train <- sms_raw[1:4169, ]
sms_raw_test <- sms_raw[4170:5559, ]
The first, sms_raw_train, looks like this in View():
type text
1 ham Hope you are having a good week. Just checking in
2 ham K..give back my thanks.
3 ham Am also doing in cbe only. But have to pay.
The second, sms_raw_test, looks like this:
row.names type text
1 4170 ham I'm coming home 4 dinner.
2 4171 ham Come by our room at some point so we can iron out the plan for this weekend
3 4172 ham Its sunny in california. The weather's just cool
As you can see, a row.names column has been added. However, if I run this:
> str(sms_raw_test[1:3, ])
'data.frame': 3 obs. of 2 variables:
$ type: Factor w/ 2 levels "ham","spam": 1 1 1
$ text: chr "I'm coming home 4 dinner." "Come by our room at some point so we can iron out the plan for this weekend" "Its sunny in california. The weather's just cool"
the column does not actually exist in the data.
What is the purpose of this column, and why is it added in View(sms_raw_test)?
Answer 0: (score: 4)
View adds that column for display purposes only. As you have seen, it does not actually exist in the subset.
From help(View):
If there are row names on the data frame that are not 1:nrow, they are displayed in a separate first column called row.names.
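The row names of sms_raw_test are (presumably) 4170:5559, because subsetting keeps the row names of the original data frame.
The row names of sms_raw_train are 1:nrow, so the behavior does not show up there.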
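The explanation above can be checked without opening the viewer. The following is a minimal sketch, not the asker's actual data: sms_demo, demo_train, demo_test, and the helper has_default_rownames are hypothetical names introduced only for illustration, and the six-row data frame stands in for sms_raw.

```r
# Hypothetical stand-in for sms_raw: six rows instead of 5559.
sms_demo <- data.frame(
  type = factor(c("ham", "spam", "ham", "ham", "spam", "ham")),
  text = paste("message", 1:6),
  stringsAsFactors = FALSE
)

demo_train <- sms_demo[1:4, ]
demo_test  <- sms_demo[5:6, ]

# Subsetting keeps the original row names as an attribute, not as a column.
names(demo_test)      # "type" "text"
rownames(demo_train)  # "1" "2" "3" "4"
rownames(demo_test)   # "5" "6"

# The condition described in help(View): are the row names exactly 1:nrow?
# (has_default_rownames is an illustrative helper, not part of base R.)
has_default_rownames <- function(df) {
  identical(rownames(df), as.character(seq_len(nrow(df))))
}
has_default_rownames(demo_train)  # TRUE
has_default_rownames(demo_test)   # FALSE

# Assigning NULL resets the row names to the 1:nrow default.
demo_test_reset <- demo_test
rownames(demo_test_reset) <- NULL
rownames(demo_test_reset)         # "1" "2"
```

If the extra column in the viewer is unwanted, resetting the row names with rownames(x) <- NULL before calling View() should, going by the help text quoted above, make it disappear, since the row names are then 1:nrow again.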