Using View with a data.frame subset adds a row.names column

Time: 2014-05-09 14:06:05

Tags: r dataframe

I created two subsets (data.frames) as follows:

sms_raw_train <- sms_raw[1:4169, ]
sms_raw_test <- sms_raw[4170:5559, ]

The first, sms_raw_train, looks like this in View:

    type    text
1   ham Hope you are having a good week. Just checking in
2   ham K..give back my thanks.
3   ham Am also doing in cbe only. But have to pay.

The second, sms_raw_test, looks like this in View:

    row.names   type    text
1   4170    ham I'm coming home 4 dinner.
2   4171    ham Come by our room at some point so we can iron out the plan for this weekend
3   4172    ham Its sunny in california. The weather's just cool

As you can see, a row.names column has been added. However, if I do this:

> str(sms_raw_test[1:3, ])
'data.frame':   3 obs. of  2 variables:
 $ type: Factor w/ 2 levels "ham","spam": 1 1 1
 $ text: chr  "I'm coming home 4 dinner." "Come by our room at some point so we can iron out the plan for this weekend" "Its sunny in california. The weather's just cool"

the column doesn't actually exist in the data.

What is the purpose of this column, and why is it added in View(sms_raw_test)?

1 answer:

Answer 0: (score: 4)

View adds that column for display purposes only. As you noticed, it doesn't actually exist in the subset.

From help(View):

If there are row names on the data frame that are not 1:nrow, they are displayed in a separate first column called row.names.

The row names of sms_raw_test are (presumably) 4170:5559, because subsetting keeps the row names of the original data frame.

The row names of sms_raw_train are 1:nrow, so the behavior doesn't show up there.
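
To check this without opening View, the row names can be inspected directly. The following is a minimal sketch, not the asker's actual data: df is a hypothetical 10-row stand-in for sms_raw, and train, test, and original_row_names are names introduced only for illustration. It also shows one way to get rid of the extra column in View, by resetting the row names to 1:nrow.

```r
# Hypothetical stand-in for sms_raw (10 rows instead of 5559)
df <- data.frame(type = rep(c("ham", "spam"), 5),
                 text = paste("message", 1:10),
                 stringsAsFactors = FALSE)

train <- df[1:7, ]
test  <- df[8:10, ]

rownames(train)  # "1" to "7": equal to 1:nrow, so View shows no extra column
rownames(test)   # "8" "9" "10": not 1:nrow, so View shows them as row.names
names(test)      # "type" "text": there is no row.names column in the data

# Keep a copy of the inherited row names, then reset them to 1:nrow
original_row_names <- rownames(test)
rownames(test) <- NULL
rownames(test)   # "1" "2" "3"
```

After the reset, View(test) should no longer show the row.names column, since the row names are 1:nrow again. Resetting is only worthwhile if the original row positions are not needed later.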