Strange behaviour when combining the dplyr::tbl_df wrapper with the gsub function

Time: 2016-01-03 19:07:29

Tags: r dplyr

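I have some code that loops over all columns and strips specific strings using a gsub() statement I defined (see below). Oddly, the tbl_df class from the dplyr package makes gsub behave strangely.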

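Unless I first explicitly pass my column through as.data.frame, the gsub statement does not run correctly and returns a completely wrong value (see below). What causes this behaviour?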

I worked around the problem by first converting the tbl_df object with as.data.frame and then using the result inside gsub.

My code (run through the debugger, hence the Browse[1]> prompts):

Browse[1]> x[,1]
Source: local data frame [70 x 1]

        Symbol
         (chr)
1       AAK.ST
2       ABB.ST
3      ALFA.ST
4  ALIV-SDB.ST
5       AOI.ST
6    ATCO-A.ST
7      AXFO.ST
8      AXIS.ST
9       AZN.ST
10   BALD-B.ST
..         ...
Browse[1]> gsub(pattern = '[-.](.*)$', replacement = '', x = x[,1])
[1] "c(\"AAK" # Wrong behaviour
Browse[1]> gsub(pattern = '[-.](.*)$', replacement = '', x = as.data.frame(x[,1]))
[1] "c(\"AAK" # Still wrong behaviour
Browse[1]> y  <- as.data.frame(x[,1])
Browse[1]> gsub(pattern = '[-.](.*)$', replacement = '', x = y[,1]) # Now it's right(!)
 [1] "AAK"   "ABB"   "ALFA"  "ALIV"  "AOI"   "ATCO"  "AXFO"  "AXIS"  "AZN"   "BALD"  "BETS"  "BILL"  "BOL"   "CAST"  "COMH"  "EKTA"  "ELUX"  "ENQ"   "ERIC"  "FABG"  "GETI"  "HEXA"  "HM"    "HOLM"  "HPOL"  "HUFV" 
[27] "HUSQ"  "ICA"   "IJ"    "INDT"  "INDU"  "INVE"  "JM"    "KINV"  "LATO"  "LIFCO" "LOOM"  "LUMI"  "LUND"  "LUPE"  "MEDA"  "MELK"  "MIC"   "MTG"   "NCC"   "NDA"   "NIBE"  "NOBI"  "ORI"   "PEAB"  "RATO"  "SAAB" 
[53] "SAND"  "SCA"   "SEB"   "SECU"  "SHB"   "SKA"   "SKF"   "SOBI"  "SSAB"  "STE"   "SWED"  "SWMA"  "TEL2"  "TIEN"  "TLSN"  "TREL"  "VOLV"  "WALL"  
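
The transcript above can be reproduced without dplyr, which may help narrow down the cause. The following is a minimal sketch, not a confirmed diagnosis. It assumes the difference comes from single-column subsetting: `[` on a tbl_df returns a one-column table rather than a vector, and a plain data.frame does the same when `drop = FALSE` is passed. gsub() coerces a non-character `x` with as.character(), which deparses the whole column into a single string. The sample symbols and the variable names `one_col`, `wrong` and `right` are made up for illustration.

```r
# Base-R stand-in for the tbl_df: a one-column data frame (hypothetical sample data).
x <- data.frame(Symbol = c("AAK.ST", "ABB.ST", "ALIV-SDB.ST"),
                stringsAsFactors = FALSE)

pattern <- "[-.](.*)$"

# Assumption: x[, 1] on a tbl_df behaves like drop = FALSE on a data.frame,
# i.e. the result is still a one-column table, not a character vector.
one_col <- x[, 1, drop = FALSE]

# gsub() sees a list-like object and coerces it with as.character(),
# so the whole column becomes one deparsed string before the regex is applied.
wrong <- gsub(pattern, "", one_col)

# Extracting the column itself gives a character vector, whatever the class of x.
right <- gsub(pattern, "", x[[1]])

print(wrong)
print(right)
```

If this assumption holds, `x[[1]]` or `x$Symbol` should work on the tbl_df directly, without the as.data.frame detour.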

0 answers:

No answers