The str() output of my data frame looks like this:
> str(categoricalVar)
'data.frame': 56660 obs. of 10 variables:
$ FavouriteSource : Factor w/ 3 levels "App","LF","None": 1 1 3 3 3 1 3 3 3 3 ...
$ FavouriteSource30 : Factor w/ 3 levels "App","LF","None": 1 1 3 3 3 1 3 3 3 3 ...
$ FavouriteSource90 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
$ FavouriteSource180 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
$ FavouriteSource360 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
$ Favorite_GameBin : Factor w/ 594 levels " Team Umizoomi: Street Fair Fix -Up (Explorer)",..: 262 163 388 378 378 220 253 378 378 378 ...
$ Favorite_GameBin30 : Factor w/ 309 levels "1-2-3 Dora!",..: 191 191 191 191 191 191 191 191 191 191 ...
$ Favorite_GameBin90 : Factor w/ 332 levels "1-2-3 Dora!",..: 206 206 206 206 206 206 206 206 206 206 ...
$ Favorite_GameBin180: Factor w/ 363 levels "1-2-3 Dora!",..: 226 226 226 226 226 226 226 226 226 226 ...
$ Favorite_GameBin360: Factor w/ 449 levels " Team Umizoomi: Street Fair Fix -Up (Explorer)",..: 283 283 283 283 283 283 283 283 283 283 ...
>
I am trying to convert these columns to dummy variables, but the call throws the following error:
> categoricalVar_dummy <- dummy(categoricalVar)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
What am I doing wrong?
Answer 0 (score: 0)
Here are two solutions using the dummies package. I can't tell from your question whether your dummy() call comes from the dummies package. In any case, first some data,
categoricalVar <- data.frame(
  FavouriteSource = c('bar', 'foo', 'foo', 'foobar', 'foo', 'foo'),
  FavouriteSource30 = c('A', 'C', 'C', 'B', 'B', 'A'))
categoricalVar
#> FavouriteSource FavouriteSource30
#> 1 bar A
#> 2 foo C
#> 3 foo C
#> 4 foobar B
#> 5 foo B
#> 6 foo A
Then load the dummies library,
# install.packages(c("dummies"), dependencies = TRUE)
library(dummies)
Here is the dummy.data.frame() method for getting the dummy variables from the whole data frame at once,
dummy.data.frame(categoricalVar)
#> FavouriteSourcebar FavouriteSourcefoo FavouriteSourcefoobar FavouriteSource30A
#> 1 1 0 0 1
#> 2 0 1 0 0
#> 3 0 1 0 0
#> 4 0 0 1 0
#> 5 0 1 0 0
#> 6 0 1 0 1
#> FavouriteSource30B FavouriteSource30C
#> 1 0 0
#> 2 0 1
#> 3 0 1
#> 4 1 0
#> 5 1 0
#> 6 0 0
Or, as Sathish suggests in the comment above, apply dummy() to each column separately,
lapply(categoricalVar, dummy)
#> $FavouriteSource
#> categoricalVarbar categoricalVarfoo categoricalVarfoobar
#> [1,] 1 0 0
#> [2,] 0 1 0
#> [3,] 0 1 0
#> [4,] 0 0 1
#> [5,] 0 1 0
#> [6,] 0 1 0
#>
#> $FavouriteSource30
#> categoricalVarA categoricalVarB categoricalVarC
#> [1,] 1 0 0
#> [2,] 0 0 1
#> [3,] 0 0 1
#> [4,] 0 1 0
#> [5,] 0 1 0
#> [6,] 1 0 0
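If the dummies package is not available, roughly the same one-column-per-level result can be sketched in base R with model.matrix(). This is an alternative technique, not what the dummies package does internally. The names dummies_list and categoricalVar_dummy are chosen here for illustration, and the columns are assumed to be factors (as in the question), so stringsAsFactors = TRUE is set explicitly.

```r
# Same toy data as above, with the columns stored as factors (assumption)
categoricalVar <- data.frame(
  FavouriteSource   = c('bar', 'foo', 'foo', 'foobar', 'foo', 'foo'),
  FavouriteSource30 = c('A', 'C', 'C', 'B', 'B', 'A'),
  stringsAsFactors  = TRUE
)

# Build one indicator matrix per column.
# The formula "~ x - 1" drops the intercept, so every level gets its own column.
dummies_list <- lapply(names(categoricalVar), function(nm) {
  m <- model.matrix(~ x - 1, data = data.frame(x = categoricalVar[[nm]]))
  # Name the columns <variable><level>, similar to the output shown above
  colnames(m) <- paste0(nm, levels(categoricalVar[[nm]]))
  m
})

# Bind the per-column matrices side by side into a single data frame
categoricalVar_dummy <- as.data.frame(do.call(cbind, dummies_list))
categoricalVar_dummy
```

Each row of categoricalVar_dummy contains exactly one 1 per original column. Note that rows with NA values are dropped by model.matrix() under the default na.action, so this sketch assumes the factors have no missing values.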