Question

library(dummies)
combi <- dummy.data.frame(combi, names = c('Outlet_Size', 'Outlet_Location_Type', 'Outlet_Type', 'Item_Type_New'), sep = '_')

Running the code above in RStudio returns the following error:

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

The dataset looks like this:

> head(combi)
# A tibble: 6 x 15
  Item_Identifier Item_Count Outlet_Identifier Outlet_Count Item_Weight Item_Fat_Content Item_Visibility Item_Type Item_MRP Outlet_Establishme~
  <fct>                <int> <fct>                    <int>       <dbl>            <dbl>           <dbl> <fct>        <dbl>               <int>
1 DRA12                    9 OUT010                     925        11.6               0.          0.0685 Soft Dri~     143.                1998
2 DRA12                    9 OUT013                    1553        11.6               0.          0.0409 Soft Dri~     142.                1987
3 DRA12                    9 OUT017                    1543        11.6               0.          0.0412 Soft Dri~     140.                2007
4 DRA12                    9 OUT018                    1546        11.6               0.          0.0411 Soft Dri~     142.                2009
5 DRA12                    9 OUT027                    1559        12.6               0.          0.0407 Soft Dri~     140.                1985
6 DRA12                    9 OUT035                    1550        11.6               0.          0.0540 Soft Dri~     142.                2004
# ... with 5 more variables: Outlet_Size <fct>, Outlet_Location_Type <fct>, Outlet_Type <fct>, Item_Outlet_Sales <dbl>, Item_Type_New <chr>

Structure of combi:

> str(combi)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   14204 obs. of  15 variables:
 $ Item_Identifier          : Factor w/ 1559 levels "DRA12","DRA24",..: 1 1 1 1 1 1 1 1 1 2 ...
 $ Item_Count               : int  9 9 9 9 9 9 9 9 9 10 ...
 $ Outlet_Identifier        : Factor w/ 10 levels "OUT010","OUT013",..: 1 2 3 4 6 7 8 9 10 1 ...
 $ Outlet_Count             : int  925 1553 1543 1546 1559 1550 1548 1550 1550 925 ...
 $ Item_Weight              : num  11.6 11.6 11.6 11.6 12.6 ...
 $ Item_Fat_Content         : num  0 0 0 0 0 0 0 0 0 1 ...
 $ Item_Visibility          : num  0.0685 0.0409 0.0412 0.0411 0.0407 ...
 $ Item_Type                : Factor w/ 16 levels "Baking Goods",..: 15 15 15 15 15 15 15 15 15 15 ...
 $ Item_MRP                 : num  143 142 140 142 140 ...
 $ Outlet_Establishment_Year: int  1998 1987 2007 2009 1985 2004 2002 1997 1999 1998 ...
 $ Outlet_Size              : Factor w/ 4 levels "Other","High",..: 1 2 1 3 3 4 1 4 3 1 ...
 $ Outlet_Location_Type     : Factor w/ 3 levels "Tier 1","Tier 2",..: 3 3 2 3 3 2 2 1 1 3 ...
 $ Outlet_Type              : Factor w/ 4 levels "Grocery Store",..: 1 2 2 3 4 2 2 2 2 1 ...
 $ Item_Outlet_Sales        : num  284 2553 2553 851 1 ...
 $ Item_Type_New            : chr  "Drink" "Drink" "Drink" "Drink" ...

Answer 1

Converting the tibble to a data.frame will help.

library(dummies)
library(dplyr)

combi %>%
  data.frame() %>%
  dummy.data.frame(names = c('Outlet_Size', 'Outlet_Location_Type', 'Outlet_Type', 'Item_Type_New'),  sep='_')

The error can be reproduced with the following code:

library(dummies)
library(dplyr)
library(tibble)

iris %>%
  as_tibble() %>%
  #data.frame() %>%
  dummy.data.frame(names = c('Species'),  sep='_')

#Error in sort.list(y) : 'x' must be atomic for 'sort.list'
#Have you called 'sort' on a list?

If you uncomment the data.frame() step, the error goes away.
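
As for why the conversion helps, a likely explanation (an assumption on my part, since the dummies source is not shown in this thread) is that dummy.data.frame pulls out each column with data[, name] and expects a plain vector back. A data.frame drops a single column to a vector, but a tibble keeps it as a one-column tibble, which is a list and therefore not atomic. The sketch below only illustrates that subsetting difference on iris; it does not call dummies at all.

```r
library(tibble)

df <- iris               # plain data.frame
tb <- as_tibble(iris)    # same data stored as a tibble

# On a data.frame, selecting one column with `[` drops to a vector.
class(df[, "Species"])       # "factor"
is.atomic(df[, "Species"])   # TRUE

# On a tibble, the same call keeps a one-column tibble (a list).
class(tb[, "Species"])       # "tbl_df" "tbl" "data.frame"
is.atomic(tb[, "Species"])   # FALSE

# Converting back to a data.frame restores the base behaviour.
is.atomic(as.data.frame(tb)[, "Species"])   # TRUE
```

If that is indeed the cause, any function that runs factor() or sort() on the result of data[, name] would be expected to fail on a tibble in the same way.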
