Plots for different groups with conditions

Time: 2018-06-06 03:15:24

Tags: r

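I am working on a data science project at work, and my goal is to produce summaries of a very large dataset.

For example, I want to know how many customers ordered the House Brand once, twice, or more than twice. How many ordered both House Brand and non-House Brand products? How many ordered only non-House Brand products?

How can I do this?

Sample dataset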

PRODUCT_SUB_LINE_DESCR    MAJOR_CATEGORY_DESCR       CUST_REGION_DESCR
SUNDRY                         SMALL EQUIP           NORTH EAST REGION
SUNDRY                         SMALL EQUIP           SOUTH EAST REGION
SUNDRY                         SMALL EQUIP           SOUTH EAST REGION
SUNDRY                         SMALL EQUIP           NORTH EAST REGION
SUNDRY                         PREVENTIVE            SOUTH CENTRAL REGION
SUNDRY                         PREVENTIVE            SOUTH EAST REGION
SUNDRY                         PREVENTIVE            SOUTH EAST REGION
SUNDRY                         SMALL EQUIP           NORTH CENTRAL REGION
SUNDRY                         SMALL EQUIP           MOUNTAIN WEST REGION
SUNDRY                         SMALL EQUIP           MOUNTAIN WEST REGION
SUNDRY                         COMPOSITE             NORTH CENTRAL REGION
SUNDRY                         COMPOSITE             NORTH CENTRAL REGION
SUNDRY                         COMPOSITE             OHIO VALLEY REGION
SUNDRY                         COMPOSITE             NORTH EAST REGION

Sales   QtySold      MFGCOST    MarginDollars   new_ProductName
209.97  3             134.55    72.72            no
-76.15  -1            -44.85    -30.4            no
275.6   2             162.5     109.84           no
138.7   1             81.25     55.82            no
226     2             136       87.28            no
115     1             68        45.64            no
210.7   2             136       71.98            no
29      1             18.85     9.77             no
29      1             18.85     9.77             no
46.32   2             37.7      7.86             no
159.86  1             132.4     24.81            no
441.3   2             264.8     171.2            no
209.62  1             132.4     74.57            no
209.62  1             132.4     74.57            no

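This is not the original dataset. I added a new column to the original dataset for my decision tree analysis, but for now I just want to make some plots from it. Products with the sub line "PRIVATE LABEL" are considered House Brand.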

 # Flag House Brand rows: "yes" when the sub line is PRIVATE LABEL
 new_ProductName = ifelse(PRODUCT_SUB_LINE_DESCR == "PRIVATE LABEL", "yes", "no")
 data = data.frame(new_Dataset, new_ProductName)
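
One possible way to get the counts asked about above is sketched below in base R. The sample data shows no customer identifier, so the `CUST_ID` column here is hypothetical, the toy rows are made up for illustration, and each row is assumed to be one order line (so "once" means one row, not `QtySold == 1`).

```r
# Toy data: CUST_ID is a hypothetical column, not part of the sample dataset
toy <- data.frame(
  CUST_ID = c("A", "A", "A", "B", "B", "C", "D", "D", "D"),
  new_ProductName = c("yes", "yes", "yes", "yes", "no", "no", "yes", "yes", "no")
)

# Rows = customers, columns = "no" / "yes" (House Brand flag)
orders <- table(toy$CUST_ID, toy$new_ProductName)
house_n <- orders[, "yes"]  # House Brand order lines per customer
other_n <- orders[, "no"]   # non-House Brand order lines per customer

# Bucket customers by how often they ordered the House Brand
house_freq <- cut(
  house_n,
  breaks = c(-Inf, 0, 1, 2, Inf),
  labels = c("never", "once", "twice", "more than twice")
)

# Classify each customer by the mix of brands ordered
brand_mix <- ifelse(
  house_n > 0 & other_n > 0, "both",
  ifelse(house_n > 0, "house brand only", "non-house brand only")
)

print(table(house_freq))
print(table(brand_mix))
```

Both results are one-dimensional tables, so they can be passed directly to `barplot()`.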

Problem:

> group_by_region = data %>% group_by(PRODUCT_SUB_LINE_DESCR, 
      CUST_REGION_DESCR) %>% summarise(count=n(), sales=sum(Sales))

> mytable = table(group_by_region)
> barplot(mytable)
Error in barplot.default(mytable) : 'height' must be a vector or a matrix
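
A likely cause of this error: calling `table()` on a data frame cross-tabulates every column, so with four columns the result is a four-dimensional array, which `barplot()` rejects. A minimal sketch of one alternative, assuming the goal is total sales per product sub line and region, uses `xtabs()` to build a two-dimensional table. The toy values below are made up for illustration.

```r
# Toy data shaped like the sample above (values are illustrative only)
toy <- data.frame(
  PRODUCT_SUB_LINE_DESCR = c("SUNDRY", "SUNDRY", "PRIVATE LABEL",
                             "PRIVATE LABEL", "SUNDRY"),
  CUST_REGION_DESCR = c("NORTH EAST REGION", "SOUTH EAST REGION",
                        "NORTH EAST REGION", "SOUTH EAST REGION",
                        "NORTH EAST REGION"),
  Sales = c(209.97, 275.60, 100.00, 50.00, 138.70)
)

# xtabs() sums Sales within each (sub line, region) cell and returns
# a two-dimensional table, which barplot() accepts like a matrix
sales_by_region <- xtabs(
  Sales ~ PRODUCT_SUB_LINE_DESCR + CUST_REGION_DESCR,
  data = toy
)

# One group of bars per region, one bar per product sub line
barplot(sales_by_region, beside = TRUE, legend.text = TRUE, ylab = "Sales")
```

Replacing `Sales ~` with `~` in the formula would count rows per cell instead of summing sales.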

0 answers:

No answers