Probability of an event for a subsample based on data from the sample and subsamples, and better selection of a representative sample

Time: 2018-06-28 06:16:01

Tags: statistics sample sampling resampling statistical-sampling

I am facing the following problem. I have the following sample (fake data):

GROUP:
Year | Guys paid invoice | Guys did not pay invoice
2012 | 12.006 | 581
2013 | 13.087 | 610
2014 | 12.388 | 220
2015 |  9.998 | 350
2016 | 20.310 | 907
2017 | 15.032 | 503

SUBGROUP 1:
Year | Guys paid invoice | Guys did not pay invoice
2012 | 1.006 | 78
2013 |   938 | 21
2014 | 2.388 | 32
2015 | 1.998 | 45
2016 | 3.310 | 21
2017 | 5.032 | 50

SUBGROUP 2:
Year | Guys paid invoice | Guys did not pay invoice
2012 | 3.006 | 32
2013 |   438 | 12
2014 | 1.300 | 43
2015 |   801 | 21
2016 | 1.002 | 12
2017 | 3.100 | 32

SUBGROUP 3:
Year | Guys paid invoice | Guys did not pay invoice
2012 | 0 | 0
2013 | 0 | 0
2014 | 0 | 0
2015 | 0 | 1
2016 | 0 | 0
2017 | 0 | 0

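Based on this data, can I say that a guy similar to those in GROUP will pay his next invoice with probability SUM(Guys paid invoice) / (SUM(Guys paid invoice) + SUM(Guys did not pay invoice))?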
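The ratio in the question can be sketched in Python as follows. This is only an illustration of the arithmetic, not a claim that the pooled ratio is the right estimator. It assumes that the period in values such as "12.006" is a thousands separator (so 12,006 invoices); the names `group`, `pooled_rate` and `yearly_rates` are made up for this example.

```python
# GROUP data as (paid, not paid) per year.
# Assumption: "12.006" in the table means 12,006 (period as thousands separator).
group = {
    2012: (12006, 581),
    2013: (13087, 610),
    2014: (12388, 220),
    2015: (9998, 350),
    2016: (20310, 907),
    2017: (15032, 503),
}

def pooled_rate(rows):
    """SUM(paid) / (SUM(paid) + SUM(not paid)) over all years."""
    paid = sum(p for p, _ in rows.values())
    unpaid = sum(u for _, u in rows.values())
    return paid / (paid + unpaid)

def yearly_rates(rows):
    """Share of paid invoices for each year separately."""
    return {year: p / (p + u) for year, (p, u) in rows.items()}

group_rate = pooled_rate(group)
per_year = yearly_rates(group)

print(f"pooled: {group_rate:.4f}")
for year, rate in per_year.items():
    print(year, f"{rate:.4f}")
```

Printing the per-year shares next to the pooled value shows how much the rate moves from year to year, which is one informal check on whether a single pooled number is a reasonable summary.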

What is the probability that a guy from SUBGROUP 1 will pay his next invoice?

What is the probability that a guy who belongs to SUBGROUP 1 and SUBGROUP 2 will pay his next invoice?
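For the subgroup questions, one possible sketch is to apply the same pooled ratio to each subgroup and attach a Wilson score interval to show the uncertainty. The tables give only per-subgroup totals and do not say how many people are in both subgroups, so this sketch treats SUBGROUP 1 and SUBGROUP 2 separately and does not estimate their overlap. As before, it assumes the period is a thousands separator; `wilson_interval` and `summarize` are names introduced here for illustration.

```python
import math

# Counts per year as (paid, not paid), 2012 through 2017.
# Assumption: "1.006" in the tables means 1,006.
subgroup_1 = [(1006, 78), (938, 21), (2388, 32), (1998, 45), (3310, 21), (5032, 50)]
subgroup_2 = [(3006, 32), (438, 12), (1300, 43), (801, 21), (1002, 12), (3100, 32)]

def wilson_interval(paid, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = paid / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

def summarize(rows):
    """Return (pooled rate, lower bound, upper bound) for one subgroup."""
    paid = sum(p for p, _ in rows)
    total = sum(p + u for p, u in rows)
    low, high = wilson_interval(paid, total)
    return paid / total, low, high

rate_1, low_1, high_1 = summarize(subgroup_1)
rate_2, low_2, high_2 = summarize(subgroup_2)
print(f"SUBGROUP 1: {rate_1:.4f} ({low_1:.4f}, {high_1:.4f})")
print(f"SUBGROUP 2: {rate_2:.4f} ({low_2:.4f}, {high_2:.4f})")
```

The interval treats every invoice as an independent trial with the same probability, which is a strong simplification if the same people appear in several years.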

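Is this data sufficient to calculate the probability that someone will pay an invoice? What data would I need to better estimate the probability of a future payment?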

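Can I be sure that the guy from SUBGROUP 3 will not pay next time? How can I calculate the probability that the guy from SUBGROUP 3 will not pay his next invoice? What other data about this guy would I need in order to say something more representative?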
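With a single observation, the raw ratio for SUBGROUP 3 is 0 paid out of 1, which would claim certainty from one data point. One common way to avoid that, offered here only as a sketch and not as the definitive answer, is to combine the observation with a Beta prior and report the posterior mean. The prior strength of 10 below is an arbitrary, hypothetical choice, and the GROUP rate again assumes the period is a thousands separator.

```python
def posterior_pay_probability(paid, unpaid, prior_paid, prior_unpaid):
    """Posterior mean of the pay probability under a Beta(prior_paid, prior_unpaid) prior."""
    return (paid + prior_paid) / (paid + unpaid + prior_paid + prior_unpaid)

# SUBGROUP 3: one invoice observed (2015), not paid.
paid, unpaid = 0, 1

# Uniform prior Beta(1, 1), i.e. Laplace's rule of succession.
uniform_estimate = posterior_pay_probability(paid, unpaid, 1, 1)

# Prior centered on the pooled GROUP rate.
# Assumption: GROUP totals are 82,821 paid out of 85,992 invoices.
group_rate = 82821 / 85992
prior_strength = 10  # hypothetical pseudo-count, chosen only for illustration
informed_estimate = posterior_pay_probability(
    paid,
    unpaid,
    prior_strength * group_rate,
    prior_strength * (1 - group_rate),
)

print(f"uniform prior:  P(pay) = {uniform_estimate:.3f}")
print(f"GROUP prior:    P(pay) = {informed_estimate:.3f}")
print(f"GROUP prior:    P(not pay) = {1 - informed_estimate:.3f}")
```

The two estimates differ a lot, which illustrates that with one observation the answer is driven mostly by the prior; more invoices for this person, or attributes that tie him to a better-observed subgroup, would reduce that dependence.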

0 answers:

No answers