I have a dataset that I assume follows a beta distribution.
From the SciPy docs, I know that the PDF of the beta distribution is:
```python
pdf = (gamma(a + b) * x ** (a - 1) * (1 - x) ** (b - 1)) / (gamma(a) * gamma(b))
```

Next, I use `beta.fit` to find the parameter values that best fit my dataset:

```python
from scipy.stats import beta as beta_stats

# `players` is my DataFrame; the 'Conversion' column holds the observations.
# floc=0 and fscale=1 fix the location and scale, so only a and b are estimated.
parameters_beta = beta_stats.fit(players['Conversion'], floc=0, fscale=1)
print(parameters_beta)
```

This gives me the following output:

```
(4.6572518152560525, 36.09719571353918, 0, 1)
```

So a is about 4.66 and b is about 36.10.

Now I want to write my own fitting function:

Running it gives the following result:
So a equals 0 and b is about 426. Why is the difference so large? Did I do something wrong in my script?
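One way to narrow this down is to run a hand-written maximum-likelihood fit next to `beta.fit` on data whose parameters are known. The sketch below is not the fitting function from this question. It assumes synthetic data drawn from Beta(4.66, 36.10) in place of `players['Conversion']`, and the names `neg_log_likelihood` and `fit_beta_mle` are hypothetical. It optimizes on the log scale so that a and b cannot become zero or negative.

```python
import numpy as np
from scipy import optimize, stats

# Assumption: synthetic stand-in for the real data, drawn from Beta(4.66, 36.10)
rng = np.random.default_rng(0)
sample = rng.beta(4.66, 36.10, size=2000)


def neg_log_likelihood(log_params, data):
    """Negative beta log-likelihood, parameterized by (log a, log b)."""
    a, b = np.exp(log_params)
    return -np.sum(stats.beta.logpdf(data, a, b))


def fit_beta_mle(data):
    """Hypothetical helper: estimate (a, b) by minimizing the negative log-likelihood."""
    # Method-of-moments estimates serve as the starting point
    m, v = data.mean(), data.var()
    common = m * (1 - m) / v - 1
    x0 = np.log([m * common, (1 - m) * common])
    result = optimize.minimize(
        neg_log_likelihood, x0=x0, args=(data,), method="Nelder-Mead"
    )
    a_hat, b_hat = np.exp(result.x)
    return a_hat, b_hat


a_hat, b_hat = fit_beta_mle(sample)
a_ref, b_ref, _, _ = stats.beta.fit(sample, floc=0, fscale=1)
print("hand-written MLE:", a_hat, b_hat)
print("beta.fit:        ", a_ref, b_ref)
```

If the two estimates agree on synthetic data but the real script still returns a near 0, the objective function or the parameter bounds in that script are the likely places to look.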
Thanks!