pymc3: improving Theano compile time before sampling

Asked: 2016-04-18 14:26:58

Tags: pymc theano

I am working with this hierarchical Bayesian model:

import pymc3 as pm
import pandas as pd
import theano.tensor as T

categories = pd.Categorical(df.cat)
n_categories = len(set(categories.codes))
cat_idx = categories.codes

with pm.Model():
    mu_a = pm.Normal('mu_a', 0, sd=100**2)
    sig_a = pm.Uniform('sig_a', lower=0, upper=100)
    alpha = pm.Normal('alpha', mu=mu_a, sd=sig_a, shape=n_categories)

    betas = []
    for f in FEATURE_LIST:
        mu_b = pm.Normal('mu_b_%s' % f, 0, sd=100**2)
        sig_b = pm.Uniform('sig_b_%s' % f, lower=0, upper=100)
        betas.append(pm.Normal('beta_%s' % f, mu=mu_b, sd=sig_b, shape=n_categories))


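    # Despite its name, `logit` holds the inverse-logit (sigmoid) of the
    # linear predictor, i.e. the probability P(y = 1).
    logit = 1.0 / (1.0 + T.exp(-(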
                sum([betas[i][cat_idx] * X_train[f].values for i, f in enumerate(FEATURE_LIST)]) 
                + alpha[cat_idx]
            )))

    y_est = pm.Bernoulli('y_est', logit, observed=df.y)

    start = pm.find_MAP()
    trace = pm.sample(2000, pm.NUTS(), start=start, random_seed=42, njobs=40)

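I would imagine that replacing my Python list comprehension and the separate additions and multiplications with proper Theano code (perhaps using T.dot?) would improve the performance of the call to sample. How do I set this up correctly in Theano? I think I need to do something like shape=(n_features, n_categories) for my priors, but I'm not sure how to do the category indexing in the dot product.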
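To make the indexing question concrete, here is a minimal NumPy sketch of the shape=(n_features, n_categories) idea. It is not PyMC3 code: the arrays X, betas and alpha are random stand-ins (assumptions) for X_train[FEATURE_LIST].values and the model's random variables, and the sizes are made up. It only checks that indexing the category axis once and reducing over the feature axis gives the same linear predictor as the per-feature Python loop. Theano tensors accept the same integer-array indexing, so the vectorized expression would be expected to carry over, though that is not tested here.

```python
import numpy as np

rng = np.random.RandomState(0)

# Made-up sizes, for illustration only.
n_obs, n_features, n_categories = 6, 3, 2

# Stand-in for X_train[FEATURE_LIST].values: shape (n_obs, n_features).
X = rng.normal(size=(n_obs, n_features))
# Stand-in for categories.codes: one category index per observation.
cat_idx = np.array([0, 1, 1, 0, 1, 0])

# Stand-ins for the random variables: one row of per-category slopes
# per feature, and one intercept per category.
betas = rng.normal(size=(n_features, n_categories))
alpha = rng.normal(size=n_categories)

# Formulation from the question: one indexed product per feature,
# summed in Python.
looped = sum(betas[i][cat_idx] * X[:, i] for i in range(n_features)) + alpha[cat_idx]

# Vectorized formulation: betas[:, cat_idx] has shape (n_features, n_obs),
# so multiply elementwise by X.T and reduce over the feature axis.
vectorized = (betas[:, cat_idx] * X.T).sum(axis=0) + alpha[cat_idx]

# Inverse-logit of the linear predictor, as in the model.
p = 1.0 / (1.0 + np.exp(-vectorized))
```

Note that this is an elementwise product followed by a sum rather than a T.dot, because each observation picks its own column of betas. In the model itself, betas would presumably become a single pm.Normal with shape=(n_features, n_categories); the hyperpriors would then likely need shape=(n_features, 1) to broadcast across categories. Whether this actually shortens compile time has not been measured here.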

0 answers:

No answers yet