Survival probability from an AFT survival model in PySpark

Time: 2020-07-22 07:56:50

Tags: pyspark survival-analysis

I would like to know how to use AFTSurvivalRegression to compute survival probabilities in PySpark. I found this example online:

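from pyspark.sql import SparkSession
from pyspark.ml.regression import AFTSurvivalRegression
from pyspark.ml.linalg import Vectors

# Create (or reuse) the session that the rest of the example refers to as `spark`
spark = SparkSession.builder.appName("AFTSurvivalRegressionExample").getOrCreate()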

training = spark.createDataFrame([
    (1.218, 1.0, Vectors.dense(1.560, -0.605)),
    (2.949, 0.0, Vectors.dense(0.346, 2.158)),
    (3.627, 0.0, Vectors.dense(1.380, 0.231)),
    (0.273, 1.0, Vectors.dense(0.520, 1.151)),
    (4.199, 0.0, Vectors.dense(0.795, -0.226))], ["label", "censor", "features"])
quantileProbabilities = [0.3, 0.6]
aft = AFTSurvivalRegression(quantileProbabilities=quantileProbabilities,
                            quantilesCol="quantiles")

model = aft.fit(training)

# Print the coefficients, intercept and scale parameter for AFT survival regression
print("Coefficients: " + str(model.coefficients))
print("Intercept: " + str(model.intercept))
print("Scale: " + str(model.scale))
model.transform(training).show(truncate=False)

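However, this only lets me predict the survival time. I can also get the quantiles for the given quantile probabilities, but I do not understand how they work. My question is: how do I get the probability that an individual survives past a specific time?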
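For reference, here is a minimal sketch of one way to turn the fitted parameters into a survival probability. It assumes the model is the Weibull AFT model that Spark's AFTSurvivalRegression fits, so that S(t | x) = exp(-(t / lambda)^(1 / scale)) with lambda = exp(intercept + coefficients · x). Under the same assumption, each entry of the quantiles column is the time t at which P(T <= t) = p, so quantileProbabilities = [0.3, 0.6] would correspond to survival probabilities of 0.7 and 0.4. The function names `survival_probability` and `quantile_time` are hypothetical helpers written for this illustration, not part of the PySpark API.

```python
import math


def _weibull_lambda(features, coefficients, intercept):
    # exp of the linear predictor; assumed to match the model's "prediction" column
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return math.exp(linear)


def survival_probability(t, features, coefficients, intercept, scale):
    """P(T > t | features) under an assumed Weibull AFT model."""
    if t <= 0:
        return 1.0  # nobody has failed before time zero
    lam = _weibull_lambda(features, coefficients, intercept)
    k = 1.0 / scale  # Weibull shape parameter
    return math.exp(-((t / lam) ** k))


def quantile_time(p, features, coefficients, intercept, scale):
    """Time t at which P(T <= t | features) = p under the same model."""
    lam = _weibull_lambda(features, coefficients, intercept)
    return lam * (-math.log1p(-p)) ** scale
```

With a fitted model, the inputs would presumably come from `model.coefficients.toArray()`, `model.intercept` and `model.scale`, plus the feature values of the row of interest. A quick sanity check is that `survival_probability` evaluated at the 0.3 quantile time should return roughly 0.7.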

0 answers:

No answers yet