library(ROCR)
pred1 <- prediction(predictions=glm.prob2,labels =test_data$Direction)
perf1<-performance(pred1,measure = "TP.rate",x.measure = "FP.rate")
plot(perf1)
I keep getting the following error message:
Wrong argument types: First argument must be of type 'prediction'; second and optional third argument must be available performance measures!
How can I get the ROC curve?
Answer 0 (score: 0)
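The error message is telling you that the values you passed for measure and x.measure are not valid performance measures: "TP.rate" and "FP.rate" are not names that performance recognizes.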
The documentation of the performance function lists the following options to choose from:
‘acc’: Accuracy. P(Yhat = Y). Estimated as: (TP+TN)/(P+N).
‘err’: Error rate. P(Yhat != Y). Estimated as: (FP+FN)/(P+N).
‘fpr’: False positive rate. P(Yhat = + | Y = -). Estimated as:
FP/N.
‘fall’: Fallout. Same as ‘fpr’.
‘tpr’: True positive rate. P(Yhat = + | Y = +). Estimated as:
TP/P.
‘rec’: Recall. Same as ‘tpr’.
‘sens’: Sensitivity. Same as ‘tpr’.
‘fnr’: False negative rate. P(Yhat = - | Y = +). Estimated as:
FN/P.
‘miss’: Miss. Same as ‘fnr’.
‘tnr’: True negative rate. P(Yhat = - | Y = -).
‘spec’: Specificity. Same as ‘tnr’.
‘ppv’: Positive predictive value. P(Y = + | Yhat = +). Estimated
as: TP/(TP+FP).
‘prec’: Precision. Same as ‘ppv’.
‘npv’: Negative predictive value. P(Y = - | Yhat = -). Estimated
as: TN/(TN+FN).
‘pcfall’: Prediction-conditioned fallout. P(Y = - | Yhat = +).
Estimated as: FP/(TP+FP).
‘pcmiss’: Prediction-conditioned miss. P(Y = + | Yhat = -).
Estimated as: FN/(TN+FN).
‘rpp’: Rate of positive predictions. P(Yhat = +). Estimated as:
(TP+FP)/(TP+FP+TN+FN).
‘rnp’: Rate of negative predictions. P(Yhat = -). Estimated as:
(TN+FN)/(TP+FP+TN+FN).
‘phi’: Phi correlation coefficient. (TP*TN -
FP*FN)/(sqrt((TP+FN)*(TN+FP)*(TP+FP)*(TN+FN))). Yields a
number between -1 and 1, with 1 indicating a perfect
prediction, 0 indicating a random prediction. Values below 0
indicate a worse than random prediction.
‘mat’: Matthews correlation coefficient. Same as ‘phi’.
‘mi’: Mutual information. I(Yhat, Y) := H(Y) - H(Y | Yhat), where
H is the (conditional) entropy. Entropies are estimated
naively (no bias correction).
‘chisq’: Chi square test statistic. ‘?chisq.test’ for details.
Note that R might raise a warning if the sample size is too
small.
‘odds’: Odds ratio. (TP*TN)/(FN*FP). Note that odds ratio produces
Inf or NA values for all cutoffs corresponding to FN=0 or
FP=0. This can substantially decrease the plotted cutoff
region.
‘lift’: Lift value. P(Yhat = + | Y = +)/P(Yhat = +).
‘f’: Precision-recall F measure (van Rijsbergen, 1979). Weighted
harmonic mean of precision (P) and recall (R). F = 1/
(alpha*1/P + (1-alpha)*1/R). If alpha=1/2, the mean is
balanced. A frequent equivalent formulation is F = (beta^2+1)
* P * R / (R + beta^2 * P). In this formulation, the mean is
balanced if beta=1. Currently, ROCR only accepts the alpha
version as input (e.g. alpha=0.5). If no value for alpha is
given, the mean will be balanced by default.
‘rch’: ROC convex hull. A ROC (=‘tpr’ vs ‘fpr’) curve with
concavities (which represent suboptimal choices of cutoff)
removed (Fawcett 2001). Since the result is already a
parametric performance curve, it cannot be used in
combination with other measures.
‘auc’: Area under the ROC curve. This is equal to the value of the
Wilcoxon-Mann-Whitney test statistic and also the probability
that the classifier will score a randomly drawn positive
sample higher than a randomly drawn negative sample. Since
the output of ‘auc’ is cutoff-independent, this measure
cannot be combined with other measures into a parametric
curve. The partial area under the ROC curve up to a given
false positive rate can be calculated by passing the optional
parameter ‘fpr.stop=0.5’ (or any other value between 0 and 1)
to ‘performance’.
‘prbe’: Precision-recall break-even point. The cutoff(s) where
precision and recall are equal. At this point, positive and
negative predictions are made at the same rate as their
prevalence in the data. Since the output of ‘prbe’ is just a
cutoff-independent scalar, this measure cannot be combined
with other measures into a parametric curve.
‘cal’: Calibration error. The calibration error is the absolute
difference between predicted confidence and actual
reliability. This error is estimated at all cutoffs by
sliding a window across the range of possible cutoffs. The
default window size of 100 can be adjusted by passing the
optional parameter ‘window.size=200’ to ‘performance’. E.g.,
if for several positive samples the output of the classifier
is around 0.75, you might expect from a well-calibrated
classifier that the fraction of them which is correctly
predicted as positive is also around 0.75. In a
well-calibrated classifier, the probabilistic confidence
estimates are realistic. Only for use with probabilistic
output (i.e. scores between 0 and 1).
‘mxe’: Mean cross-entropy. Only for use with probabilistic output.
MXE := - 1/(P+N) (sum_{y_i=+} ln(yhat_i) + sum_{y_i=-}
ln(1-yhat_i)). Since the output of ‘mxe’ is just a
cutoff-independent scalar, this measure cannot be combined
with other measures into a parametric curve.
‘rmse’: Root-mean-squared error. Only for use with numerical class
labels. RMSE := sqrt(1/(P+N) sum_i (y_i - yhat_i)^2). Since
the output of ‘rmse’ is just a cutoff-independent scalar,
this measure cannot be combined with other measures into a
parametric curve.
‘sar’: Score combining performance measures of different
characteristics, in the attempt of creating a more "robust"
measure (cf. Caruana R., ROCAI2004): SAR = 1/3 * ( Accuracy +
Area under the ROC curve + Root mean-squared error ).
‘ecost’: Expected cost. For details on cost curves, cf.
Drummond&Holte 2000,2004. ‘ecost’ has an obligatory x axis,
the so-called 'probability-cost function'; thus it cannot be
combined with other measures. While using ‘ecost’ one is
interested in the lower envelope of a set of lines, it might
be instructive to plot the whole set of lines in addition to
the lower envelope. An example is given in ‘demo(ROCR)’.
‘cost’: Cost of a classifier when class-conditional
misclassification costs are explicitly given. Accepts the
optional parameters ‘cost.fp’ and ‘cost.fn’, by which the
costs for false positives and negatives can be adjusted,
respectively. By default, both are set to 1.
So you should use the abbreviations "tpr" and "fpr" instead:
perf1 <- performance(pred1, measure = "tpr", x.measure = "fpr")
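For reference, here is a minimal end-to-end sketch of the corrected call. The question does not show how glm.prob2 or test_data were built, so the simulated data, the single predictor x, and the fitted model fit below are assumptions made purely for illustration. Only the prediction and performance calls mirror the original code.

```r
library(ROCR)

# Hypothetical stand-in data: the original test_data is not shown in the question.
set.seed(1)
n <- 200
x <- rnorm(n)
direction <- factor(ifelse(x + rnorm(n) > 0, "Up", "Down"),
                    levels = c("Down", "Up"))
dat <- data.frame(x = x, Direction = direction)
train <- dat[1:100, ]
test_data <- dat[101:200, ]

# Logistic regression; type = "response" returns the estimated P(Direction == "Up").
fit <- glm(Direction ~ x, data = train, family = binomial)
glm.prob2 <- predict(fit, newdata = test_data, type = "response")

# Same calls as in the question, but with valid measure names.
pred1 <- prediction(predictions = glm.prob2, labels = test_data$Direction)
perf1 <- performance(pred1, measure = "tpr", x.measure = "fpr")
plot(perf1)                    # ROC curve: true positive rate vs. false positive rate
abline(a = 0, b = 1, lty = 2)  # diagonal, i.e. a random classifier, for comparison

# 'auc' is a cutoff-independent scalar, so it is requested on its own.
auc <- performance(pred1, measure = "auc")@y.values[[1]]
print(auc)
```

With a factor as labels, ROCR treats the second factor level (here "Up") as the positive class, which matches what glm models when the response is a two-level factor.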