Question

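I have a data set and I am trying to test whether a 7-day ad period is better than a 5-day ad period. I think logistic regression is the best way to test this. I ran the test for two weeks. I have data on traffic, signups, attrition, and so on.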

Here is what the data looks like:

             5d      7d      greater (is 7d at least 5% more than 5d?)
Traffic      179650  196395  1
subscribers  437899  442068  0
attrition    2304    2376    0
signups      5039    6246    1

1 means yes, 0 means no.
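For anyone who wants to reproduce this, the table above can be entered as a data frame. This is only a sketch: the column names X5d and X7d are assumed from the glm call further down, the metric column is added here purely for readability, and the greater flag is recomputed from the 5% rule stated in the table header.

```r
# Sketch: rebuild the table as a data frame.
# Column names X5d / X7d are assumed from the glm() call in the question;
# the 'metric' column is an illustrative addition.
logr2 <- data.frame(
  metric = c("Traffic", "subscribers", "attrition", "signups"),
  X5d    = c(179650, 437899, 2304, 5039),
  X7d    = c(196395, 442068, 2376, 6246)
)

# Flag rows where the 7-day value is at least 5% above the 5-day value.
logr2$greater <- as.integer(logr2$X7d >= 1.05 * logr2$X5d)

print(logr2)
```

Recomputing the flag this way gives the same 1, 0, 0, 1 pattern as the table.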

I ran this code in R:

fit2<-glm(greater~X5d + X7d, data=logr2, family = "binomial")

Then

predict(fit2, data=logr2, type = "response")

My output is:

 1            2            3            4 
1.000000e+00 6.753019e-13 1.386707e-10 1.000000e+00

or

> round(predict(fit2, data=logr2, type = "response"))
1 2 3 4 
1 0 0 1

How can I run this so that I get a single output of 1 or 0 (i.e., does the 7-day period show an overall increase of more than 5%)?

Thanks

Answer 1

I think you have mixed up the argument name of the predict function (see the documentation): the argument is newdata, not data. Try this:

predict(fit2, newdata=logr2, type = "response")
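To see what the misnamed argument does, here is a small sketch that rebuilds the question's data (column names assumed from the glm call) and compares the calls. Since predict for a glm has no data argument, the data= value falls into ... and is ignored, so the call just returns the fitted values for the training rows.

```r
# Sketch: data rebuilt from the question's table; names X5d / X7d are assumed.
logr2 <- data.frame(
  X5d     = c(179650, 437899, 2304, 5039),
  X7d     = c(196395, 442068, 2376, 6246),
  greater = c(1, 0, 0, 1)
)

# The four points are perfectly separated, so glm() emits warnings about
# fitted probabilities being numerically 0 or 1; they are silenced here.
fit2 <- suppressWarnings(
  glm(greater ~ X5d + X7d, data = logr2, family = "binomial")
)

p_data    <- predict(fit2, data = logr2, type = "response")     # 'data' is ignored
p_default <- predict(fit2, type = "response")                   # no new data at all
p_newdata <- predict(fit2, newdata = logr2, type = "response")  # correct argument

# The misnamed call is the same as passing no new data: both are fitted values.
print(all.equal(p_data, p_default))

# On the training rows themselves, newdata = logr2 should agree as well.
print(all.equal(p_data, p_newdata))
```

The difference only shows up once newdata holds rows that were not used to fit the model.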

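The odd-looking output comes from the fact that you are feeding the training data back in as the prediction input, which is not very meaningful. Try a new data point instead, like this: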

input = data.frame(X5d = 123, X7d = 22)
predict(fit2, newdata=input, type = "response")

Result:

           1 
2.775557e-13 

which means the probability of 1 is almost 0.

If you give it an exact point from the data set:

input = data.frame(X5d = 179650, X7d = 196395)
predict(fit2, newdata=input, type = "response")

Result:

1 
1 

so the probability of 1 is 1.

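You can check the other data points in the training set. The results are perfect because, with so few training samples (four rows for three coefficients), the model separates the points exactly and the fit is ideal.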
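A quick way to check every training row at once is sketched below. It rebuilds the question's data (column names assumed from the glm call) and puts the rounded predictions next to the original flag.

```r
# Sketch: data rebuilt from the question's table; names X5d / X7d are assumed.
logr2 <- data.frame(
  X5d     = c(179650, 437899, 2304, 5039),
  X7d     = c(196395, 442068, 2376, 6246),
  greater = c(1, 0, 0, 1)
)

# Warnings about fitted probabilities of 0 or 1 are expected and silenced here.
fit2 <- suppressWarnings(
  glm(greater ~ X5d + X7d, data = logr2, family = "binomial")
)

# Round the predicted probabilities to a hard 0/1 label for each training row.
pred <- round(predict(fit2, newdata = logr2, type = "response"))

# Side-by-side comparison of the original flag and the model's label.
print(data.frame(logr2, predicted = pred))
print(all(pred == logr2$greater))

# Number of rows versus number of estimated coefficients.
print(c(rows = nrow(logr2), coefficients = length(coef(fit2))))
```

With only four rows, agreement on the training set says little about how the model would behave on new data.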

You can find a simple, similar example here.

Logistic regression for ad analysis in R
