My understanding is that Softmax Regression is a generalization of Logistic Regression to support multiple classes.
A Softmax Regression model first computes a score for each class, then estimates the probability of each class by applying the softmax function to the scores.
Each class has its own dedicated parameter vector.
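As a rough illustration of the description above, here is a minimal pure-Python sketch. The parameter values, the feature vector, and the helper names `softmax` and `class_scores` are made up for illustration; they do not come from any particular library.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(x, thetas):
    """One score per class: dot product of x with that class's parameter vector."""
    return [sum(t_i * x_i for t_i, x_i in zip(theta, x)) for theta in thetas]

# Hypothetical values: 2 features, 3 classes, one parameter vector per class.
thetas = [[1.0, -0.5], [0.2, 0.8], [-1.0, 0.3]]
x = [2.0, 1.0]

probs = softmax(class_scores(x, thetas))
# The predicted class is the one with the highest estimated probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Each class gets its own score, so the model can raise the probability of one class without being forced through the others.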
My question: why can't we use Logistic Regression to classify into multiple classes in a much simpler way, e.g. if the probability is 0 to 0.3 then Class A; 0.3 to 0.6 then Class B; 0.6 to 0.9 then Class C, etc.?
Why is a separate coefficient vector always needed?
I'm new to ML. I'm not sure whether this question comes from a gap in my understanding of some fundamental concept.
Answer 0 (score: 1)
First, on terminology: I'd say the more established term is multinomial logistic regression.
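The softmax function is a natural choice for computing probabilities, because it corresponds to maximum likelihood estimation (MLE). Cross-entropy loss also has a probabilistic interpretation: it is a "distance" between two distributions (the output and the target). What you suggest is to distinguish the classes in an artificial way: output a binary distribution and somehow compare it with a multiclass distribution. In theory this is possible and might even work, but it certainly has drawbacks. For example, it is harder to train.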
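To make the link between cross-entropy and likelihood concrete, here is a small sketch. It assumes a one-hot target and a made-up softmax output; `cross_entropy` is an illustrative helper, not a library function.

```python
import math

def cross_entropy(target, predicted):
    """H(target, predicted) = -sum_k target[k] * log(predicted[k])."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

predicted = [0.7, 0.2, 0.1]  # hypothetical softmax output
target = [0.0, 1.0, 0.0]     # one-hot target: the true class is index 1

loss = cross_entropy(target, predicted)
nll = -math.log(predicted[1])  # negative log-likelihood of the true class

# A prediction that puts more mass on the true class gets a lower loss.
better_loss = cross_entropy(target, [0.1, 0.8, 0.1])
```

With a one-hot target, the cross-entropy reduces to the negative log-likelihood of the true class, so minimizing it is the same as maximizing the likelihood.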
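Suppose the output is 0.2 (i.e. class A) and the ground truth is class B. You want to tell the network to shift toward a higher value. Next time, the output is 0.7: the network actually learned and moved in the right direction, but you penalize it again. In fact, your example has unstable points (0.3 and 0.6) that the network needs time to learn are critical. Two values, 0.2999999 and 0.3000001, are almost indistinguishable to the network, yet they determine whether the result is correct or not.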
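The scenario can be reproduced with a few lines of Python. The `bucket` function is a hypothetical rendering of the thresholding rule from the question; treating everything from 0.6 upward as class C is a simplifying assumption.

```python
def bucket(p):
    """Map a single scalar output to a class using fixed cut points."""
    if p < 0.3:
        return "A"
    if p < 0.6:
        return "B"
    return "C"  # assumption: everything from 0.6 upward is class C

# Ground truth is B. The output moves from 0.2 up to 0.7, which is the
# right direction, yet both predictions are wrong.
before, after = bucket(0.2), bucket(0.7)

# Two nearly identical outputs land in different classes.
left, right = bucket(0.2999999), bucket(0.3000001)
```

Here `before` is class A and `after` is class C, so the model overshoots B entirely, while `left` and `right` differ in class even though the outputs differ by only 0.0000002.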
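In general, an output in the form of a probability distribution is preferable to direct discrimination, because it carries more information.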