My input is:
He is a boy. She is a girl.
He is mad. She is brainsick.
He made himself the king.Queen create her life luxurious.
Their Royal livelihood is good.
I am vectorizing the words and getting a matrix like this:
boy girl mad ... livelihood good
0 0.000000 0.000000 0.496683 ... 0.000000 0.000000 0.0
1 0.627543 0.000000 0.000000 ... 0.000000 0.000000 0.0
2 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.0
3 0.000000 0.000000 0.543939 ... 0.254394 0.424127 0.0
4 0.000000 0.000000 0.000000 ... 0.232293 0.000000 0.0
I tried this formula:
TF(t) = (number of times term t appears in a document) / (total number of terms in the document).
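To make the hand calculation concrete, here is a minimal sketch of that formula in plain Python. The helper name `term_frequency` and the tokenization (lowercase, split on non-letters, no stop-word removal) are assumptions for illustration, not what the vectorizer necessarily does.

```python
import re
from collections import Counter


def term_frequency(term, document):
    """TF(t) = occurrences of t in the document / total terms in the document."""
    # Assumed tokenization: lowercase, keep runs of letters only.
    tokens = re.findall(r"[a-z]+", document.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[term.lower()] / len(tokens)


doc = "He is a boy. She is a girl."
tf_boy = term_frequency("boy", doc)
print(tf_boy)  # 1 occurrence out of 8 tokens
```

The result depends heavily on how the text is tokenized and whether common words are dropped, so a different tokenization gives a different denominator.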
I expected to get the value 0.627543, but when I work the formula out on paper I get 0.34-something.
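One possible source of the mismatch, sketched under assumptions: I have not stated which vectorizer produced the matrix, but if it is something like scikit-learn's `TfidfVectorizer` with default settings, the entries are not plain term frequencies. The defaults multiply the raw count by a smoothed IDF, ln((1 + n) / (1 + df)) + 1, and then scale each row to unit L2 length. The sketch below reimplements that weighting with the standard library only. The function names, the simplified tokenization, and treating each input line as one document are all assumptions, so the numbers will not necessarily match the matrix above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization: lowercase, keep runs of letters only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_l2(documents):
    """Raw count * smoothed IDF, then L2-normalize each document row."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    vocab = sorted({t for doc in tokenized for t in doc})
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        weights = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in weights))
        rows.append([w / norm if norm else 0.0 for w in weights])
    return vocab, rows


# Assumption: each line of the input is treated as one document.
docs = [
    "He is a boy. She is a girl.",
    "He is mad. She is brainsick.",
    "He made himself the king.Queen create her life luxurious.",
    "Their Royal livelihood is good.",
]
vocab, rows = tfidf_l2(docs)
boy_weight = rows[0][vocab.index("boy")]
print(round(boy_weight, 6))
```

Because of the IDF factor and the row normalization, the weight for "boy" differs from the plain ratio count / total terms, which would explain why a paper calculation with the TF formula alone does not reproduce the matrix.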