Tags: r hash count one-hot-encoding
I have a data frame with a text feature. Each observation is stored as a character vector. For example:
featureX
c("aa","bb","cc")
c("abc","dd")
c("asa","bcb","ccsac","vd","vdsvs")
c("dd","ee")
...
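To make the example reproducible, here is a minimal sketch of how such a data frame could be built. It assumes featureX is a list column (one character vector per row); the name `df` and the use of `I()` are my assumptions, and only the four rows shown above are included.

```r
# Hypothetical reconstruction of the example data: `df` is an assumed name.
# I() keeps the list as a single list column instead of expanding it.
df <- data.frame(featureX = I(list(
  c("aa", "bb", "cc"),
  c("abc", "dd"),
  c("asa", "bcb", "ccsac", "vd", "vdsvs"),
  c("dd", "ee")
)))

# Number of tokens stored in each observation
token_counts <- lengths(df$featureX)
print(token_counts)
```

Each row can then be accessed as a character vector, e.g. `df$featureX[[2]]`.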
I would like to perform the following task in R:
featureX
dd
The actual dataset is large, with roughly 100,000 observations. Any pointers would be much appreciated.