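I have some sentences, and I want to split each one into its words so that I get one row vector per sentence. However, the words are being repeated so that every row matches the length of the longest sentence's row vector, which I do not want. No matter how long a sentence is, I want its row vector to contain its words only once.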
What I want is no repetition.
Answer 0 (score: 2)
A solution from rawr:
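# Example sentences of different lengths
sentence <- c("case sweden", "meeting minutes ht board meeting st march now also attachment added agenda today s board meeting", "draft meeting minutes board meeting final meeting minutes ht board meeting rd april")
# Read one sentence per line; fill = TRUE pads short rows with blanks instead of recycling words
dd <- read.table(text = paste(sentence, collapse = '\n'), fill = TRUE)
# Put the original sentence next to its word columns
test <- cbind(sentence, dd)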
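The question's own code is not shown, so the following is only a guess at the cause: repetition of this kind typically appears when word vectors of unequal length are combined with something like `do.call(rbind, ...)`, because `rbind()` recycles the shorter vectors. As a minimal base R sketch of an alternative to `read.table()`, the words can be kept in a list (one vector per sentence) and, if a rectangular result is needed, padded with `NA`. The names `words`, `n`, and `padded` are illustrative and not part of the original answer.

```r
# Same example sentences as in the answer.
sentence <- c("case sweden",
              "meeting minutes ht board meeting st march now also attachment added agenda today s board meeting",
              "draft meeting minutes board meeting final meeting minutes ht board meeting rd april")

# strsplit() returns a list with one character vector per sentence,
# so each vector keeps its own length and nothing is recycled.
words <- strsplit(sentence, " ", fixed = TRUE)
lengths(words)

# For a rectangular result, pad the shorter vectors with NA explicitly
# rather than letting rbind() recycle them.
n <- max(lengths(words))
padded <- t(sapply(words, function(w) c(w, rep(NA_character_, n - length(w)))))
dim(padded)
```

The list form is often enough when each sentence is processed separately; the padded matrix is closer to what the `read.table()` approach produces, except that the missing cells are `NA` rather than empty strings.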
Thanks.