DataFrame: using head() to build a list and append to it

Time: 2018-11-25 00:54:35

Tags: python list dataframe tensorflow

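What I want to do is pick out the 6 most relevant books from my CSV of 8371 books. I did manage to get the top 6 books for a single book. Now I want to write a for loop that goes through all 8371 books and attaches a list of the 6 most relevant books next to each one.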

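import numpy as np
from numpy.linalg import norm
from sklearn.feature_extraction.text import CountVectorizer

def tf_similarity(s1, s2):
    def add_space(s):
        # Separate every character with a space so each one becomes a token
        return ' '.join(list(s))

    s1, s2 = add_space(s1), add_space(s2)

    # Character-level term-frequency vectors for the two strings
    cv = CountVectorizer(tokenizer=lambda s: s.split())
    corpus = [s1, s2]
    vectors = cv.fit_transform(corpus).toarray()

    # Cosine similarity of the two count vectors
    return np.dot(vectors[0], vectors[1]) / (norm(vectors[0]) * norm(vectors[1]))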

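import pandas as pd

# document: the 8371 books loaded from the CSV (loading not shown)
titles = []   # similarity of every book to book j
boolist = []  # collected top-6 lists

for j in range(0, 1):  # only the first book so far
    for i in range(8371):
        titles.append(tf_similarity(str(document[i]), document[j]))

    df = pd.DataFrame()
    df["document"] = document
    df["titles"] = titles
    dff = df.sort_values("titles", ascending=False).head(6)
    newdff = dff.values.T.tolist()
    new = newdff[0]  # the 6 highest-scoring documents
for k in range(0, 2):
    boolist.append(new)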

I would like the list to look like this:

book1 top1 top2 top3 top4 top5 top6
book2 top1 top2 top3 top4 top5 top6
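
A minimal sketch of one way to produce rows in this shape, using only the standard library so that it runs on its own. It is intended to mirror the character-level cosine similarity of tf_similarity above, not to reproduce it exactly. The names char_counts, cosine, top_similar and the sample list are hypothetical, and leaving each book out of its own top list is an assumption about the desired result.

```python
from collections import Counter
from math import sqrt


def char_counts(text):
    # Lower-cased character counts, ignoring whitespace.
    return Counter(ch for ch in str(text).lower() if not ch.isspace())


def cosine(a, b):
    # Cosine similarity between two Counter objects.
    dot = sum(count * b[ch] for ch, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_similar(documents, n=6):
    # For each document, build [document, top1, ..., topn].
    vectors = [char_counts(d) for d in documents]  # computed once, reused
    rows = []
    for j, doc in enumerate(documents):
        scored = [
            (cosine(vectors[j], vectors[i]), i)
            for i in range(len(documents))
            if i != j  # assumption: a book is not listed as its own match
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        rows.append([doc] + [documents[i] for _, i in scored[:n]])
    return rows


# Hypothetical sample data standing in for the CSV column.
sample = ["python basics", "python basic", "cooking", "cook book", "zzz"]
for row in top_similar(sample, n=2):
    print(row)
```

The key change from the loop in the question is that the scores are rebuilt for every j instead of being appended to one shared list, and each result row is stored as soon as it is computed. If the book itself should count as a match, the `i != j` condition can be dropped. The returned rows could be passed to pd.DataFrame(rows) to get one line per book. For all 8371 books this pure-Python version compares every pair and may be slow, so a vectorized similarity matrix would likely be faster.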

0 answers:

No answers