Cosine distance does not work with a pandas SparseDataFrame

Time: 2017-05-21 18:00:40

Tags: python-3.x pandas cosine-similarity

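Can someone explain to me why the first version below works and the second one does not? The only change is that I switched to a SparseDataFrame.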

import pandas as pd
from scipy.spatial.distance import cosine  # assumed source of cosine(); the post does not show its imports
from tqdm import tqdm

data = pd.read_csv('/.../.csv').astype(float)

data_germany = data.drop('user', axis=1)

data_ibs = pd.DataFrame(index=data_germany.columns, columns=data_germany.columns)

for i in tqdm(range(len(data_ibs.columns))):
    # Compare column i with every column j
    for j in range(len(data_ibs.columns)):
        # Fill in the placeholder with the cosine similarity (1 - cosine distance)
        data_ibs.iloc[i, j] = 1 - cosine(data_germany.iloc[:, i], data_germany.iloc[:, j])

#######################################################

# Changed: .to_sparse() appended
data = pd.read_csv('/.../.csv').astype(float).to_sparse()

data_germany = data.drop('user', axis=1)

# Changed: SparseDataFrame instead of DataFrame
data_ibs = pd.SparseDataFrame(index=data_germany.columns, columns=data_germany.columns)

for i in tqdm(range(len(data_ibs.columns))):
    # Compare column i with every column j
    for j in range(len(data_ibs.columns)):
        # Fill in the placeholder with the cosine similarity (1 - cosine distance)
        data_ibs.iloc[i, j] = 1 - cosine(data_germany.iloc[:, i], data_germany.iloc[:, j])

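In the second version the cosine distance itself is computed correctly; it only fails when I try to assign the number into the data frame.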

How is that possible?

Thanks in advance for your help.
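For reference, one possible way to avoid assigning single cells into a sparse frame is to compute the whole similarity matrix at once and only build the result frame at the end. The following is a minimal sketch, not a confirmed fix for the error above. It assumes the column-by-column result fits in memory as a dense array, and it keeps the input sparse with `scipy.sparse` rather than a pandas sparse frame. The function name `column_cosine_similarity` and the toy data are hypothetical and do not come from the post. Unlike `1 - cosine(u, v)`, an all-zero column is given a similarity of 0 here.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def column_cosine_similarity(frame):
    """Return a dense DataFrame of cosine similarities between the columns of `frame`."""
    # Hold the values in a SciPy CSC matrix instead of a pandas sparse frame.
    matrix = sparse.csc_matrix(frame.to_numpy(dtype=float))
    # Euclidean norm of every column.
    squared_sums = np.asarray(matrix.multiply(matrix).sum(axis=0)).ravel()
    norms = np.sqrt(squared_sums)
    # Dot products of all column pairs; the result is columns x columns.
    gram = (matrix.T @ matrix).toarray()
    # Assumption: an all-zero column gets similarity 0 instead of dividing by zero.
    safe_norms = np.where(norms == 0, 1.0, norms)
    similarity = gram / np.outer(safe_norms, safe_norms)
    # Build the labelled result once, with no cell-by-cell assignment.
    return pd.DataFrame(similarity, index=frame.columns, columns=frame.columns)


# Hypothetical toy data standing in for data_germany.
toy = pd.DataFrame({"a": [1.0, 0.0, 2.0],
                    "b": [0.0, 3.0, 0.0],
                    "c": [2.0, 0.0, 4.0]})
toy_ibs = column_cosine_similarity(toy)
```

In this toy frame, columns `a` and `c` point in the same direction and `a` and `b` share no non-zero rows, so the result can be checked by hand against the looped version.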

0 answers:

No answers