Question

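I know there are many questions about this same problem, but none of them solved mine. I'm using a Jupyter Notebook in Amazon SageMaker, and I want to apply the hashing trick to some features. I couldn't build a reproducible example with simple data, but here is a screenshot of the data I have:

So I used: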

from sklearn.feature_extraction import FeatureHasher
h = FeatureHasher(n_features=10,input_type="string")
df['country_iso_code'] = h.transform(df['country_iso_code'])
h = FeatureHasher(n_features=10,input_type="string")
df['origen_tarjeta_country_iso'] = h.transform(df['origen_tarjeta_country_iso'])

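The first transformation works, but the second one does not, and I get the error "'float' object is not iterable". I checked the dtype of both columns and both are object, and I checked that both columns contain only strings. I tried to reproduce the code in Spyder with a few samples, and there it works fine: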

import pandas as pd
from sklearn.feature_extraction import FeatureHasher

df = pd.DataFrame({'ES':'ES','UK':'UK'},index=[0,1])
h = FeatureHasher(n_features=10,input_type="string")
df['UK'] = h.transform(df['UK'])
h = FeatureHasher(n_features=10,input_type="string")
df['ES'] = h.transform(df['ES'])
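
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: a column with dtype object can still hold NaN, which is a float, and that would be consistent with the "'float' object is not iterable" error on the second column only. The example below uses made-up data (the real column contents are not shown here), and the "missing" placeholder and the origen_hash_ column names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction import FeatureHasher

# Hypothetical data: an object column that looks like strings but hides a NaN.
col = pd.Series(["ES", "UK", np.nan, "ES"], dtype=object,
                name="origen_tarjeta_country_iso")
df = col.to_frame()

# dtype is still object even though one entry is a float (NaN).
print(col.dtype)

# Count entries by Python type to reveal any non-string values.
type_counts = col.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# One possible workaround (an assumption, not the only option): replace missing
# values with a placeholder token, then wrap each value in a list so that every
# sample is an iterable of strings, as input_type="string" expects.
cleaned = col.fillna("missing").astype(str)
hasher = FeatureHasher(n_features=10, input_type="string")
hashed = hasher.transform([[value] for value in cleaned])

# transform returns a sparse matrix with n_features columns, so it is expanded
# into separate columns instead of being assigned back to a single column.
hashed_df = pd.DataFrame(
    hashed.toarray(),
    index=df.index,
    columns=[f"origen_hash_{i}" for i in range(10)],
)
df = df.join(hashed_df)
```

If the type count shows anything other than str for the failing column, missing values are the likely difference between the two columns.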

sklearn

0 answers: