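Hi everyone! I have a problem with Python. Can anyone help me? I'm a Python beginner.
I have a DataFrame with some information, and I'm working with a string column.
An example of the column: Dataframe Column
The code is: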
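import pandas as pd

# on_bad_lines='skip' replaces the deprecated error_bad_lines=False (pandas >= 1.3)
data = pd.read_csv("dataset.csv", sep=';', encoding='latin-1', on_bad_lines='skip')
# drop rows where 'campo' is missing
data = data.dropna(subset=['campo'])
# trim whitespace at both ends (same as lstrip followed by rstrip)
data['campo'] = data['campo'].str.strip()
# replace accented vowels with plain ones
data['campo'] = data['campo'].str.replace('ú', 'u')
data['campo'] = data['campo'].str.replace('ó', 'o')
data['campo'] = data['campo'].str.replace('í', 'i')
data['campo'] = data['campo'].str.replace('é', 'e')
data['campo'] = data['campo'].str.replace('á', 'a')
data['campo'] = data['campo'].str.lower()
# remove punctuation; regex=True is explicit because the default became False in pandas 2.0
data['campo'] = data['campo'].str.replace(r'[^\w\s]', '', regex=True)
# split each string into a list of words
data['campo'] = data['campo'].str.split()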
Up to this point the result is: Preview
import nltk
from nltk.corpus import stopwords
from sklearn import model_selection
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download('stopwords')
stop_words = set(stopwords.words("spanish"))

# function: remove Spanish stop words from the token list in each row
def remove_stops(row):
    my_list = row['campo']
    meaningful_words = [w for w in my_list if w not in stop_words]
    return meaningful_words

data['campo'] = data.apply(remove_stops, axis=1)
Train_X, Test_X, Train_Y, Test_Y = model_selection.train_test_split(data['campo'], data['Target'], test_size=0.3)
Tfidf_vect = TfidfVectorizer(max_features=5000)
Tfidf_vect.fit(data['campo'])
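Then it gives me this error:
AttributeError: 'list' object has no attribute 'lower'
I don't know why. I'm a Python beginner, but I don't know how to solve it.
How can I fix this? Thanks. Sorry for my English!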
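
A possible explanation, as a minimal sketch rather than a definitive fix: by default TfidfVectorizer lowercases every document, so it expects each entry to be a string, while data['campo'] holds lists of words after str.split() and remove_stops. The snippet below uses only plain Python to show the difference. The helper name join_tokens and the sample words are made up for illustration and are not part of pandas or scikit-learn.

```python
# Sketch only: join_tokens is a hypothetical helper, not a pandas or scikit-learn function.

def join_tokens(tokens):
    """Turn a list of words back into one space-separated string."""
    return " ".join(tokens)

# Made-up sample standing in for one row of data['campo'] after tokenizing.
tokens = ["hola", "mundo"]

# A list has no .lower() method, which is what the error message reports.
list_has_lower = hasattr(tokens, "lower")

# A joined string does have .lower(), so a step that lowercases documents can handle it.
document = join_tokens(tokens)
string_has_lower = hasattr(document, "lower")

print(list_has_lower, string_has_lower, document)
```

With the DataFrame from the question, the same idea would presumably be applied as data['campo'] = data['campo'].apply(join_tokens) right before Tfidf_vect.fit(data['campo']), so that each row is a string again.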