How do I handle this warning?
Warning (from warnings module):
File "C:\Users\SAMSUNG\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\linear_model\_logistic.py", line 762
extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

urls_data = pd.read_csv("data.csv")
TEST_SIZE = 0.001
type(urls_data)
urls_data.head()

def makeTokens(f):
    # Split the URL on '/', then on '-', then on '.'
    tkns_BySlash = str(f.encode('utf-8')).split('/')
    total_Tokens = []
    for i in tkns_BySlash:
        tokens = str(i).split('-')
        tkns_ByDot = []
        for j in range(0, len(tokens)):
            temp_Tokens = str(tokens[j]).split('.')
            tkns_ByDot = tkns_ByDot + temp_Tokens
        total_Tokens = total_Tokens + tokens + tkns_ByDot
    # Remove duplicate tokens and the uninformative 'com'
    total_Tokens = list(set(total_Tokens))
    if 'com' in total_Tokens:
        total_Tokens.remove('com')
    return total_Tokens

y = urls_data["label"]
url_list = urls_data["url"]

# Data Preprocessing
vectorizer = TfidfVectorizer(tokenizer=makeTokens)
X = vectorizer.fit_transform(url_list)

# Split Train set and Test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=TEST_SIZE, random_state=42)

# Logistic regression
logit = LogisticRegression()
logit.fit(X_train, y_train)
This is my code.
Answer 0 (score: 7)
Change
logit = LogisticRegression()
to
logit = LogisticRegression(max_iter=10000)
and try again.
(The default max_iter of LogisticRegression() is 100, so any value large enough for the solver to converge will do; it does not have to be 10000.)
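To see how much headroom is actually needed, one option is to compare the fitted model's n_iter_ attribute with the max_iter cap. The following is only a minimal sketch: the synthetic dataset from make_classification and the helper name fit_and_report are assumptions for illustration, not part of the question's code.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Synthetic data standing in for the TF-IDF features (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=20, random_state=42)


def fit_and_report(max_iter):
    """Fit with the given cap; return (iterations used, whether a ConvergenceWarning was issued)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model = LogisticRegression(max_iter=max_iter).fit(X, y)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    # n_iter_ holds the number of iterations the solver actually ran.
    return int(model.n_iter_[0]), warned


for cap in (1, 10000):
    used, warned = fit_and_report(cap)
    print(f"max_iter={cap}: iterations used={used}, ConvergenceWarning={warned}")
```

If the iterations used equal the cap and the warning appears, the solver was cut off; if they stay well below the cap, raising max_iter further changes nothing.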
You can also standardize (scale) your data, as the warning message suggests.
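A rough sketch of what scaling could look like for sparse TF-IDF features follows. It rests on several assumptions: the toy URLs and labels are hypothetical stand-ins for data.csv, the default TfidfVectorizer tokenizer replaces makeTokens to keep the example short, and MaxAbsScaler is my choice because it keeps sparse matrices sparse (StandardScaler with its default centering rejects sparse input). Since TF-IDF rows are already normalized by default, scaling may help less here than raising max_iter.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical toy data; the real data.csv from the question is not available here.
urls = [
    "example.com/login/verify-account",
    "example.org/docs/getting-started",
    "bad-site.example/free-prize/claim-now",
    "example.net/blog/release-notes",
    "phish.example/secure-update/login-confirm",
    "example.edu/courses/intro-statistics",
]
labels = ["bad", "good", "bad", "good", "bad", "good"]

# Vectorize, scale each feature to [-1, 1] without densifying, then fit.
model = make_pipeline(
    TfidfVectorizer(),
    MaxAbsScaler(),
    LogisticRegression(max_iter=1000),
)
model.fit(urls, labels)
predictions = model.predict(urls)
print(list(predictions))
```

Putting the scaler inside the pipeline means it is fitted only on the training data and then reused unchanged at prediction time.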