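So, I'm making a kind of Python text editor, and I want the script to scan the text for some specific words and then change the color of those words (like in PyCharm). Something like this: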
from sklearn import metrics, model_selection
from sklearn.feature_selection import RFECV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

RANDOM_ST = 123

def featureSelection(train, train_labels, test, test_labels):
    # Use kNN to illustrate effectiveness of feature selection.
    clf = KNeighborsClassifier()
    # train the classifier
    clf = clf.fit(train, train_labels['gname_code'])
    # predict the class for unseen examples
    preds = clf.predict(test)
    # initial accuracy
    score = metrics.accuracy_score(preds, test_labels['gname_code'])
    print('Initial Result', score)
    # Decision tree for feature selection
    # RF is probably a better way to do feature selection but I want it to be deterministic for
    # comparing unbalanced methods later. So use decTree instead
    #estimator = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=RANDOM_ST)
    estimator = DecisionTreeClassifier(random_state=RANDOM_ST)
    # Custom cv so I can seed with random state => results are comparable between different options later
    rskv = model_selection.RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=RANDOM_ST)
    # Greedy Feature Selection
    rfecv = RFECV(estimator, cv=rskv, n_jobs=-1)
    rfecv.fit(train, train_labels['gname_code'])
    # optimal number of features
    print('Optimal no. of features is: ', rfecv.n_features_)
    # drop the un-informative features
    train = train.iloc[:, rfecv.support_]
    test = test.iloc[:, rfecv.support_]
    # test again now
    clf = KNeighborsClassifier()
    clf = clf.fit(train, train_labels['gname_code'])
    preds = clf.predict(test)
    score = metrics.accuracy_score(preds, test_labels['gname_code'])
    print('Result after feature selection: ', score)
    return train, train_labels, test, test_labels
(I know there are lots of similar questions about this, but I couldn't find anything that helped me.)
Answer 0 (score: 1)
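I put together a small approach for you to achieve this. I recommend reading the Tk documentation (Text, Text.search(), tags, indices)!
Tk gives you the text.search method, so you don't need to implement your own search. The Tk Text widget also gives you tags, which you can create and modify.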
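Workflow:
1. Search for the pattern with text.search().
   It returns the index of the match's start position (or an empty string if nothing matches).
2. Create a tag with text.tag_config().
3. Apply the tag to the matched range with text.tag_add().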
from tkinter import Tk, Entry, Button, Text, IntVar
from tkinter import font


class Text_tag_example():
    def __init__(self, master):
        self.master = master
        self.my_font = font.Font(family="Helvetica", size=18)
        self.startindex = "1.0"  # search start, index format "line.column"
        self.endindex = "end"    # search stop, "end" is the end of the text
        self.init_widgets()

    def init_widgets(self):
        self.txt_widget = Text(self.master, font=self.my_font,
                               height=10, width=40)
        self.txt_widget.grid(row=0, columnspan=2)
        self.ent_string = Entry(self.master, font=self.my_font)
        self.ent_string.grid(row=1, column=0)
        self.but_search = Button(self.master, text="Search", font=self.my_font,
                                 command=self.search_word)
        self.but_search.grid(row=1, column=1)

    def search_word(self):
        word = self.ent_string.get()  # get string from entry
        if not word:
            return  # nothing to search for
        countVar = IntVar()  # receives the number of chars that matched
        searched_position = self.txt_widget.search(pattern=word, index=self.startindex,
                                                   stopindex=self.endindex, count=countVar)
        if not searched_position:
            return  # search() returns "" when there is no match
        self.txt_widget.tag_config("a", foreground="blue", underline=1)
        # end of match = start index + length of the matched word/pattern
        endindex = "{}+{}c".format(searched_position, countVar.get())
        self.txt_widget.tag_add("a", searched_position, endindex)


if __name__ == "__main__":
    root = Tk()
    app = Text_tag_example(root)
    root.mainloop()
Usage:
Type "hi, bye" into the Text widget.
Type "hi" into the Entry widget.
Press the Search button.
- "hi" should now be blue and underlined.
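The next question will probably be "How do I tag all occurrences of the same word in the text?"
Read the documentation again, otherwise you won't understand it!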
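As a starting point for that follow-up question, here is a minimal sketch of one possible approach: call search() repeatedly and restart each search at the end of the previous match. The helper name `tag_all_matches` and its parameters are made up for this illustration and are not part of tkinter. It also assumes a plain (non-regexp) search, so the match length is simply `len(word)`; for regexp searches you would need the `count` variable as in the example above.

```python
def tag_all_matches(text_widget, word, tag="a"):
    """Tag every occurrence of `word` in a tkinter Text widget.

    Hypothetical helper for illustration. Returns the number of matches tagged.
    Assumes an exact (non-regexp) search, so each match is len(word) chars long.
    """
    if not word:
        return 0  # an empty pattern has nothing sensible to tag
    text_widget.tag_config(tag, foreground="blue", underline=1)
    matches = 0
    start = "1.0"  # begin at line 1, column 0
    while True:
        # With stopindex given, search() does not wrap around to the top.
        pos = text_widget.search(word, start, stopindex="end")
        if not pos:
            break  # search() returns "" when there are no more matches
        end = "{}+{}c".format(pos, len(word))  # index just past the match
        text_widget.tag_add(tag, pos, end)
        start = end  # continue searching after this match
        matches += 1
    return matches
```

In the example above, search_word could call `tag_all_matches(self.txt_widget, word)` instead of tagging only the first match. You may also want `text_widget.tag_remove(tag, "1.0", "end")` beforehand so that highlights from an earlier search are cleared.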