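I wrote this code, which takes an input string, gets words similar to it, builds different combinations of those words, searches a pandas column for each combination, and returns the index of the rows where the keywords are found.
The code below works well for me, but it is slower than I would like, and since the DataFrame keeps growing I expect it will only get slower.
So I would like to know whether there is a more efficient approach and which lines I could change to get there, for example the regex search or the list appends.
Here is my DataFrame: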
Unnamed: 0 web-scraper-start-url course-link course-link-href title shortDescription instructor date language subtitle ... fullDescription requiremens includes objective audience instruct fullText full_text key_words clean_words
0 0 https://www.udemy.com/courses/business/all-cou... How To Create A 5 Figure SEO Business-ZERO Exp... https://www.udemy.com/how-to-create-a-5-figure... How To Create A 5 Figure SEO Business-ZERO Exp... Create a 5 figure SEO business by working for ... Angshuman Dutta Last updated 3/2017 English English [Auto-generated] ... This course will show you how to create a prof... You should be willing to profit from selling S... 2 hours on-demand video|2 Supplemental Resourc... Build a sustainable income selling SEO service... This course is for internet marketers who want... Angshuman-Dutta How To Create A 5 Figure SEO Business-ZERO Exp... ['create', 'figure', 'seo', 'business', 'zero'... ['freelance', 'experience', 'service', 'websit... ['income', 'resource', 'corporate', 'absolutel...
1 1 https://www.udemy.com/courses/business/all-cou... Microsoft Excel for Project Management - Earn... https://www.udemy.com/microsoft-excel-for-proj... Microsoft Excel for Project Management - Earn... Mastering Microsoft Excel for Project Manageme... Joseph Phillips Last updated 3/2016 English English [Auto-generated] ... It's been said that project management is 90... Basics of project management|Basics of Microso... 4.5 hours on-demand video|1 Supplemental Resou... Design reports for your stakeholders|Create a ... Project managers|PMPs|People learning Microsof... Joseph-Phillips Microsoft Excel for Project Management - Earn... ['microsoft', 'excel', 'project', 'management'... ['project', 'manager', 'excel', 'microsoft', '... ['project', 'resource', 'people', 'reporting',...
This is what the keyword column of my DataFrame looks like:
0 [freelance, experience, service, website, free...
1 [project, manager, excel, microsoft, reporting...
2 [income, informational, english, online, exper...
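A side note on the column's type, since it affects any search: the preview above prints the keywords without quotes (which is how real Python lists display), while the full DataFrame dump shows them quoted (which is how a string such as "['a', 'b']" displays). The sketch below assumes either form may occur and normalizes each cell to a set, so that keyword lookups become exact-word membership tests. The helper name `to_keyword_set` and the sample values are made up for illustration, and exact-word matching is not the same as the substring matching a regex search performs.

```python
import ast

import pandas as pd


def to_keyword_set(cell):
    """Turn one cell into a set of keywords, whether it holds a list or its string form."""
    if isinstance(cell, str):
        # Parse the string form of a list, e.g. "['a', 'b']", back into a list.
        cell = ast.literal_eval(cell)
    return set(cell)


# Made-up sample: one real list and one stringified list.
sample = pd.Series([["freelance", "experience"], "['project', 'manager']"])
keyword_sets = sample.map(to_keyword_set)

# True where a row contains both keywords as whole words.
has_both = keyword_sets.map(lambda kws: {"project", "manager"} <= kws)
print(has_both.tolist())
```

Note that `ast.literal_eval` only works on complete list strings; a cell that was truncated on disk would raise an error.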
Here is my code:
import itertools

def bla_bla(model):
    # Read the query and split it into words.
    input_string = input()
    title = input_string.split()
    # Keep the first element (the word) of each result returned by most_similar.
    titles = model.most_similar(title)
    titles_list = []
    for keyword in titles:
        titles_list.append(keyword[0])
    recommended_keywords = titles_list + title
    # This is what recommended_keywords will look like:
    # ['fullstack',
    #  'ror',
    #  'tulsa',
    #  'shrikrishna',
    #  'vanston',
    #  'devtools',
    #  'develoeprs',
    #  'frontend',
    #  'intermidate',
    #  'nunn',
    #  'web',
    #  'developer']
    # Build every subset, but keep only the combinations of exactly three keywords.
    coursat = []
    for duo in range(0, len(recommended_keywords)+1):
        for subset in itertools.combinations(recommended_keywords, duo):
            if len(subset) > 2 and len(subset) <= 3:
                coursat.append(subset)
    my_list = []
    for g in coursat:
        # Rows of the global df whose key_words contain g[0] and g[1] in either order.
        # Only the first two keywords of each combination are searched.
        y = df[df['key_words'].str.contains(".*"+str(g[0])+".*"+str(g[1])+"|"+".*"+str(g[1])+".*"+str(g[0]))]
        if not y.title.empty:
            my_list.append(y.title)
    return my_list
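
One possible direction for the speed question, as a sketch rather than a drop-in replacement: the loop above runs a fresh regex scan of the whole column for every combination, and it only ever uses the first two keywords of each one. Assuming the intent is "find rows where any two of the recommended keywords both occur", each keyword can be scanned once and the resulting boolean masks combined per pair. The function name `find_rows_with_keyword_pairs`, the demo frame, and the choice of plain substring matching are assumptions made for illustration. The sketch also returns one de-duplicated index instead of a list of Series, and it checks every pair of keywords, which is slightly more than the original loop does.

```python
import itertools

import pandas as pd


def find_rows_with_keyword_pairs(df, keywords, column="key_words"):
    """Return index labels of rows whose `column` text contains both
    keywords of at least one keyword pair."""
    text = df[column].astype(str)
    # Scan the column once per distinct keyword (plain substring test, no regex).
    masks = {
        kw: text.str.contains(kw, regex=False)
        for kw in dict.fromkeys(keywords)
    }
    hit = pd.Series(False, index=df.index)
    # Combining two precomputed boolean masks avoids a new scan per pair.
    for first, second in itertools.combinations(masks, 2):
        hit |= masks[first] & masks[second]
    return df.index[hit.to_numpy()]


# Tiny made-up frame, for illustration only.
demo = pd.DataFrame(
    {
        "title": ["Course A", "Course B", "Course C"],
        "key_words": [
            "['web', 'developer', 'frontend']",
            "['excel', 'project']",
            "['web', 'seo']",
        ],
    },
    index=[10, 11, 12],
)
matches = find_rows_with_keyword_pairs(demo, ["web", "developer", "excel"])
print(demo.loc[matches, "title"])
```

Because the masks are reused, the column is scanned once per keyword rather than once per combination, so the cost of the search step grows with the number of keywords instead of the number of combinations.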
This is the output my function should produce:
[2538 Node with React: Fullstack Web Development
Name: title, dtype: object,
2481 Progressive Web Apps (PWA) - The Complete Guide
3447 Progressive Web Apps - The Concise PWA Masterc...
4964 Progressive Web Apps (PWA) - From Beginner to ...
Name: title, dtype: object,
5691 Yii2 Application Development Solutions–Volume 2
Name: title, dtype: object,
3697 HTML5 : Mobile Web App Development
Name: title, dtype: object]
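
Since the stated goal is the index of the matching rows, a list of separate Series like the one above can be collapsed into a single result. The following is a minimal sketch, assuming each list element is a `title` Series indexed by the original row labels. The helper name `merge_title_results` and the shortened titles are made up for illustration.

```python
import pandas as pd


def merge_title_results(results):
    """Concatenate a list of title Series and keep the first hit per row label."""
    if not results:
        return pd.Series(dtype=object, name="title")
    merged = pd.concat(results)
    # The same row can be found by several combinations; keep it once.
    return merged[~merged.index.duplicated(keep="first")].sort_index()


# Made-up stand-ins for the per-combination results.
partial_results = [
    pd.Series({2538: "Node with React"}, name="title"),
    pd.Series({2481: "PWA Guide", 2538: "Node with React"}, name="title"),
]
merged_titles = merge_title_results(partial_results)
print(merged_titles.index.tolist())
```

The row labels are then available as `merged_titles.index`, with each matching course listed once.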
Thanks in advance.