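I am trying to do text classification in Python with a Naive Bayes classifier, and it works well when there are only two labels, "negative" and "positive". I have sample data of about 300 sentences that I want to label based on the Moore Bygrave model, so essentially I want to map these 300 sentences to around 10 labels. However, when I train the model and try to predict, the answer is always the single label "Commitment". I am confused; can anyone guide me and point me to some references on how to handle complex text classification problems in Python?
The code is shown below: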
from textblob.classifiers import NaiveBayesClassifier
from textblob import TextBlob
train = [
('Less chance of success','Achievement'),
('It comes with a lot of paperwork that can take up time and energy','Experience'),
('High competition in the market','Risk taking'),
('It can be lonely and scary to be completely responsible','Risk taking'),
('If business slows down, your personal income can be at risk','Risk taking'),
('it requires more work and longer hours than being an employee','Experience'),
('your college degrees won’t help you to succeed in business','Experience'),
('Not willing to take risks','Risk taking'),
('It does not guarantee percent success','Achievement'),
('Fear of failure','Risk taking'),
('Chances of low profit','Economy'),
('Hurdles in reaching potential customer','Achievement'),
('Profit maximization doesn’t guarantee pursuit of one’s interests','Personal Values'),
('Unwanted feedback from family','Family'),
('Not caring about customers in the long run is contrary to my ethical standards','Personal Values'),
('Not getting appropriate recognition soon','Achievement'),
('Lack of agility in personality','Commitment'),
('Difficult to save money efficiently','Economy'),
('Requires good money management skills which one doesn’t develop easily','Experience'),
('Hassle of legal work','Lawyers'),
('Fear of being sued','Lawyers'),
('Hate to delegate tasks','Manager'),
('Tendency to make same mistakes','Locus of Control'),
('Hate taking risks','Risk taking'),
('Not persistent enough to keep going despite failures initially','Commitment'),
('Sounds like a boring idea','Personal Values'),
('Too much hard work','Commitment'),
('Time management is difficult','Manager'),
('Lack of resources','Resources'),
('Raising capital is a difficult task','Resources'),
('Afraid of law suits','Lawyers'),
('No clear goal in mind','Locus of Controly'),
('Lack of accountability upon which is bad for self-assessment','Competitors'),
('Marketing is expensive','Economy'),
('Target customers are hard to reach','Customers'),
('Return is uncertain','Risk Taking'),
('Fear of rejection','Achievment'),
('Struggling in the market','Strategy'),
('Strategy formulation is a strenuous task','Negative'),
('Failing a business is ugly','Risk taking'),
('Fear of ending behind bars','Risk taking'),
('Feels secluded from the world','Personal Vlues'),
('Managing everything alone is a headache','Manager'),
('Supervision is a difficult job','Manager'),
('No more chit chatting with co employees','Team'),
('No flexible timings','Locus of Control'),
('Lack of motivation','Commitment'),
('Pricing is a headache','Economy'),
('High tax rates','Economy'),
('No pension funds','Economy'),
('Miss gratuity benefits','Economy'),
('No Provident fund is too disadvantageous','Economy'),
('I feel bored','Personal Values'),
('Too many ups and downs','Experience'),
('So stressful','Personal Values'),
('Uncertain income','Rist taking'),
('Way too much hard work and investment initially','Risk taking'),
('Might make one lose sight of his/her passions','Vision'),
('No time for family','Family'),
('Goodbye to hobbies','Resources'),
('Lack of structure','Structure'),
('Can’t find meaning in it','Vision'),
('Competition is terrific','Competitors'),
('Wouldn’t prefer to invest life long savings in something risky/','Risk taking'),
('Deal with unending collections every day','Resources'),
('Partners can back out','Team'),
('Promises get broken','Vision'),
('The sole purpose of businesses-profit maximization- doesn’t care about people','Personal Values'),
('Hours can be a killer','Commitment'),
('It is unethical','Personal Values'),
('Regulations are a headache','Government Policy'),
('Government is troublesome','Government Policy'),
('I require a peaceful life','Personal Values'),
('Unending paperwork','Resources'),
('Not creative enough','Creativity'),
('Responsibility is just too much','Leader'),
('Doesn’t make me happy','Job dissatisfaction'),
('Dislike power','Leader'),
('Less days off','Commitment'),
('Its shallow work','Personal Values'),
('Customers ultimately leave','Customers'),
('It fails oftentimes since innovation is not an easy task','Creativity'),
('Can’t deal with difficult clients','Customers'),
('Taking a giant financial risk','Risk taking'),
('Fear of ruining resume in case of failure','Risk taking'),
('Depressing work','Personal Values'),
('Uncertain environment','Culture'),
('Cruel competitors','Competitors'),
('Lack of trust on people','Team'),
('Fewer vacations','Commitment'),
('Legal actions can make one stay awake at night','Lawyers'),
('Can’t call in sick','Commitment'),
('Desire a steady income','Resouces'),
('Can’t follow ‘Customer is always right’','Customers'),
('Fear of failure remains there always','Achievement'),
('There is no sick pay','Commitment'),
('Think about it ','Vision'),
('Family tensions may arise in case of family business','Family'),
('Unneeded advices are a headache','Advisors'),
('Last to get paid','Economy'),
('risk of loss','Risk taking'),
('almost whole day work initially people usually give','Commitment'),
('motivating employees and getting new once can be hectic','Team'),
('lazy personality','Commitment'),
('if business doesnt go well there is loss of money and time','Achievement'),
('you need to learn first about business related tools like marketing and all','Negative'),
('you should have links which can help you out or else its v v difficult','Networks'),
('it takes time to establish business if you dont have time or smth then its useless','Commitment'),
('coming up with new and creative idea can be hectic and if a idea is not good business will be in loss','Creativity'),
('people can be difficult to deal with','Team'),
('Strenuous work','Commitment'),
('High risk','Risk taking'),
('Will need a large sum of money to invest','Economy'),
('If the business incurs loss I alone will have to make compensations','Economy'),
('Irregular salary because I ll be making payments to my employees first','Economy'),
('Involves responsibility','Leader'),
('I dont have any prior knowledge or experience about running a business','Experience'),
('Will have to face immense competition during initial years','Competitors'),
('Makes one less disciplined','Locus of Control'),
('Lack of structure in work','Structure'),
('Doesn’t suit my personality','Personal Values'),
('No fixed work timings','Commitment'),
('staying late night at office','Commitment'),
('Must prioritize it','Manager'),
('I can be lazy sometimes which will impact my business performance','Commitment'),
('In Pakistan business individuals are taxed higher than salaried individuals','Government Policy'),
('In Pakistan businessmen often find themselves in toght spots where they often mo option but to deal with corrupt individuals in order to continue their business','Teams'),
('You may not have the precise leadership skills that are required','Experience'),
('Startups often tend to ask for more time and attention than do jobs','Commitment'),
('Finances may be hard to come by','Resources'),
('You may not be able to give time to your family and friends','Family'),
('You may not have the relevant organization skills needed for business execution','Experience'),
('It may sound like a wonderful idea to put together your own team, but finding the right people may be difficult','Team'),
('Boring work involved','Personal Values'),
('The employees who suit you may not want to work for you due to the lower pay, fringe benefits, or due to your business having a lack of a brand name','Economy'),
('The stress is great, and you may not find all the stress and hard work worth it if you don’t generate enough profits or satisfaction from your organization','Job Dissatisfaction'),
('Sure/ Stable/ Risk less Income','Oppurtunity Recognition '),
('Higher tax bracket','Economy'),
('Promise to receive pension after retirement','Government Policy'),
('No ownership or responsibility of ownership ie no need to worry about performance of company','Entreprenuer'),
('Want a tension free life','Personal Values'),
('Free Medical Treatment offered by most companies','Resources'),
('It will stress me out','Commitment'),
('You take all the risks','Risk taking'),
('When there is a loss you have to bare all of it','Risk taking'),
('You need to be the most responsible person on the ground','Leader'),
('Your decisions are vital and can affect the whole business, so wrong decisions at times can affect alot','Leader'),
('You also need to think about all your subordinates so mostly your decisions are influencedd by others','Leader'),
('You need to pay for all the expenses and all the taxes','Economy'),
('You are answerable to the government for any misconduct or illegal activity of your business','Government Policy'),
('Never prefer taking risks','Risk taking'),
('Other opportunities','Oppurtunities'),
('Always need a certain lead to follow','Role Models'),
('Think i can capitalise on my capablities in services sector','Experience'),
('Can work from home -','Opportunities'),
('Need a hard separation between my private life and working life','Personal Values'),
('Think there are high tax rates in Pakistan','Government Policy'),
('Belong to family which is least interested in doing business','Family'),
('Dont have initial amount to invest','Resources'),
('Am never best suited for running my own business','Entreprenuer'),
('Think true competence is in the job market','Competitors'),
('Think a business startup requires 100 percent dedication','Commitment'),
('I Cant dedicate my all the time and effort to a single cause','Commitment'),
('salary is not fixed','Economy'),
('Chances of making losses','Risk taking'),
('Too much burden','Commitment'),
('Rivalry/competition','Competitors'),
('Fear of not being able to manage well','Manager'),
('Extortion money threats','Government Policy'),
('Want to give more time to family','Family'),
('Business requires startup capital','Resources'),
('Complications of documents and stuff','Resources'),
('I have other better options','Personal Values '),
('vast responsibility','Leader'),
('All the stake is yours, one wrong decision and you are gone','Leader'),
('Availability at all times','Commitment'),
('Fewer vacations','Personal Values'),
('I want fixed salary','Economy'),
('Hate working all the time','Commitment'),
('It is exciting, each day is filled with new opportunities','Oppurtunites'),
('I can schedule my work hours around other commitments','Commitment'),
('It allows one to set his/her own earnings','Economy'),
('You choose the work you like to do and that makes the most of your strengths and skills','Experience'),
('It gives a great amount of freedom','Entreprenuer'),
('There is room to implement one’s own ideas','Creativity'),
('high job satisfaction','Achievement'),
('You can pursue your passion','Vision'),
('You can Watch your organization grow from start to finish','Achievement'),
('You decide who to hire and bring into your company','Leader'),
('It gives the opportunity to network with other entrepreneurs and professionals','Networks'),
('Through business, one can help people by preparing products or giving services to improve their life','Products'),
('Sell how you want to sell','Vision'),
('It helps discover new perspective and approaches','Opportunities'),
('connection with clients','Networks'),
('development of skills','Experience'),
('Develop new skill set','Experience'),
('feeling of self-satisfaction','Achievement'),
('Control over destiny','Vision'),
('Easy work','Commitment'),
('Area for innovation','Creativiy'),
('People at work become family like','Culture'),
('No dress codes','Culture'),
('Opportunity to change lives','Oppurtunities'),
('Control over workspace','Structure'),
('Shot of adrenaline after reaching a goal','Achievement'),
('Becoming a role model for others through success in business ventures','Role Models'),
('No age barrier','Culture'),
('Freedom to travel','Resources'),
('Less boredom','Personal Values'),
('Good utility of intelligence','Creativity'),
('Going cubicle free','Leader'),
('Pride in calling oneself a business owner','Entreprenuer'),
('No feeling of worthlessness','Achievement'),
('It helps create something from nothing','Opportunity Recognition'),
('It makes things happen!','Opportunity Recognition'),
('Adjust schedules and spend more time with friends and family','Family'),
('Output aligns with input','Commitment'),
('Involving family in work is a lovely idea','Family'),
('Give back to society','Opportunity recognition'),
('It helps become healthy both mentally and physically','Achievement'),
('Gives Free time which can be invested in hobbies','Leader'),
('It doesn’t make you frantically check the time','Entreprenuer'),
('It fulfills thrill for challenges','Risk taking'),
('There is no reporting to a boss','Leader'),
('No repetition in work','Personal Values'),
('Business provides room to utilize your creative skills','Creativity'),
('Meeting brilliant minds','Role Models'),
('Chance to create a legacy','Opportunities'),
('Life requires flexibility','Vision'),
('It is fascinating','Opportunities'),
('It helps make people happy','Teams'),
('Unlimited room for growth','Structure'),
('Earn doing what you love','Personal Values'),
('Its great to feel appreciated','Opportunity Recognition'),
('Build own security','Leader'),
('Learn constantly','Education'),
('Create your own job','Entreprenuer'),
('No stress of being fired','Leader'),
('Bad days aren’t as bad as with jobs','Job Dissatisfaction'),
('No requirement of degree or qualifications','Education'),
('No boundaries','Opportunities'),
('Satisfy curiosity','Opportunity Recognition'),
('An end to boring meetings','Team'),
('Media acknowledgement/fame','Vision'),
('Feeling of becoming a provider','Vision'),
('The opportunity to create a corporate culture','Culture'),
('Experience personal growth','Leader'),
('Become expert in problem solving','Entreprenuer'),
('No more boring work feels','Personal Values'),
('Endless experiences','Opportunities'),
('Dream big, achieve big','Economy'),
('Compete with yourself','Competitors'),
('Bad luck is not a thing here','Commitment'),
('Gives a happy feeling','Achievement'),
('It is adventurous','Risk taking'),
('Respect in society','Family'),
('Name in market','Networks'),
('Develops interpersonal skills','Experience'),
('Recognition feels great','Networks'),
('Satisfies passion for learning','Education'),
('Appreciation for unconventional ideas','Creativity'),
('Can provide ways to change the world','Economy'),
('Unlimited earning possibilities','Opportunties'),
('Family pressure','Family'),
('Leave an impact for generations','Family'),
('Locate your office wherever you want','Structure'),
('No more bossy heads','Leader'),
('Thrill for risks!','Risk taking'),
('No other choice','Job Loss'),
('Admire this work','Personal Values'),
('It is cool','Personal Values'),
('Peer pressure','Networks'),
('Drive to do something different','Creativity'),
('Defining the work objectives','Structure'),
('Opportunity to implement creativity','Creativity'),
('Greater autonomy','Leader'),
('Getting interviewed sounds exciting','Opportunities'),
('Change the market','Vision'),
('Sense of accomplishment','Achievement'),
('Stress-free life after some years','Commitment'),
('The idea of freedom is appealing','Leader'),
('More spare time','Family'),
('Story to tell people','Networks'),
('Tax benefits through expenses','Economy'),
('Pride in creating something','Creativity')]
# Train the classifier on the labeled sentences
cl = NaiveBayesClassifier(train)
# Classify some text; the trailing comments give the labels I expect
print(cl.classify('Need no capital to implement ideas'))  # expected: "Resources"
print(cl.classify("Keep experiencing new things."))  # expected: "Creativity"
print(cl.classify("Money is good."))  # expected: "Resources"
print(cl.classify("Less strenuous"))  # expected: "Personal Values"
None of these should be labeled "Commitment". Please suggest how I should fix the code so that the text is classified correctly. Thanks.
Answer 0 (score: 2)
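The problem is not in the code but in the data. You have a lot of classes (48) and only 285 training examples. Of these, Commitment alone has 28 examples. That is why the model assigns a high score to that class.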
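One way to see this kind of imbalance is to count the examples per label before training. The snippet below is only a minimal sketch using a small excerpt of the question's data; the helper name `label_counts` is made up for illustration. The excerpt also shows that some labels in the question differ only in case, spelling, or trailing whitespace, and each such variant is counted as its own class.

```python
from collections import Counter

# A small excerpt of the (sentence, label) pairs from the question,
# including a few labels that differ only in spelling, case, or whitespace.
sample = [
    ('Not willing to take risks', 'Risk taking'),
    ('Return is uncertain', 'Risk Taking'),
    ('Uncertain income', 'Rist taking'),
    ('Too much hard work', 'Commitment'),
    ('Lack of motivation', 'Commitment'),
    ('Partners can back out', 'Team'),
    ('It helps make people happy', 'Teams'),
    ('I have other better options', 'Personal Values '),
    ('So stressful', 'Personal Values'),
]


def label_counts(pairs):
    """Return a Counter mapping each label string to its number of examples."""
    return Counter(label for _, label in pairs)


counts = label_counts(sample)

# Print labels from most to least frequent; repr() makes trailing spaces visible.
for label, n in counts.most_common():
    print(f"{label!r}: {n}")

print("distinct labels:", len(counts))
```

Running the same count on the full `train` list should make it easy to spot both the dominant labels and the near-duplicate label names.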
Add more examples, and also try to balance the data (roughly the same number of examples per label). You can also try removing stop words.
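The two preprocessing ideas above could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive recipe: the stop-word set is a tiny hand-written placeholder, `remove_stop_words` and `cap_per_label` are hypothetical helper names, and balancing is done here by downsampling, which is only one of several options.

```python
import random
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption); a real list would be longer.
STOP_WORDS = {"a", "an", "the", "is", "it", "of", "to", "in", "and", "be", "can"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text and drop whitespace-separated tokens in stop_words."""
    return " ".join(w for w in text.lower().split() if w not in stop_words)


def cap_per_label(pairs, max_per_label, seed=0):
    """Downsample so that no label keeps more than max_per_label examples."""
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    by_label = defaultdict(list)
    for text, label in pairs:
        by_label[label].append(text)
    balanced = []
    for label, texts in by_label.items():
        if len(texts) > max_per_label:
            texts = rng.sample(texts, max_per_label)
        balanced.extend((t, label) for t in texts)
    return balanced


# A small excerpt of the question's data, with one over-represented label.
sample_train = [
    ('Too much hard work', 'Commitment'),
    ('Lack of motivation', 'Commitment'),
    ('Hours can be a killer', 'Commitment'),
    ('Less days off', 'Commitment'),
    ('Hassle of legal work', 'Lawyers'),
    ('Fear of being sued', 'Lawyers'),
    ('Lack of resources', 'Resources'),
    ('Raising capital is a difficult task', 'Resources'),
]

cleaned = [(remove_stop_words(text), label) for text, label in sample_train]
balanced = cap_per_label(cleaned, max_per_label=2)

# Show how many examples each label keeps after capping.
print(Counter(label for _, label in balanced))
```

The resulting list has the same `(text, label)` shape as `train` in the question, so it could be passed to `NaiveBayesClassifier` in the same way. Downsampling shrinks an already small dataset, so collecting more examples for the rare labels is likely the more useful step.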