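I'm looking for a web service that can extract the important keywords from a piece of text.
I have already tried the Yahoo Term Extraction service. The problem with it is that it returns no results for short texts.
Alternatively, I could use any available code that extracts the important keywords from a piece of text, i.e. removes all the common words from a string.
For example:
"I want to buy a digital camera"
Terms: "digital", "camera"
Thanks.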
There are two other related Stack Overflow questions with more information:
What is a simple way to generate keywords from a text?
Filter out common words for search query
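Answer 0 (score: 4)
You may want to look at www.opencalais.com (affiliated with Reuters), which offers a web service for this.
Your text "I want to buy a digital camera" returns this RDF/XML document: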
<!--Use of the Calais Web Service is governed by the Terms of Service located at http://www.opencalais.com. By using this service or the results of the service you agree to these terms of service.-->
<!--Relations: GenericRelations
Technology: digital camera-->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:c="http://s.opencalais.com/1/pred/">
<rdf:Description c:allowDistribution="true" c:allowSearch="true" c:calaisRequestID="1ef6064f-283c-4fd4-a922-0ff493c4353a" c:externalID="calaisbridge" c:id="http://id.opencalais.com/SLlKCS2i2mZA3ABrQS0F9Q" rdf:about="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/DocInfo" />
<c:document>
<![CDATA[<Document>
<Date>2009-04-03</Date>
<Body>I want to buy a digital camera</Body>
</Document>]]>
</c:document>
<c:docTitle />
<c:docDate>2009-04-03 00:00:00</c:docDate>
<c:externalMetadata c:caller="calaisbridge" />
<c:submitter>calaisbridge</c:submitter>
</rdf:Description>
<rdf:Description c:contentType="text/txt" c:emVer="UserVocabulariesIM" c:langIdVer="DefaultLangId" c:language="InputTextTooShort" c:processingVer="CalaisJob01" c:submissionDate="2009-04-03 14:14:42.532" rdf:about="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a/meta">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/DocInfoMeta" />
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a" />
<c:submitterCode>416dcd8a-766f-0aa3-d94c-e5034b6ffc98</c:submitterCode>
<c:signature>digestalg-1|sUmdk2pKaXLrsD0b2sNfX5dPvW4=|e+F5sMjqxqj0Qi+efzdG5D2s1TKBM//zH+NI1MNYvugY3FS9e3xP6g==</c:signature>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a/lid/DefaultLangId">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/lid/DefaultLangId" />
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a" />
<c:lang rdf:resource="http://d.opencalais.com/lid/DefaultLangId/InputTextTooShort" />
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/genericHasher-1/e224e552-7ebd-3ed1-aaa4-f8aba30331c2">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/em/e/Technology" />
<c:name>digital camera</c:name>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a/Instance/1">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/InstanceInfo" />
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a" />
<c:subject rdf:resource="http://d.opencalais.com/genericHasher-1/e224e552-7ebd-3ed1-aaa4-f8aba30331c2" />
<!--Technology: digital camera; -->
<c:detection>[ment&gt;&lt;Date&gt;2009-04-03&lt;/Date&gt;&lt;Body&gt;I want to buy a ]digital camera[&lt;/Body&gt;&lt;/Document&gt;]</c:detection>
<c:prefix>ment&gt;&lt;Date&gt;2009-04-03&lt;/Date&gt;&lt;Body&gt;I want to buy a </c:prefix>
<c:exact>digital camera</c:exact>
<c:suffix>&lt;/Body&gt;&lt;/Document&gt;</c:suffix>
<c:offset>55</c:offset>
<c:length>14</c:length>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a/Relevance/1">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/RelevanceInfo" />
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a" />
<c:subject rdf:resource="http://d.opencalais.com/genericHasher-1/e224e552-7ebd-3ed1-aaa4-f8aba30331c2" />
<c:relevance>0.857</c:relevance>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/genericHasher-1/e8eac39c-f280-331e-9ccd-07f740d46ddb">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/em/r/GenericRelations" />
<c:verb>buy</c:verb>
<c:relationsubject>I</c:relationsubject>
<!--digital camera-->
<c:relationobject rdf:resource="http://d.opencalais.com/genericHasher-1/e224e552-7ebd-3ed1-aaa4-f8aba30331c2" />
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a/Instance/2">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/InstanceInfo" />
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/97cdaf47-fa15-31a1-be2b-3be1184d412a" />
<c:subject rdf:resource="http://d.opencalais.com/genericHasher-1/e8eac39c-f280-331e-9ccd-07f740d46ddb" />
<!--GenericRelations: verb: buy; relationsubject: I; relationobject: digital camera; -->
<c:detection>[&lt;Document&gt;&lt;Date&gt;2009-04-03&lt;/Date&gt;&lt;Body&gt;]I want to buy a digital camera[&lt;/Body&gt;&lt;/Document&gt;]</c:detection>
<c:prefix>&lt;Document&gt;&lt;Date&gt;2009-04-03&lt;/Date&gt;&lt;Body&gt;</c:prefix>
<c:exact>I want to buy a digital camera</c:exact>
<c:suffix>&lt;/Body&gt;&lt;/Document&gt;</c:suffix>
<c:offset>39</c:offset>
<c:length>30</c:length>
</rdf:Description>
</rdf:RDF>
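To pull just the keywords out of a response like this, something along these lines might work. It is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `extract_entities` and the trimmed `SAMPLE` string are my own, and treating every `rdf:type` under `.../type/em/e/` as an entity is an assumption inferred from the response above, not something taken from the Calais documentation.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
C_NS = "http://s.opencalais.com/1/pred/"
# Assumption: entity types share this prefix (relations use .../em/r/ instead).
ENTITY_TYPE_PREFIX = "http://s.opencalais.com/1/type/em/e/"


def extract_entities(rdf_xml):
    """Return (entity_type, name) pairs for each entity description."""
    root = ET.fromstring(rdf_xml)
    entities = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        type_el = desc.find(f"{{{RDF_NS}}}type")
        name_el = desc.find(f"{{{C_NS}}}name")
        if type_el is None or name_el is None:
            continue  # metadata, instance and relevance blocks carry no c:name
        resource = type_el.get(f"{{{RDF_NS}}}resource", "")
        if resource.startswith(ENTITY_TYPE_PREFIX):
            entity_type = resource[len(ENTITY_TYPE_PREFIX):]
            entities.append((entity_type, name_el.text))
    return entities


# Trimmed copy of the entity block from the response above.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:c="http://s.opencalais.com/1/pred/">
<rdf:Description rdf:about="http://d.opencalais.com/genericHasher-1/e224e552-7ebd-3ed1-aaa4-f8aba30331c2">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/em/e/Technology" />
<c:name>digital camera</c:name>
</rdf:Description>
</rdf:RDF>"""

print(extract_entities(SAMPLE))  # [('Technology', 'digital camera')]
```

Note that the service returns "digital camera" as a single term rather than the two separate words asked for in the question.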
Answer 1 (score: 1)
I know some people have had some success with the WordsFinder service.
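If no service handles short texts well, the fallback mentioned in the question (removing the common words from the string) can be sketched in a few lines of Python. The stop-word list here is hypothetical and deliberately tiny. "want" and "buy" are on it only so that the question's example comes out as asked; a general-purpose list would probably leave them in. The function name `extract_keywords` is my own.

```python
import re

# Hypothetical stop-word list, for illustration only. "buy" and "want" are
# included because they add nothing to a shopping query; adjust per domain.
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "for", "i", "in", "is", "it", "of", "on",
    "the", "to", "buy", "want",
})


def extract_keywords(text, stop_words=STOP_WORDS):
    """Return the non-stop words of `text`, in order, without duplicates."""
    keywords = []
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        if word not in stop_words and word not in keywords:
            keywords.append(word)
    return keywords


print(extract_keywords("I want to buy a digital camera"))  # ['digital', 'camera']
```

This only filters words. It does not rank them or detect multi-word terms such as "digital camera".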