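So, I found this code by Nigel Garvey here, and I'd like to add an ignore list to it, something like set wordsToIgnore to {"and", "the", "a", "for", "in", "is"}.
The problem is that I'm usually out of my depth with this kind of thing. Could the powers that be take pity and show me how to add an ignore list? I've tried various kinds of frequency counters, but this one produces the right style of output in TextEdit and can limit the output to a given number of words. What it lacks is the ability to ignore certain words. Best regards.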
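Edit: I posted earlier with similar tags, but since I'm working with a different script, I thought it best to start a new post. My apologies if that was the wrong thing to do.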
Answer 0 (score: 0)
I haven't tested this, but after a quick look, this is my idea. Change this section of the "on main(pdfFile)" handler to the following...
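-- Go through the sorted list, counting the instances of each word. Store each word and its score in a list in the 'scores' list in the script object above.
-- Words in wordsToIgnore are skipped. AppleScript string comparison ignores case by default, so "The" is skipped too.
set wordsToIgnore to {"and", "the", "a", "for", "in", "is"}
-- Start with no current word, so that the first word is also checked against the ignore list.
set currentWord to missing value
set c to 0
repeat with i from 1 to (count o's wrds)
	set thisWord to item i of o's wrds
	if thisWord is not in wordsToIgnore then
		if (thisWord is currentWord) then
			set c to c + 1
		else
			-- A new word begins: store the finished one, if there is one.
			if currentWord is not missing value then set end of o's scores to {currentWord, c}
			set currentWord to thisWord
			set c to 1
		end if
	end if
end repeat
-- Store the last word counted, unless every word was ignored.
if currentWord is not missing value then set end of o's scores to {currentWord, c}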
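If you want to try the ignore-list idea on its own before touching the full script, here is a minimal standalone sketch. The handler name countWords and the sample word list are made up for illustration and are not part of Nigel Garvey's script. It assumes the word list is already sorted, as it is in the original handler, and it starts with no current word so that an ignored word sitting first in the list (quite likely with "a") is never counted.

```applescript
-- Hypothetical helper, not part of the original script.
-- Counts runs of equal words in an already-sorted list,
-- skipping every word that appears in ignoreList.
on countWords(sortedWords, ignoreList)
	set scores to {}
	set currentWord to missing value -- no word seen yet
	set c to 0
	repeat with wordRef in sortedWords
		set thisWord to contents of wordRef
		if thisWord is not in ignoreList then
			if thisWord is currentWord then
				set c to c + 1
			else
				-- A new word begins: store the previous one, if any.
				if currentWord is not missing value then set end of scores to {currentWord, c}
				set currentWord to thisWord
				set c to 1
			end if
		end if
	end repeat
	-- Store the last word counted, unless every word was ignored.
	if currentWord is not missing value then set end of scores to {currentWord, c}
	return scores
end countWords

-- Sample data, invented for this example.
countWords({"a", "a", "cat", "cat", "dog", "the"}, {"and", "the", "a", "for", "in", "is"})
-- Result: {{"cat", 2}, {"dog", 1}}
```

Run it in Script Editor and the result pane should show the two remaining words with their counts. Because AppleScript compares strings without regard to case by default, "The" and "the" are treated as the same word here.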