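For an assignment, I need to find the number of sentences (not lines) in a text file. That means each sentence ends with '.', '!' or '?'. After a lot of effort I wrote some code, but it throws an error and I can't see what is wrong. If anyone can help, I would really appreciate it. Thanks.
Here is my code: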
fh1 = fopen(nameEssay); %nameEssay is a string of the name of the file with .txt
line1 = fgetl(fh1);
%line1 holds the title of the essay. It does not count as a sentence
essay = [];
line = ' ';
while ischar(line)
line =fgetl(fh1);
essay = [essay line];
%creates a long string of the whole essay
end
sentenceCount=0;
allScore = [ ];
[sentence essay] = strtok(essay, '.?!');
while ~isempty(sentence)
sentenceCount = sentenceCount + 1;
sentence = [sentence essay(1)];
essay= essay(3:end); %(1st character is a punctuation. 2nd is a space.)
while ~isempty(essay)
[sentence essay] = strtok(essay, '.?!');
end
end
fclose(fh1);
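Answer 0 (score: 3)
If you count sentences based on '.', '!' or '?', you can simply count how many of these characters occur in the essay. So if essay is a character array, you can do: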
essay = 'Some sentece. Sentec 2! Sentece 3? Sentece 4.';
% count number of '.' or '!' or '?' in essay.
sum(essay == abs('.'))
sum(essay == abs('?'))
sum(essay == abs('!'))
% gives 2, 1, 1 respectively. Thus there are 4 sentences in the example.
If you also want the sentences themselves, you can use strsplit, as Dan suggested, for example:
[C, matches] = strsplit(essay,{'.','?', '!'}, 'CollapseDelimiters',true)
% gives
C =
'Some sentece' ' Sentec 2' ' Sentece 3' ' Sentece 4' ''
matches =
'.' '!' '?' '.'
and then count the number of elements in matches. In this example the last element of C is empty; it can easily be filtered out.
Answer 1 (score: 3)
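The counting and filtering steps described above could be sketched as follows. This is only an illustration on the sample string from this answer; the variable name sentenceCount and the use of ismember and strtrim are choices made for the sketch, not part of the original answer. It also assumes every '.', '!' or '?' ends a sentence, so text such as "3.14" or "..." would be overcounted.

```matlab
essay = 'Some sentece. Sentec 2! Sentece 3? Sentece 4.';

% Count all sentence terminators in one pass instead of three separate sums.
sentenceCount = nnz(ismember(essay, '.?!'));

% Split on the same terminators, then drop pieces that are empty
% (or contain only whitespace) after trimming.
[C, matches] = strsplit(essay, {'.', '?', '!'}, 'CollapseDelimiters', true);
C = strtrim(C);
C = C(~cellfun(@isempty, C));

disp(sentenceCount)    % number of terminator characters
disp(numel(matches))   % number of delimiters strsplit found
disp(numel(C))         % number of non-empty sentence pieces
```

For this sample string all three values come out as 4.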
regexp handles this well:
>> essay = 'First sentence. Second one? Third! Last one.'
essay =
First sentence. Second one? Third! Last one.
>> sentences = regexp(essay,'\S.*?[\.\!\?]','match')
sentences =
'First sentence.' 'Second one?' 'Third!' 'Last one.'
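In the pattern '\S.*?[\.\!\?]', \S means that a sentence starts with a non-whitespace character, and .*? matches any number of characters (non-greedy) until a punctuation mark that ends the sentence ([\.\!\?]) is encountered.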
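To get the sentence count from this approach, it should be enough to take numel of the matches. The sketch below also shows a variant pattern with a trailing + that treats a run of punctuation such as "..." or "?!" as a single terminator; that variant is an assumption about what the assignment expects, not something stated in the question. Inside a character class the characters . ! ? do not need escaping, so [.!?] is equivalent to [\.\!\?].

```matlab
essay = 'First sentence. Second one? Third! Last one.';

% One match per sentence, so the number of matches is the sentence count.
sentences = regexp(essay, '\S.*?[.!?]', 'match');
sentenceCount = numel(sentences);
disp(sentenceCount)   % 4 for this sample

% Variant (assumption): treat consecutive terminators as one sentence end.
tricky = 'Wait... Really?! Yes.';
trickySentences = regexp(tricky, '\S.*?[.!?]+', 'match');
trickyCount = numel(trickySentences);
disp(trickyCount)     % 3 for this sample
```

For a real file, the essay string would come from the reading loop in the question (after skipping the title line) rather than from a literal.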