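Counting the number of sentences in a paragraph

Time: 2015-03-20 12:42:09

Tags: python

Hi, I've gotten confused reading all the threads about counting sentences and words. I don't want to open any files; I just want to count the number of words and sentences in a string. I've finished the word count and I'm happy with it, I just don't know where to start on the sentences. Here is what I have so far.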

    import re
    line = (" A Turing machine is a device that manipulates "
            "symbols on a strip of tape according to a table "
            "of rules. Despite its simplicity, a Turing machine "
            "can be adapted to simulate the logic of any computer "
            "algorithm, and is particularly useful in explaining "
            "the functions of a CPU inside a computer. The 'Turing'"
            " machine was described by Alan Turing in 1936, who "
            "called it an""a(utomatic)-machine"". The Turing "
            "machine is not intended as a practical computing "
            "technology, but rather as a hypothetical device "
            "representing a computing machine. Turing machines "
            "help computer scientists understandthe limits of "
            "mechanical computation.")
    print(line)
    print()
    # \w+ matches each run of letters, digits, or underscores as one word
    count = len(re.findall(r'\w+', line))
    print("The number of words in this paragraph:", count)

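The word count comes out to 98, which is perfect. I know there are errors in the paragraph, but they are deliberate, so I can tell it's working correctly. Now I want to count the number of sentences, which should be 5, but I'm not sure how. Any help would be much appreciated.

1 Answer:

Answer 0: (score: 2)

If you are willing to rely on periods as your sentence delimiters, you can simply count the number of periods in the string.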

    line.count('.')

Or use a regular expression, like you are already doing:

    len(re.findall(r'\.', line))
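
If the text might also end sentences with `!` or `?`, the same idea can be extended a little. The following is only a rough sketch, not a real sentence tokenizer: the helper name `count_sentences` and the sample string are made up for illustration, and it assumes every run of `.`, `!` or `?` marks exactly one sentence end.

```python
import re

def count_sentences(text):
    """Count runs of sentence-ending punctuation (., ! or ?) in text."""
    # [.!?]+ matches one or more terminators in a row, so "..." or "?!"
    # is counted as a single sentence end rather than several.
    return len(re.findall(r'[.!?]+', text))

# Hypothetical sample text, not the paragraph from the question.
sample = "Is this a sentence? Yes! It ends here... And so does this."
print(count_sentences(sample))  # prints 4
```

Keep in mind that this still overcounts whenever a period is not a sentence end, for example in abbreviations such as "e.g." or in decimals such as "3.14". For the paragraph in the question that does not matter, since it contains exactly five periods and no other terminators.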