Counting the words and lines that have several vowels in Python

Asked: 2019-05-24 18:36:16

Tags: python string optimization split substring

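I am working on this coding problem: "Count the number of words & lines that have more than X vowels for every Y words in every Z line."

Basically, the input string has multiple lines, and I need to count the words in it that have X or more vowels. The constraint is that I should consider only every Zth line, and within those lines only every Yth word. For example, say I need to count every 3rd word that has 2 or more vowels in every 3rd line. So here X=2, Y=3, Z=3. Consider the input string below: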

"1.When I first brought my cat home.
 2.It cost a lot to adopt her.
 3.I paid forty dollars for it.
 4.And then I had to buy litter, a litterbox.
 5.Also bought food, and dishes for her to eat out of. 
 6.There's a **leash** law for cats in Fort **Collins**.
 7.If they're not in your yard they have to be on a leash. 
 8.Anyway, my cat is my best friend. 
 9.I'm glad I got her. 
 10.She sleeps under the covers with me when it's cold."

Output: word count: 2, line count: 1

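Since Z=3, i.e. every 3rd line is counted, the lines to consider are lines 3, 6 and 9. Within those lines we look at every 3rd word (Y=3), so the words to consider are "forty, it" from line 3, "leash, cats, Collins" from line 6, and "I" from line 9. Under these conditions, the only words with 2 or more vowels are "leash" and "Collins", both in line 6, so the output is WordCount = 2 and LineCount = 1.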
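The selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `selected_words` is made up, words are taken to be whitespace-separated tokens (so punctuation stays attached), and the emphasis markers around "leash" and "Collins" are left out of the sample text.

```python
# Sample input from the question, without the emphasis markers.
text = """1.When I first brought my cat home.
2.It cost a lot to adopt her.
3.I paid forty dollars for it.
4.And then I had to buy litter, a litterbox.
5.Also bought food, and dishes for her to eat out of.
6.There's a leash law for cats in Fort Collins.
7.If they're not in your yard they have to be on a leash.
8.Anyway, my cat is my best friend.
9.I'm glad I got her.
10.She sleeps under the covers with me when it's cold."""


def selected_words(text, y, z):
    """Map each z-th line number to the list of every y-th word on that line.

    Hypothetical helper for illustration; words are whitespace-separated
    tokens, so trailing punctuation is kept.
    """
    result = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if number % z == 0:
            # Index y-1 is the y-th word; step y picks every y-th one after it.
            result[number] = line.split()[y - 1::y]
    return result


for number, words in selected_words(text, 3, 3).items():
    print(number, words)
```

With Y=3 and Z=3 this picks words from lines 3, 6 and 9 only, which matches the hand-worked selection.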

This is the first time I have written anything in Python, so I wrote the following basic code:

class StringCount:  #Count the number of words & lines that have more than X vowels for every Y words in every Z line. 

    lines = list();
    totalMatchedLines = 0;
    totalMatchedWords = 0;
    matchedChars = 0;

    def __init__(self, inputString, vowelCount, skipWords, skipLines, wordDelimiter, lineDelimiter):
      self.inputString = inputString;
      self.vowelCount = vowelCount;
      self.skipWords = skipWords;
      self.skipLines = skipLines;
      self.wordDelimiter = wordDelimiter;

    def splitLines(self):
      if self.inputString.strip() == "":
        print ("Please enter a valid string!");
        return False;      
      self.lines = self.inputString.splitlines();

    def splitWords(self):      
      self.matchedWords = 0;
      self.matchedLines = 0;
      self.linesLength = len(self.lines);

      if self.linesLength < self.skipLines:
        print ("Input string should be greater than {0}" .format(self.skipLines));
        return False;

      lineCount = self.skipLines - 1;
      wordCount = self.skipWords - 1;
      lineInUse = "";
      words = list();

      while (lineCount < self.linesLength):
        self.matchedWords = 0;
        self.matchedLines = 0;
        self.words = self.lines[lineCount].split();
        self.wordsLength = len(self.words);
        wordCount = self.skipWords - 1;

        while (wordCount < self.wordsLength):
          self.matchedChars = 0;       
          for i in self.words[wordCount].lower():
            if(i=='a' or i=='e' or i=='i' or i=='o' or i=='u'):
              self.matchedChars += 1;              
          if self.matchedChars >= self.vowelCount:
            self.matchedWords += 1;
          wordCount += self.skipWords;

        if self.matchedWords > 0:
          self.matchedLines += 1;

        self.totalMatchedWords += self.matchedWords;
        self.totalMatchedLines += self.matchedLines;
        lineCount += self.skipLines;

      print ("WordCount = %s" % (self.totalMatchedWords));
      print ("LineCount = %s" % (self.totalMatchedLines));

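Since this is my first Python code, I would like to know how to optimize it, both for performance and for the number of lines. Are there any tricks to shorten the multiple while loops and the for loop?

Thanks for your help!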

1 Answer:

Answer 0 (score: 0)

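This can be done more compactly with a translation table. If you convert every vowel in each word to the letter "a", you can then count the vowels with the count() method. The rest is just a matter of going through the list of lines and the list of words, which you can do with list comprehensions and slices with a step: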

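# lines is assumed to be the input split into lines, e.g. inputString.splitlines()
vowels     = str.maketrans("aeiouAEIOU","aaaaaaaaaa")
X,Y,Z      = 2,3,3
# split() ignores leading/trailing spaces; [Y-1::Y] starts at the Y-th word
counts     = [sum(word.translate(vowels).count("a")>=X for word in line.split()[Y-1::Y]) for line in lines[Z-1::Z]]
lineCount  = sum(count>0 for count in counts)
wordCount  = sum(counts)

print(lineCount,wordCount) # 1 2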
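The same slicing idea can also be wrapped in a function, which may be easier to reuse and test. The following is a minimal sketch, not a definitive implementation: the name `count_vowel_words` and its parameters are made up for illustration, vowels are counted with a set membership test instead of a translation table, and words are taken to be whitespace-separated tokens. The sample text is the input from the question without the emphasis markers.

```python
def count_vowel_words(text, x, y, z):
    """Return (word_count, line_count).

    Looks at every y-th word of every z-th line and counts the words
    that contain at least x vowels, plus the lines with at least one
    such word. Hypothetical helper for illustration.
    """
    vowels = set("aeiouAEIOU")
    word_count = 0
    line_count = 0
    # z-1 is the index of the z-th line; step z picks every z-th line.
    for line in text.splitlines()[z - 1::z]:
        matches = sum(
            1
            for word in line.split()[y - 1::y]
            if sum(ch in vowels for ch in word) >= x
        )
        word_count += matches
        if matches:
            line_count += 1
    return word_count, line_count


sample = """1.When I first brought my cat home.
2.It cost a lot to adopt her.
3.I paid forty dollars for it.
4.And then I had to buy litter, a litterbox.
5.Also bought food, and dishes for her to eat out of.
6.There's a leash law for cats in Fort Collins.
7.If they're not in your yard they have to be on a leash.
8.Anyway, my cat is my best friend.
9.I'm glad I got her.
10.She sleeps under the covers with me when it's cold."""

print(count_vowel_words(sample, 2, 3, 3))  # (2, 1)
```

Returning a tuple instead of printing keeps the counting separate from the output, so the caller decides how to report WordCount and LineCount.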