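Removing characters from a txt file using Python

Asked: 2014-03-31 19:55:19

Tags: python string character

I'm writing a program in Python that asks the user for a file name, opens the file, counts the number of M's and F's, and reports them as a ratio. I can do that, and I can remove whitespace, but I can't figure out how to remove the characters that are not M or F. I want to remove all the incorrect characters and write them to a new file. Here is what I have so far: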

fname = raw_input('Please enter the file name: ')  #Requests input from user
try:                                                #Makes sure the file input     is valid
   fhand = open(fname)
except:
   print 'Error. Invalid file name entered.'
   exit()
else:
  fhand = open(fname, 'r')            #opens the file for reading

  entireFile = fhand.read()           
  fhand.close()
  entireFile.split()           #Removes whitespace
  ''.join(entireFile)         #Rejoins the characters

  entireFile = entireFile.upper() #Converts all characters to capitals letters

  males = entireFile.count('M')
  print males
  females = entireFile.count('F')
  print females
  males = float(males)
  females = float(females)
  length = males + females
  print length
  length = float(length)
  totalMales = (males / length) * 1
  totalFemales = (females / length) * 1

  print "There are %", totalMales, " males and %", totalFemales, " in the file."

3 Answers:

Answer 0 (score: 2)

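The simplest way is to use a regular expression:

import re
data = re.findall(r'[FM]', entireFile)

If you use r'[FMfm]' instead, you don't need to uppercase the whole file; the pattern matches both uppercase and lowercase letters.

This returns every F and M in the text, with no need to strip the whitespace first.

Example:

entireFile = "MLKMADG FKFLJKASDM LKMASDLKMADF MASDLDF"
data = re.findall(r'[FM]', entireFile)
# data is now ['M', 'M', 'F', 'F', 'M', 'M', 'M', 'F', 'M', 'F']

You can then do whatever you want with this list.

Hope this helps.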
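As a rough sketch of how the list from `re.findall` could feed the ratio calculation in the question: the snippet below is written for Python 3 (the code in this thread is Python 2), and the helper name `gender_ratio` is made up for illustration.

```python
import re


def gender_ratio(text):
    """Return (males, females, male_share, female_share) for the M/F characters in text.

    Hypothetical helper, not part of the original answer.
    """
    # [FMfm] matches either case, so no upper() call is needed beforehand.
    data = re.findall(r'[FMfm]', text)
    males = sum(1 for ch in data if ch in 'Mm')
    females = len(data) - males
    total = len(data)
    if total == 0:
        # Avoid dividing by zero when the text contains no M or F at all.
        return 0, 0, 0.0, 0.0
    return males, females, males / total, females / total


if __name__ == '__main__':
    sample = "MLKMADG FKFLJKASDM LKMASDLKMADF MASDLDF"
    print(gender_ratio(sample))
```

Counting from the list avoids the separate `float()` conversions in the question, since `/` is always true division in Python 3.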

Answer 1 (score: 1)

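m, f, other = [], [], []
for ch in entireFile:        # sort every character into one of three lists
    if ch == "M":
        m.append(ch)
    elif ch == "F":
        f.append(ch)
    else:
        other.append(ch)

# len() returns an int, so convert it before concatenating with strings
print str(len(m)) + " Males, " + str(len(f)) + " Females"
print "Other:", other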
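If only the counts are needed, the same single pass could be sketched with `collections.Counter`. This is a Python 3 variant rather than the answer's own code; the function name `tally` and the choice to ignore whitespace among the "other" characters are assumptions made for illustration.

```python
from collections import Counter


def tally(text):
    """Return (males, females, others) for text, ignoring case.

    Hypothetical helper: `others` is the sorted list of distinct
    non-whitespace characters that are neither M nor F.
    """
    counts = Counter(text.upper())
    males = counts['M']      # a Counter returns 0 for keys it has not seen
    females = counts['F']
    others = sorted(
        ch for ch in counts
        if ch not in ('M', 'F') and not ch.isspace()
    )
    return males, females, others


if __name__ == '__main__':
    print(tally("mf x\nM?"))
```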

Answer 2 (score: 1)

Use a regular expression to strip out the M's and F's, leaving every character that is not M or F:

import re
remainder = re.sub(r'M|F', '', entireFile)
with open('new_file', 'wb') as f:
    f.write(remainder)
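Putting the pieces together, one possible sketch keeps the valid characters for counting and writes the rejected ones to a separate file, which is what the question asks for. It targets Python 3; the function names and the decision to discard whitespace rather than treat it as an invalid character are assumptions, not something the thread specifies.

```python
import re


def split_valid_invalid(text):
    """Return (valid, invalid): the M/F characters and everything else.

    Hypothetical helper. Whitespace is dropped entirely instead of
    being reported as an invalid character.
    """
    compact = ''.join(text.split())                   # remove all whitespace
    valid = ''.join(re.findall(r'[MFmf]', compact))   # keep only M/F, either case
    invalid = re.sub(r'[MFmf]', '', compact)          # whatever is left over
    return valid, invalid


def write_invalid(text, out_path):
    """Write the rejected characters to out_path and return the valid ones."""
    valid, invalid = split_valid_invalid(text)
    with open(out_path, 'w') as out:
        out.write(invalid)
    return valid


if __name__ == '__main__':
    # 'rejected.txt' is a placeholder file name.
    kept = write_invalid("MxF y\nm", 'rejected.txt')
    print(kept)
```

Text mode (`'w'`) is used here because in Python 3 writing a `str` to a file opened with `'wb'` raises a `TypeError`.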