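Please help!
I am converting a text file that contains multiple lines to Pig Latin.
Example: the Pig Latin translation of "This is an example." should be "Histay siay naay xampleeay."
I need to keep any punctuation (in most cases at the end of a sentence). I also need any word that starts with a capital letter in the original to start with a capital letter in the Pig Latin version, with the remaining letters in lowercase.
Here is my code: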
def main():
    fileName = input('Please enter the file name: ')
    validate_file(fileName)
    newWords = convert_file(fileName)
    print(newWords)

def validate_file(fileName):
    try:
        inputFile = open(fileName, 'r')
        inputFile.close()
    except IOError:
        print('File not found.')

def convert_file(fileName):
    inputFile = open(fileName, 'r')
    line_string = [line.split() for line in inputFile]
    for line in line_string:
        for word in line:
            endString = str(word[1:])
            them = endString, str(word[0:1]), 'ay'
            newWords = "".join(them)
    return newWords
My text file is:
This is an example.
My name is Kara!
The program returns:
Please enter the file name: piglatin tester.py
hisTay
siay
naay
xample.eay
yMay
amenay
siay
ara!Kay
None
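How can I get the words to print on the lines they came from? And how do I handle the punctuation and the capitalization?
Answer 0 (score: 1)
Here is my modification of your code. You should consider using nltk; it handles word tokenization more robustly.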
def main():
    # input() and print() are used to match the question's Python 3 code.
    fileName = input('Please enter the file name: ')
    validate_file(fileName)
    new_lines = convert_file(fileName)
    for line in new_lines:
        print(line)

def validate_file(fileName):
    try:
        inputFile = open(fileName, 'r')
        inputFile.close()
    except IOError:
        print('File not found.')

def strip_punctuation(line):
    # Split a trailing '.', '!' or '?' off the end of the line.
    punctuation = ''
    line = line.strip()
    if len(line) > 0:
        if line[-1] in ('.', '!', '?'):
            punctuation = line[-1]
            line = line[:-1]
    return line, punctuation

def convert_file(fileName):
    converted_lines = []
    with open(fileName, 'r') as inputFile:
        for line in inputFile:
            line, punctuation = strip_punctuation(line)
            line = line.split()
            new_words = []
            for word in line:
                endString = str(word[1:])
                them = endString, str(word[0:1]), 'ay'
                new_word = "".join(them)
                new_words.append(new_word)
            new_sentence = ' '.join(new_words)
            # Lowercase the whole sentence, then capitalize only its first letter.
            new_sentence = new_sentence.lower()
            if len(new_sentence):
                new_sentence = new_sentence[0].upper() + new_sentence[1:]
            converted_lines.append(new_sentence + punctuation)
    return converted_lines
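Note that the code above lowercases the whole sentence and only keeps punctuation at the very end of a line, so "Kara" would lose its capital. One possible per-word variant is sketched below. The helper names `pig_latin_word` and `pig_latin_line` are made up for this illustration, and the sketch uses the same simplified rule as the question (move the first letter to the end and add "ay"), not full Pig Latin. It also assumes punctuation only appears at the end of a word.

```python
import string

def pig_latin_word(token):
    """Convert one whitespace-separated token, keeping trailing punctuation."""
    # Split off any trailing punctuation such as '.', '!' or ','.
    core = token.rstrip(string.punctuation)
    trailing = token[len(core):]
    if not core:
        return token  # the token was punctuation only
    # Same simplified rule as the question: first letter to the end, then 'ay'.
    converted = (core[1:] + core[0] + 'ay').lower()
    # Capitalize the result only if the original word started with a capital.
    if core[0].isupper():
        converted = converted.capitalize()
    return converted + trailing

def pig_latin_line(line):
    """Convert every token on a line and join them back with single spaces."""
    return ' '.join(pig_latin_word(token) for token in line.split())

print(pig_latin_line('This is an example.'))  # Histay siay naay xampleeay.
print(pig_latin_line('My name is Kara!'))     # Ymay amenay siay Arakay!
```

Calling `pig_latin_line` once per line of the file keeps the output on the same lines as the input.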
Answer 1 (score: 0)
I got everything working except the punctuation; I am still thinking about a solution for that. Here is my code:
def convert_file(fileName):
    inputFile = open(fileName, 'r')
    punctuations = ['.', ',', '!', '?', ':', ';']  # not used yet
    newWords = []
    linenum = 1
    for line in inputFile:
        line_string = line.split()
        for word in line_string:
            # Slices avoid an IndexError on one-letter words such as "a".
            endString = str(word[1:2]).upper() + str(word[2:])
            them = endString, str(word[0:1]).lower(), 'ay'
            word = ''.join(them)
            wordAndline = [word, linenum]
            newWords.append(wordAndline)
        linenum += 1
    return newWords
The difference is that it returns each word together with the number of the line it came from in the file.
['Histay', 1], ['Siay', 1], ['Naay', 1], ['Xample.eay', 1], ['Ymay', 3], ['Amenay', 3], ['Siay', 3], ['Ara!kay', 3]
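To print these words back on their original lines, the pairs can be grouped by line number. The following is only a sketch of one way to do that; the function name `join_by_line` is hypothetical, and the sample list is copied from the output above.

```python
from collections import defaultdict

def join_by_line(word_line_pairs):
    """Group [word, linenum] pairs and return one joined string per line."""
    words_per_line = defaultdict(list)
    for word, linenum in word_line_pairs:
        words_per_line[linenum].append(word)
    # Sort by line number so the result follows the order of the file.
    return [' '.join(words_per_line[n]) for n in sorted(words_per_line)]

pairs = [['Histay', 1], ['Siay', 1], ['Naay', 1], ['Xample.eay', 1],
         ['Ymay', 3], ['Amenay', 3], ['Siay', 3], ['Ara!kay', 3]]
for line in join_by_line(pairs):
    print(line)
```

Lines with no words (such as line 2 in this sample) produce no entry, so blank lines in the input are dropped from the result.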