Shuffling the characters of words while keeping sentence structure and punctuation unchanged

Time: 2018-12-26 11:47:33

Tags: python python-3.x string algorithm shuffle

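I want to be able to scramble the words in a sentence, but:

  • The order of the words in the sentence stays the same.
  • If a word starts with a capital letter, the jumbled word must also start with a capital letter (i.e. the first letter is capitalized).

  • The punctuation marks . , ; ! and ? need to be preserved.

For example, for the sentence "Tom and I watched Star Wars in the cinema, it was fun!", a jumbled version would be "Mto nad I wachtde Tars Rswa ni het amecin, ti wsa fnu!".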

from random import shuffle

def shuffle_word(word):
    word = list(word)
    if word.title():
        ????   #then keep first capital letter in same position in word?
    elif char == '!' or '.' or ',' or '?':
        ????  #then keep their position?
    else:
        shuffle(word)
    return''.join(word)
L = input('try enter a sentence:').split()
print([shuffle_word(word) for word in L])

I can see how to jumble each word in the sentence, but I'm struggling to apply these special cases with if statements. Please help!

4 Answers:

Answer 0 (score: 1)

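Glad to see you've figured out most of the logic.

To keep the first letter capitalized, you can check for it beforehand and then capitalize the "new" first letter afterwards.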

first_letter_is_cap = word[0].isupper()

shuffle(word)

if first_letter_is_cap:
    # Re-capitalize first letter
    word[0] = word[0].upper()

To keep trailing punctuation in place, strip it first and add it back afterwards:

last_char = word[-1]
if last_char in ".,;!?":
    # Strip the punctuation
    word = word[:-1]

shuffle(word)

if last_char in ".,;!?":
    # Add it back
    word.append(last_char)
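
Putting the two pieces together, a minimal sketch of a complete `shuffle_word` might look like the following. The `PUNCTUATION` constant and the optional `rng` parameter are my own additions for illustration, and the sketch assumes at most one trailing punctuation mark and that only the first letter of a word is capitalized. Note that the membership test `in PUNCTUATION` replaces the `char == '!' or '.' or ...` chain from the question, which is always truthy because a non-empty string such as `'.'` counts as true.

```python
import random

PUNCTUATION = ".,;!?"  # assumed set of trailing marks to keep in place

def shuffle_word(word, rng=random):
    """Shuffle one word, keeping a leading capital and one trailing punctuation mark."""
    if not word:
        return word
    chars = list(word)
    tail = ""
    if chars[-1] in PUNCTUATION:
        # Strip the punctuation so it is not shuffled into the word
        tail = chars.pop()
    first_letter_is_cap = bool(chars) and chars[0].isupper()
    rng.shuffle(chars)
    if first_letter_is_cap:
        # Lowercase the old capital, then re-capitalize the new first letter
        chars = [c.lower() for c in chars]
        chars[0] = chars[0].upper()
    return "".join(chars) + tail

sentence = "Tom and I watched Star Wars in the cinema, it was fun!"
print(" ".join(shuffle_word(w) for w in sentence.split()))
```

Joining with a single space means runs of whitespace in the input are collapsed, which is fine for the example sentence but worth keeping in mind.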

Answer 1 (score: 1)

Here is my code. It is not much different from your logic. Feel free to optimize it.


import random

PUNCTUATION = ".,;!?"

def shuffle_word(words):
    shuffled = []
    for word in words.split(" "):
        # Split off trailing punctuation so it stays at the end of the word
        core = word.rstrip(PUNCTUATION)
        tail = word[len(core):]
        result = ''.join(random.sample(core, len(core)))
        if core.istitle():
            # Restore the leading capital (e.g. "Tom" -> "Mto")
            result = result.title()
        shuffled.append(result + tail)
    return ' '.join(shuffled)

L = "Tom and I watched Star Wars in the cinema, it was fun!"
print(shuffle_word(L))

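Hope this helps. Cheers!

Answer 2 (score: 1)

Since this is a string-processing algorithm, I would consider using regular expressions. Regex gives you more flexibility and cleaner code, and lets you drop the conditionals for edge cases. For example, this code handles apostrophes, digits, quotation marks, and special phrases such as dates and times without any extra code, and you can control that behavior just by changing the regex pattern.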

from random import shuffle
import re

# Characters considered part of words
pattern = r"[A-Za-z']+"

# shuffle and lowercase word characters
def shuffle_word(word):
    w = list(word)
    shuffle(w)
    return ''.join(w).lower()

# function to shuffle word used in replace
def replace_func(match):
    return shuffle_word(match.group())

def shuffle_str(text):
    # replace words with their shuffled version
    shuffled_str = re.sub(pattern, replace_func, text)

    # find original uppercase letters
    uppercase_letters = re.finditer(r"[A-Z]", text)

    # make new characters in uppercase positions uppercase
    char_list = list(shuffled_str)
    for match in uppercase_letters:
        uppercase_index = match.start()
        char_list[uppercase_index] = char_list[uppercase_index].upper()

    return ''.join(char_list)

print(shuffle_str('''Tom and I watched "Star Wars" in the cinema's new 3D theater yesterday at 8:00pm, it was fun!'''))
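
To illustrate the point about controlling the behavior through the pattern alone, here is a small sketch that takes the pattern as a parameter. The function name `shuffle_matches` and both example patterns are assumptions for illustration; unlike the code above, it does not restore capitalization.

```python
import random
import re

def shuffle_matches(text, pattern):
    """Shuffle the characters of every substring of `text` that matches `pattern`."""
    def replace_func(match):
        chars = list(match.group())
        random.shuffle(chars)
        return "".join(chars)
    return re.sub(pattern, replace_func, text)

text = "it's 8:00pm"
# Letters and apostrophes count as word characters: digits and ':' stay put
print(shuffle_matches(text, r"[a-z']+"))
# Letters only: the apostrophe also stays in place
print(shuffle_matches(text, r"[a-z]+"))
```

With the first pattern the apostrophe can move around inside "it's"; with the second it acts as a separator and keeps its position.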

Answer 3 (score: 0)

This works for any sentence, even with consecutive "special" characters, and preserves all punctuation:

from random import sample

def shuffle_word(sentence):
  new_sentence=""
  word=""
  for char in sentence+' ':  # trailing space acts as a sentinel to flush the last word
    if char.isalpha():
      word+=char
    else:
      if word:
        if len(word)==1:
          new_sentence+=word
        else:
          new_word=''.join(sample(word,len(word)))
          if word==word.title():
            new_sentence+=new_word.title()
          else:
            new_sentence+=new_word
        word=""
      new_sentence+=char
  return new_sentence[:-1]  # drop the extra space appended to the input in the loop


text="Tom and I watched Star Wars in the cinema, it was... fun!"

print(shuffle_word(text))

Output:

Mto nda I hctawed Rast Aswr in the animec, ti asw... fnu!
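
Since the output is random, it can help to check the constraints from the question mechanically rather than by eye. Below is a rough sketch of such a check. `split_runs` and `is_valid_shuffle` are hypothetical helpers, not part of the answer, and they assume that a word is a maximal run of letters, as in the code above.

```python
from itertools import groupby

def split_runs(text):
    """Split text into maximal runs of letters and runs of non-letters."""
    return ["".join(group) for _, group in groupby(text, key=str.isalpha)]

def is_valid_shuffle(original, shuffled):
    """Return True if `shuffled` only permutes letters within each word of `original`."""
    old_runs, new_runs = split_runs(original), split_runs(shuffled)
    if len(old_runs) != len(new_runs):
        return False
    for old, new in zip(old_runs, new_runs):
        if old[0].isalpha():
            # A word: same letters (ignoring case) and same leading-capital status
            if sorted(old.lower()) != sorted(new.lower()):
                return False
            if old[0].isupper() != new[0].isupper():
                return False
        elif old != new:
            # Punctuation and spaces must be untouched
            return False
    return True

text = "Tom and I watched Star Wars in the cinema, it was... fun!"
print(is_valid_shuffle(text, "Mto nda I hctawed Rast Aswr in the animec, ti asw... fnu!"))
```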