Words with consecutive repeated letters in a list

Time: 2017-11-16 12:56:40

Tags: python regex

I have a list, for example:
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']  

I want to remove the words that contain repeated letters, that is, these words:

'aa','aac','bbb','bcca','ffffff'

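Maybe using import re?

5 answers:

Answer 0: (score: 1)

The original version of this question asked to remove words that consist entirely of a single repeated character. An efficient way to do that is with sets. We convert each word to a set; if the word contains only one distinct character, the set has length 1. In that case we drop the word, unless the word itself is a single character.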

data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff'] 
newdata = [s for s in data if len(s) == 1 or len(set(s)) != 1]
print(newdata)

Output

['dog', 'cat', 'a', 'aac', 'bcca']

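Here is code for the new version of your question, where you want to remove words that contain any repeated character. This one is simpler, because no special test for single-character words is needed.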

data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff'] 
newdata = [s for s in data if len(set(s)) == len(s)]
print(newdata)

Output

['dog', 'cat', 'a']

If the repeats must be consecutive, we can use groupby to handle that.

from itertools import groupby

data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff', 'abab', 'wow'] 
newdata = [s for s in data if max(len(list(g)) for _, g in groupby(s)) == 1]
print(newdata)

Output

['dog', 'cat', 'a', 'abab', 'wow']
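
To see why the groupby test works, it may help to look at the runs it produces for a single word. The sketch below is only an illustration; the helper name run_lengths is hypothetical and not part of the answer above.

```python
from itertools import groupby

def run_lengths(word):
    # groupby yields one (key, group) pair per run of equal adjacent characters
    return [(char, len(list(group))) for char, group in groupby(word)]

print(run_lengths('bcca'))  # [('b', 1), ('c', 2), ('a', 1)]
print(run_lengths('abab'))  # [('a', 1), ('b', 1), ('a', 1), ('b', 1)]
```

A word passes the filter only when every run has length 1. Note that max() raises ValueError on an empty sequence, so if the list may contain an empty string, a test based on all(...) is probably the safer choice.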

Answer 1: (score: 1)

Thanks to this thread: Regex to determine if string is a single repeating character

Here is the re version, but if the task is this simple I would stick with PM 2Ring's and Tameem's solutions:

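import re
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
# Drop words made up entirely of one character repeated two or more times
print([i for i in data if not re.search(r'^(.)\1+$', i)])

Output

['dog', 'cat', 'a', 'aac', 'bcca']

Another:

import re
data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']
# Drop words containing any run of two or more identical characters
print([i for i in data if not re.search(r'((\w)\2{1,})', i)])

Output

['dog', 'cat', 'a']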
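
The two patterns differ mainly in anchoring, which the following sketch tries to make visible on a few words. The names whole_word and any_run are illustrative only, and (\w)\1 is used here as a shorter pattern that should give the same yes/no answer as ((\w)\2{1,}).

```python
import re

# Anchored: the whole word must be one character repeated two or more times.
whole_word = re.compile(r'^(.)\1+$')
# Unanchored: a word character followed immediately by itself, anywhere in the word.
any_run = re.compile(r'(\w)\1')

for word in ['aa', 'aac', 'bcca', 'abab']:
    print(word, bool(whole_word.search(word)), bool(any_run.search(word)))
# aa True True
# aac False True
# bcca False True
# abab False False
```

The backreference \1 only matches the same character directly after the captured one, so a word such as 'abab' is kept by both filters.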

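Answer 2: (score: 1)

A loop is the way to go. Forget about sets: they discard order, so they cannot tell whether a repeated letter is consecutive.

Here is a function you can call inside your loop over the words to decide whether a word is valid: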

def is_valid(word):
    last_char = None
    for i in word:
        if i == last_char:
            return False

        last_char = i

    return True

Example

In [28]: is_valid('dogo')
Out[28]: True

In [29]: is_valid('doo')
Out[29]: False
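
One way to apply is_valid to the list from the question is sketched below. The function is repeated so that the snippet runs on its own; the names data and newdata are borrowed from the other answers and are not part of this one.

```python
def is_valid(word):
    # A word is valid when no character equals the one right before it
    last_char = None
    for char in word:
        if char == last_char:
            return False
        last_char = char
    return True

data = ['dog', 'cat', 'a', 'aa', 'aac', 'bbb', 'bcca', 'ffffff']
newdata = [word for word in data if is_valid(word)]
print(newdata)  # ['dog', 'cat', 'a']
```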

Answer 3: (score: 1)

Here is a way to check whether a word has consecutive repeated characters:

def has_consecutive_repeated_letters(word):
    return any(c1 == c2 for c1, c2 in zip(word, word[1:]))

Then you can use a list comprehension to filter the list:

words = ['dog','cat','a','aa','aac','bbb','bcca','ffffff', 'abab', 'wow']
[word for word in words if not has_consecutive_repeated_letters(word)]
# ['dog', 'cat', 'a', 'abab', 'wow']
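
For readers unfamiliar with the zip trick, this small sketch shows the pairs that zip(word, word[1:]) generates. The helper name adjacent_pairs is hypothetical and exists only for this illustration.

```python
def adjacent_pairs(word):
    # Pair each character with its right-hand neighbour
    return list(zip(word, word[1:]))

print(adjacent_pairs('bcca'))  # [('b', 'c'), ('c', 'c'), ('c', 'a')]
print(adjacent_pairs('a'))     # []
```

Because a one-letter word yields no pairs, any(...) returns False for it and the word is kept without a special case.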

Answer 4: (score: 0)

Just one line :)

data = ['dog','cat','a','aa','aac','bbb','bcca','ffffff']  
data = [value for value in data if len(set(value)) != 1 or len(value) == 1]
print(data)

Output

['dog', 'cat', 'a', 'aac', 'bcca']