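Find the first non-repeated character in a string

Time: 2010-02-18 00:41:11

Tags: algorithm language-agnostic string

What is the fastest way to find the first character that appears only once in a string?

34 answers: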

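Answer 0 (score: 32)

It has to be at least O(n), because you cannot know whether a character will be repeated until you have read every character.

So you can iterate over the characters, appending each one to a list the first time you see it, and separately keep a count of how many times you have seen it (in fact, the only count values that matter are "0", "1", and "more than 1").

When you reach the end of the string, you just have to find the first character in the list whose count is exactly one.

Example code in Python: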

from collections import defaultdict

def first_non_repeated_character(s):
    counts = defaultdict(int)
    l = []
    for c in s:
        counts[c] += 1
        if counts[c] == 1:
            l.append(c)

    for c in l:
        if counts[c] == 1:
            return c

    return None

This runs in O(n).

Answer 1 (score: 14)

You can't know that a character is non-repeated until you have processed the whole string, so my suggestion is:

def first_non_repeated_character(string):
  chars = []
  repeated = []
  for character in string:
    if character in chars:
      chars.remove(character)
      repeated.append(character)
    else:
      if not character in repeated:
        chars.append(character)
  if len(chars):
    return chars[0]
  else:
    return False

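Edit: the originally posted code was bad, but this latest snippet is Certified To Work On Ryan's Computer™.

Answer 2 (score: 4)

Why not use a heap-based data structure, such as a min-priority queue? As you read each character from the string, add it to the queue with a priority based on its position in the string and the number of occurrences so far. You can modify the queue to add priorities on collision, so that a character's priority is the sum of its number of occurrences. At the end of the loop, the first element in the queue will be the least frequent character in the string, and if several characters have count == 1, the first element is the first unique character that was added to the queue.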
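
A rough Python sketch of the priority-queue idea, under some assumptions: Python's heapq has no "update priority on collision" operation, so this version counts first and only then builds the heap, with the priority being the tuple (occurrences, first position). The function name is made up for illustration.

```python
import heapq

def first_non_repeated_heap(s):
    """Return the first character that occurs exactly once, or None."""
    count = {}
    first_index = {}
    for i, c in enumerate(s):
        count[c] = count.get(c, 0) + 1
        first_index.setdefault(c, i)  # remember only the first position

    # Priority is (occurrences, first position); the smallest tuple is on top.
    heap = [(count[c], first_index[c], c) for c in count]
    heapq.heapify(heap)

    # If the least frequent character occurs more than once, nothing is unique.
    if heap and heap[0][0] == 1:
        return heap[0][2]
    return None
```

Since only the top of the heap is ever read, a plain min() over the same tuples would do the same job; the heap is kept here just to mirror the description.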

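Answer 3 (score: 3)

Many answers are attempting O(n) but forget the actual cost of inserting into and removing from the lists/associative arrays/sets they use for tracking.

If you can assume that a char is a single byte, then you use a simple array indexed by the char and keep the counts in it. This is truly O(n), because array access is guaranteed O(1), and the final pass through the array to find the first element with a count of 1 is constant time (because the array has a small, fixed size).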
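
A minimal Python sketch of the single-byte case, assuming the input arrives as a bytes object (the function name is hypothetical). One deviation from the description: scanning the 256-slot array alone would find the unique byte with the lowest value rather than the first one in the string, so this sketch makes its second pass over the input instead.

```python
def first_non_repeated_byte(data: bytes):
    """Return the first byte that occurs exactly once (as a length-1 bytes), or None."""
    counts = [0] * 256          # one slot per possible byte value
    for b in data:              # iterating over bytes yields ints 0..255
        counts[b] += 1
    for b in data:              # second pass keeps the original order
        if counts[b] == 1:
            return bytes([b])
    return None
```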

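If you cannot assume that a char is a single byte, then I would suggest sorting the string and then making a single pass checking adjacent values. That is O(n log n) for the sort plus O(n) for the final pass, so it is effectively O(n log n), which is better than O(n^2). It also has almost no space overhead, which is another problem with many of the answers attempting O(n).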
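
The sort-then-scan idea could look roughly like this in Python. As an assumption of this sketch, it sorts the positions rather than the characters, so the original order can still be recovered; that costs O(n) extra space for the index list. The function name is made up.

```python
def first_non_repeated_sorted(s):
    """Return the first character that occurs exactly once, or None."""
    # Sort positions by character so that equal characters become neighbours.
    order = sorted(range(len(s)), key=lambda i: s[i])
    best = None
    for k, i in enumerate(order):
        same_prev = k > 0 and s[order[k - 1]] == s[i]
        same_next = k + 1 < len(order) and s[order[k + 1]] == s[i]
        # A character with no equal neighbour occurs exactly once.
        if not same_prev and not same_next:
            if best is None or i < best:
                best = i        # keep the earliest such position
    return None if best is None else s[best]
```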

Answer 4 (score: 3)

Here is another fun way. Counter requires Python 2.7 or Python 3.1.

>>> from collections import Counter
>>> def first_non_repeated_character(s):
...     return min((k for k,v in Counter(s).items() if v<2), key=s.index)
...
>>> first_non_repeated_character("aaabbbcddd")
'c'
>>> first_non_repeated_character("aaaebbbcddd")
'e'

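Answer 5 (score: 2)

A refactoring of the solution proposed earlier (without needing the extra list/memory). This goes over the string twice, so it takes O(n), just like the original solution.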

from collections import defaultdict

def first_non_repeated_character(s):
    counts = defaultdict(int)
    for c in s:
        counts[c] += 1
    for c in s:
        if counts[c] == 1:
            return c
    return None

Answer 6 (score: 2)

Here is a Ruby implementation that finds the first non-repeated character of a string:

def first_non_repeated_character(string)
  string1 = string.split('')
  string2 = string.split('')

  string1.each do |let1|
    counter = 0
    string2.each do |let2|
      if let1 == let2
        counter+=1
      end
    end
    # a count of exactly one means let1 occurs nowhere else
    return let1 if counter == 1
  end
  nil # every character repeats
end

p first_non_repeated_character('dont doddle in the forest')

Here is a JavaScript implementation of the same style of function:

var first_non_repeated_character = function (string) {
  var string1 = string.split('');
  var string2 = string.split('');

  var single_letters = [];

  for (var i = 0; i < string1.length; i++) {
    var count = 0;
    for (var x = 0; x < string2.length; x++) {
      if (string1[i] == string2[x]) {
        count++
      }
    }
    if (count == 1) {
      return string1[i];
    }
  }
}

console.log(first_non_repeated_character('dont doddle in the forest'));
console.log(first_non_repeated_character('how are you today really?'));

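In both cases I use a counter, knowing that if the letter is not matched anywhere else in the string, it occurs only once, so I just count its occurrences.

Answer 7 (score: 2)

I think this should be done in C. This runs in O(n) time, with no ambiguity about the order of the insert and delete operations. It is a counting sort (the simplest form of bucket sort, which is itself a simple form of radix sort).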

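#include <string.h>

unsigned char find_first_unique(unsigned char *string)
{
    int chars[256];
    int i;
    memset(chars, 0, sizeof(chars));

    /* first pass: count every character, starting with string[0] */
    for (i = 0; string[i]; i++)
    {
        chars[string[i]]++;
    }

    /* second pass: return the first character whose count is exactly 1 */
    for (i = 0; string[i]; i++)
    {
        if (chars[string[i]] == 1) return string[i];
    }
    return 0;
}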

Answer 8 (score: 2)

Counter requires Python 2.7 or Python 3.1.

>>> from collections import Counter
>>> def first_non_repeated_character(s):
...     counts = Counter(s)
...     for c in s:
...         if counts[c]==1:
...             return c
...     return None
... 
>>> first_non_repeated_character("aaabbbcddd")
'c'
>>> first_non_repeated_character("aaaebbbcddd")
'e'

Answer 9 (score: 1)

def first_non_repeated_character(string):
  chars = []
  repeated = []
  for character in string:
    if character in repeated:
      pass  # already known to repeat; discard it
    elif character in chars:
      chars.remove(character)
      repeated.append(character)
    else:
      chars.append(character)
  if len(chars):
    return chars[0]
  else:
    return False

Answer 10 (score: 1)

In Ruby:

(Original credit: Andrew A. Smith)

x = "a huge string in which some characters repeat"

def first_unique_character(s)
 s.each_char.detect { |c| s.count(c) == 1 }
end

first_unique_character(x)
=> "u"

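Answer 11 (score: 1)

The other JavaScript solutions are very C-style; here is a more JavaScript-style solution.

// function name assumed; the snippet's header line was missing
function first_non_repeated_character(string) {
var arr = string.split("");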
var occurences = {};
var tmp;
var lowestindex = string.length+1;

arr.forEach( function(c){ 
  tmp = c;
  if( typeof occurences[tmp] == "undefined")
    occurences[tmp] = tmp;
  else 
    occurences[tmp] += tmp;
});


for(var p in occurences) {
  if(occurences[p].length == 1)
    lowestindex = Math.min(lowestindex, string.indexOf(p));
}

if(lowestindex > string.length)
  return null;

return string[lowestindex];

}

Answer 12 (score: 0)

Question: first unique character of a string. Here is the simplest solution.

public class Test4 {
    public static void main(String[] args) {
        String a = "GiniGinaProtijayi";

        firstUniqCharindex(a);
    }

    public static void firstUniqCharindex(String a) {
        int[] count = new int[256];
        for (int i = 0; i < a.length(); i++) {
            count[a.charAt(i)]++;
        }
        int index = -1;
        for (int i = 0; i < a.length(); i++) {
            if (count[a.charAt(i)] == 1) {
                index = i;
                break;
            } // if
        }
        System.out.println(index);// output => 8
        System.out.println(a.charAt(index)); //output => P

    }// end1
}

In Python:

def firstUniqChar(a):
  count = [0] * 256
  for i in a:
      count[ord(i)] += 1
  element = ""
  for items in a:
      if count[ord(items)] == 1:
          element = items
          break
  return element


a = "GiniGinaProtijayi"
print(firstUniqChar(a)) # output is P

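Using Java 8:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Test2 {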
    public static void main(String[] args) {
        String a = "GiniGinaProtijayi";

        Map<Character, Long> map = a.chars()
                .mapToObj(
                        ch -> Character.valueOf((char) ch)

        ).collect(
                Collectors.groupingBy(
                        Function.identity(), 
                        LinkedHashMap::new,
                        Collectors.counting()));

        System.out.println("MAP => " + map);
        // {G=2, i=5, n=2, a=2, P=1, r=1, o=1, t=1, j=1, y=1}

        Character chh = map
                .entrySet()
                .stream()
                .filter(entry -> entry.getValue() == 1L)
                .map(entry -> entry.getKey())
                .findFirst()
                .get();
        System.out.println("First Non Repeating Character => " + chh);// P
    }// main

}

Answer 13 (score: 0)

I read through the answers and did not see anything like mine. I think this answer is very simple and fast; am I wrong?

def first_unique(s):
    repeated = []

    while s:
        if s[0] not in s[1:] and s[0] not in repeated:
            return s[0]
        else:
            repeated.append(s[0])
            s = s[1:]
    return None

Tests

(first_unique('abdcab') == 'd', first_unique('aabbccdad') == None, first_unique('') == None, first_unique('a') == 'a')

Answer 14 (score: 0)

Here is another solution with O(n) time complexity.

public void findUnique(String string) {
    ArrayList<Character> uniqueList = new ArrayList<>();
    int[] chatArr = new int[128];
    for (int i = 0; i < string.length(); i++) {
        Character ch = string.charAt(i);
        if (chatArr[ch] != -1) {
            chatArr[ch] = -1;
            uniqueList.add(ch);
        } else {
            uniqueList.remove(ch);
        }
    }
    if (uniqueList.size() == 0) {
        System.out.println("No unique character found!");
    } else {
        System.out.println("First unique character is :" + uniqueList.get(0));
    }
}

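Answer 15 (score: 0)

The following solution is an elegant way to find the first unique character in a string using the new features introduced as part of Java 8. It first creates a map that counts the number of occurrences of each character, then uses this map to find the first character that occurs only once. This runs in O(N) time.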

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Runs in O(N) time and uses lambdas and the stream API from Java 8
//   Also, it is only three lines of code!
private static String findFirstUniqueCharacterPerformantWithLambda(String inputString) {
  // convert the input string into a list of characters
  final List<String> inputCharacters = Arrays.asList(inputString.split(""));

  // first, construct a map to count the number of occurrences of each character
  final Map<Object, Long> characterCounts = inputCharacters
    .stream()
    .collect(groupingBy(s -> s, counting()));

  // then, find the first unique character by consulting the count map
  return inputCharacters
    .stream()
    .filter(s -> characterCounts.get(s) == 1)
    .findFirst()
    .orElse(null);
}

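Answer 16 (score: 0)

I have two strings, 'unique' and 'repeated'. Every character appearing for the first time is added to 'unique'. If it is repeated a second time, it is removed from 'unique' and added to 'repeated'. This way, 'unique' always holds a string of unique characters. Complexity: O(n).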

public void firstUniqueChar(String str){
    String unique= "";
    String repeated = "";
    str = str.toLowerCase();
    for(int i=0; i<str.length();i++){
        char ch = str.charAt(i);
        if(!(repeated.contains(str.subSequence(i, i+1))))
            if(unique.contains(str.subSequence(i, i+1))){
                // replace() treats its argument literally; replaceAll() would read it as a regex
                unique = unique.replace(Character.toString(ch), "");
                repeated = repeated+ch;
            }
            else
                unique = unique+ch;
    }
    System.out.println(unique.charAt(0));
}

Answer 17 (score: 0)

Function:

This C# function uses a hash table (Dictionary) and has O(2n) worst-case performance.

private static string FirstNoRepeatingCharacter(string aword)
    {
        Dictionary<string, int> dic = new Dictionary<string, int>();            

        for (int i = 0; i < aword.Length; i++)
        {
            if (!dic.ContainsKey(aword.Substring(i, 1)))
                dic.Add(aword.Substring(i, 1), 1);
            else
                dic[aword.Substring(i, 1)]++;
        }

        foreach (var item in dic)
        {
            if (item.Value == 1) return item.Key;
        }
        return string.Empty;
    }

Example:

string aword = "TEETER";

Console.WriteLine(FirstNoRepeatingCharacter(aword)); // prints: R

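Answer 18 (score: 0)

Here is another approach... We can have an array that stores the count and the index of the first occurrence of each character. After filling the array, we can just traverse it, find the MINIMUM index whose count is 1, and then return str[index].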

#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <climits>
using namespace std;

#define No_of_chars 256

//store the count and the index where the char first appear
typedef struct countarray
{
    int count;
    int index;
}countarray;

//returns the count array
    countarray *getcountarray(char *str)
    {
        countarray *count;
        count=new countarray[No_of_chars];
        for(int i=0;i<No_of_chars;i++)
        {
            count[i].count=0;
            count[i].index=-1;
        }
        for(int i=0;*(str+i);i++)
        {
            // index through unsigned char so bytes above 127 cannot go negative
            unsigned char c = *(str+i);
            (count[c].count)++;
            if(count[c].count==1) //if count==1 then update the index
                count[c].index=i;
        }
        return count;
    }

    char firstnonrepeatingchar(char *str)
    {
        countarray *array;
        array = getcountarray(str);
        int result = INT_MAX;
        for(int i=0;i<No_of_chars;i++)
        {
            if(array[i].count==1 && result > array[i].index)
                result = array[i].index;
        }
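        delete[] (array);
        // no character occurs exactly once: return '\0' instead of reading str[INT_MAX]
        return (result == INT_MAX) ? '\0' : str[result];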
    }

    int main()
    {
        char str[] = "geeksforgeeks";
        cout<<"First non repeating character is "<<firstnonrepeatingchar(str)<<endl;        
        return 0;
    }

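Answer 19 (score: 0)

In C, this is almost Shlemiel the Painter's Algorithm (nowhere near O(n!), but O(n^2) in the worst case).

But for reasonably sized strings it will outperform the "better" algorithms, because the constant factor is very small. It can also easily tell you the position of the first non-repeating character.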

char FirstNonRepeatedChar(char * psz)
{
   for (int ii = 0; psz[ii] != 0; ++ii)
   {
      // compare against every other position, earlier ones included, so that
      // the last copy of a repeated character is not mistaken for a unique one.
      for (int jj = 0; ; ++jj)
      {
         // if we hit the end of string, then we found a non-repeat character.
         //
         if (psz[jj] == 0)
            return psz[ii]; // this character doesn't repeat

         // if we found a repeat character, we can stop looking.
         //
         if (jj != ii && psz[ii] == psz[jj])
            break; 
      }
   }

   return 0; // there were no non-repeating characters.
}

Edit: this code assumes you don't mean consecutive repeating characters.

Answer 20 (score: 0)

The following code is in C# with complexity n.

using System;
using System.Linq;
using System.Text;

namespace SomethingDigital
{
    class FirstNonRepeatingChar
    {
        public static void Main()
        {
            String input = "geeksforgeeksandgeeksquizfor";
            char[] str = input.ToCharArray();

            bool[] b = new bool[256];
            String unique1 = "";
            String unique2 = "";

            foreach (char ch in str)
            {
                if (!unique1.Contains(ch))
                {
                    unique1 = unique1 + ch;
                    unique2 = unique2 + ch;
                }
                else
                {
                    unique2 = unique2.Replace(ch.ToString(), "");
                }
            }
            if (unique2 != "")
            {
                Console.WriteLine(unique2[0].ToString());
                Console.ReadLine();
            }
            else
            {
                Console.WriteLine("No non repeated string");
                Console.ReadLine();
            }
        }
    }
}

Answer 21 (score: 0)

Input = aabbcddeef, output = c

char FindUniqueChar(char *a)
{
    int i=0;
    bool repeat=false;
    while(a[i] != '\0')
    {
      if (a[i] == a[i+1])
      {
        repeat = true;
      }
      else
      {
            if(!repeat)
            {
            cout<<a[i];
            return a[i];
            }
        repeat=false;
      }
      i++;
    }
    return a[i];
}

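Answer 22 (score: 0)

Here is a different approach. Scan each element of the string and build a count array that stores how many times each element occurs. Then start again from the first element of the string and print the first element whose count = 1.

C code
-----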
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
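    char t_c;
    char *t_p;
    int count[256] = {0};   /* int counters: a char counter would overflow on long inputs */

    if (argc < 2)
        return 1;           /* no input string given */

    t_p = argv[1];
    for(t_c = *t_p; t_c != '\0'; t_c = *(++t_p))
        count[(unsigned char)t_c]++;
    t_p = argv[1];
    for(t_c = *t_p; t_c != '\0'; t_c = *(++t_p))
    {
        if(count[(unsigned char)t_c] == 1)
        {
            printf("Element is %c\n",t_c);
            break;
        }
    }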

return 0;    
} 

Answer 23 (score: 0)

A snippet in JavaScript

var string = "tooth";
var hash = {};
for (var i = 0, j = string.length; i < j; i++) {
    if (hash[string[i]] !== undefined) {
        hash[string[i]] = hash[string[i]] + 1;
    } else {
        hash[string[i]] = 1;
    }
}

for (i = 0, j = string.length; i < j; i++) {
    if (hash[string[i]] === 1) {
        console.info( string[i] );
        break; // stop at the first character seen exactly once
    }
}
// prints "h"

Answer 24 (score: 0)

In Mathematica, one can write this:

string = "conservationist deliberately treasures analytical";

Cases[Gather @ Characters @ string, {_}, 1, 1][[1]]
{"v"}

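Answer 25 (score: 0)

Try this code:

    public static String findFirstUnique(String str)
    {
        String unique = "";
        String repeated = "";   // characters already seen more than once

        foreach (char ch in str)
        {
            String s = ch.ToString();
            if (repeated.Contains(s)) continue;   // a third occurrence must not be re-added
            if (unique.Contains(s))
            {
                unique = unique.Replace(s, "");
                repeated += s;
            }
            else unique += s;
        }
        // empty string when every character repeats
        return unique.Length > 0 ? unique[0].ToString() : "";
    }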

Answer 26 (score: 0)

Here is a possible solution in Ruby that does not use Array#detect (as in this answer). I think using Array#detect makes it too easy.

ALPHABET = %w(a b c d e f g h i j k l m n o p q r s t u v w x y z)

def fnr(s)
  unseen_chars    = ALPHABET.dup
  seen_once_chars = []
  s.each_char do |c|
    if unseen_chars.include?(c)
      unseen_chars.delete(c)
      seen_once_chars << c
    elsif seen_once_chars.include?(c)
      seen_once_chars.delete(c)
    end
  end

  seen_once_chars.first
end

It seems to work on a few simple examples:

fnr "abcdabcegghh"
# => "d"

fnr "abababababababaqababa"
# => "q"

Suggestions and corrections are very welcome!

Answer 27 (score: 0)

Here is an implementation in Perl (version >= 5.10) that doesn't care whether the repeated characters are consecutive:

use strict;
use warnings;

foreach my $word(@ARGV)
{
  my @distinct_chars;
  my %char_counts;

  my @chars=split(//,$word);

  foreach (@chars)
  {
    push @distinct_chars,$_ unless $_~~@distinct_chars;
    $char_counts{$_}++;
  }

  my $first_non_repeated="";

  foreach(@distinct_chars)
  {
    if($char_counts{$_}==1)
    {
      $first_non_repeated=$_;
      last;
    }
  }

  if(length($first_non_repeated))
  {
    print "For \"$word\", the first non-repeated character is '$first_non_repeated'.\n";
  }
  else
  {
    print "All characters in \"$word\" are repeated.\n";
  }
}

Storing this code in a script (which I named non_repeated.pl) and running it on a few inputs produces:

jmaney> perl non_repeated.pl aabccd "a huge string in which some characters repeat" abcabc
For "aabccd", the first non-repeated character is 'b'.
For "a huge string in which some characters repeat", the first non-repeated character is 'u'.
All characters in "abcabc" are repeated.

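Answer 28 (score: -1)

How about using a suffix tree for this case... The first non-repeated character would be the first character of the longest suffix string with the least depth in the tree.

Answer 29 (score: -1)

If the char array contains repeated characters consecutively (e.g. "ggddaaccceefgg"), then the code below will work: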

char FirstNonRepeatingChar(char* str)
{
     int i=0;
     bool repeat = false;
     while(str[i]!='\0')
     {
       if(str[i] != str[i+1])
       {
         if(!repeat)
           return(str[i]);
         repeat = false;
       }
       else
        repeat = true;
      i++;
    }
return ' ';
}
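
For comparison, a short Python sketch of the same consecutive-runs idea using itertools.groupby (the function name is made up). Like the C version, it only looks at runs of adjacent equal characters, so it is not a general solution: for "aba" it returns "a".

```python
from itertools import groupby

def first_non_repeated_in_runs(s):
    """Return the first character whose run of adjacent copies has length 1, or None."""
    for char, run in groupby(s):          # groups consecutive equal characters
        if sum(1 for _ in run) == 1:      # run length of exactly one
            return char
    return None
```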

Answer 30 (score: -1)

Another answer (it might not be so efficient, but it is an approach we can try in C++; time complexity: O(n log n) for the inserts, since std::multiset is a balanced tree; space complexity: O(n)).

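#include <cstring>
#include <set>
using namespace std;

char FirstNotRepeatingCharInString(char *str)
{
    //Lets assume a set  to insert chars of string
    char result = '\0';   // returned when every character repeats
    multiset<char> st;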
    for(int i=0;i<strlen(str);i++)
    {
        st.insert(str[i]);  
    }
    for(int i=0;i<strlen(str);i++)
    {
        if(st.count(str[i]) <=1){
            result = str[i];
            break;
        }
    }
    return result;
}

Answer 31 (score: -1)

  /**
   ** Implemented using linkedHashMap with key as character and value as Integer.
   *
   *  Used linkedHashMap to get the wordcount. 
   *  This will return a map with wordcount in the same order as in the string.
   *  
   *  Using the iterator get the first key which has value as 1.
   *  
   */

package com.simple.quesions;

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class FirstNonRepeatingChar 
{
    public static void main(String args[])
    {
        String a = "theskyisblue";
        System.out.println(findNonRepeating(a));
    }

    // Function which returns the first non repeating character.

    public static char findNonRepeating(String str)
    {
       // Create a linked hash map 

        LinkedHashMap<Character,Integer> map = new LinkedHashMap<>();
        if(str == null)
            return ' ';

        // count the occurrences of each non-space character

        for(int i=0;i<str.length();i++)
        {
            char val = str.charAt(i);
            if(val != ' ')
            {
                if(map.containsKey(val))
                {
                    map.put(val,map.get(val)+1);
                }
                else
                {
                    map.put(val, 1);
                }
            }
        }

        System.out.println(map);

        // get the first key which has value as " 1 " .
        // LinkedHashMap iterates in insertion order, i.e. order of first appearance.

        for(Map.Entry<Character,Integer> entry : map.entrySet())
        {
            if(entry.getValue() == 1)
                return entry.getKey();
        }

        return ' ';
    }

}
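
The same insertion-ordered-map idea fits in a few lines of Python, as a sketch: plain dicts keep insertion order from Python 3.7 on, so no special map type is needed. The function name is hypothetical, and spaces are skipped only because the Java code above skips them.

```python
def first_non_repeated_ordered(s):
    """Return the first non-space character that occurs exactly once, or None."""
    counts = {}                      # dict preserves order of first insertion (3.7+)
    for c in s:
        if c != " ":                 # mirror the Java version, which ignores spaces
            counts[c] = counts.get(c, 0) + 1
    for c, n in counts.items():      # iterates in order of first appearance
        if n == 1:
            return c
    return None
```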

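Answer 32 (score: -1)

Create two lists -

  1. Unique list - holds only the characters seen exactly once so far .. UL
  2. Non-unique list - holds only repeated characters - NUL
  3.   for(char c in str) {
        if(nul.contains(c)){
         //do nothing
        }else if(ul.contains(c)){
          ul.remove(c);
          nul.add(c);
        }else{
           ul.add(c);   // first sighting goes into the unique list
        }
      }
  4. The first element of UL, if any, is the answer.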
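
A runnable Python sketch of the two-list idea, assuming that a character seen for the first time belongs in the unique list and that the answer is whatever sits at the front of that list at the end. The names are made up; the non-unique list is a set here, since only membership matters for it.

```python
def first_unique_two_lists(s):
    """Return the first character that occurs exactly once, or None."""
    ul = []        # characters seen exactly once so far, in order of appearance
    nul = set()    # characters seen more than once
    for c in s:
        if c in nul:
            continue           # already known to repeat
        if c in ul:
            ul.remove(c)       # second sighting: move it to the repeated set
            nul.add(c)
        else:
            ul.append(c)       # first sighting
    return ul[0] if ul else None
```

Note that the `c in ul` test and `ul.remove(c)` are linear in the size of the list, so this is only O(n) overall when the alphabet is small and fixed.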
    

Answer 33 (score: -2)

Here is a solution in C++:

 char FirstNotRepeatingChar(char* pString)
 {
     if(pString == NULL)
         return '\0';

     const int tableSize = 256;
     unsigned int hashTable[tableSize];
     for(unsigned int i = 0; i<tableSize; ++ i)
         hashTable[i] = 0;

     char* pHashKey = pString;
     while(*(pHashKey) != '\0')
         hashTable[(unsigned char)*(pHashKey++)] ++;   // cast keeps the index non-negative

     pHashKey = pString;
     while(*pHashKey != '\0')
     {
         if(hashTable[(unsigned char)*pHashKey] == 1)
             return *pHashKey;

         pHashKey++;
     }

     return '\0';
 }

I have a blog post discussing this problem at http://codercareer.blogspot.com/2011/10/no-13-first-character-appearing-only.html. Any comments would be greatly appreciated.