Finding whether two words are anagrams of each other

Date: 2010-11-21 07:43:07

Tags: algorithm

I am looking for a way to determine whether two strings are anagrams of one another.

Ex: string1 - abcde
string2 - abced
Ans = true
Ex: string1 - abcde
string2 - abcfed
Ans = false

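The solution I came up with is to sort both strings and then compare them character by character until the end of either string. That would be O(logn). I am looking for some other efficient method that doesn't change the two strings being compared.

22 answers: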

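Answer 0 (score: 52)

Count the frequency of each character in the two strings and check whether the two histograms match. This takes O(n) time and O(1) space (assuming ASCII). (Of course it is still O(1) space for Unicode, but the table becomes very large.)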
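
As a rough illustration of the histogram idea, here is a minimal Python sketch. The function name `are_anagrams` is just a placeholder, and `collections.Counter` stands in for the frequency table; the length check is an extra shortcut not mentioned in the answer.

```python
from collections import Counter


def are_anagrams(s1: str, s2: str) -> bool:
    """Return True when both strings contain the same characters with the same counts."""
    # Strings of different lengths can never be anagrams, so skip the counting.
    if len(s1) != len(s2):
        return False
    # Counter builds a character -> frequency histogram in one pass per string.
    return Counter(s1) == Counter(s2)


print(are_anagrams("abcde", "abced"))   # first example from the question
print(are_anagrams("abcde", "abcfed"))  # second example from the question
```

Neither input string is modified, which is what the question asked for.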

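Answer 1 (score: 32)

Take a table of prime numbers large enough to map each character to its own prime. Start from 1 and walk through the string, multiplying the number by the prime that represents the current character. The number you get depends only on the characters in the string, not on their order, and every distinct multiset of characters corresponds to a unique number, because any number can be factored into primes in only one way. So you can compare the two numbers to tell whether the strings are anagrams of each other.

Unfortunately, you have to use multiple-precision (arbitrary-precision) integer arithmetic to do this, or you will get overflow or rounding problems with this method.
For this you can use libraries such as BigInteger, GMP, MPIR or IntX.

Pseudocode: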

prime[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101}

primehash(string)
    Y = 1;
    foreach character in string
        Y = Y * prime[character-'a']

    return Y

isanagram(str1, str2)
    return primehash(str1)==primehash(str2)
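
The pseudocode above could be turned into something runnable along these lines. This is only a sketch in Python, whose built-in integers are already arbitrary precision, so no extra library is needed. The names `prime_hash` and `is_anagram_by_primes` are made up for the example, and it assumes the input contains only the lowercase letters 'a' to 'z'.

```python
# The same 26 primes as in the pseudocode, one per lowercase letter 'a'..'z'.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]


def prime_hash(word: str) -> int:
    """Multiply together the primes assigned to each character of word."""
    product = 1
    for ch in word:
        index = ord(ch) - ord("a")
        # Assumption of this sketch: only 'a'..'z' are supported.
        if not 0 <= index < 26:
            raise ValueError(f"unsupported character: {ch!r}")
        product *= PRIMES[index]  # Python ints grow as needed, so no overflow
    return product


def is_anagram_by_primes(s1: str, s2: str) -> bool:
    # Equal products mean equal prime factorizations, i.e. equal letter counts.
    return prime_hash(s1) == prime_hash(s2)


print(is_anagram_by_primes("abcde", "abced"))
print(is_anagram_by_primes("abcde", "abcfed"))
```

Keep in mind that the product grows with every character, so for long strings the multiplications themselves stop being constant-time operations.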

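Answer 2 (score: 22)

  1. Create a hash map whose keys are letters and whose values are the letter frequencies.
  2. For the first string, populate the hash map (O(n)).
  3. For the second string, decrement the counts and remove elements from the hash map (O(n)).
  4. If the hash map is empty, the strings are anagrams; otherwise they are not.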
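
The four steps might look roughly like this in Python, with a plain `dict` playing the role of the hash map. The function name `is_anagram_hashmap` is hypothetical, and returning early when a letter is missing is my reading of step 3 rather than something the answer spells out.

```python
def is_anagram_hashmap(first: str, second: str) -> bool:
    counts = {}
    # Steps 1 and 2: letter -> frequency for the first string.
    for ch in first:
        counts[ch] = counts.get(ch, 0) + 1
    # Step 3: decrement for the second string, removing entries that hit zero.
    for ch in second:
        if ch not in counts:
            # Letter absent from first, or already used up: not an anagram.
            return False
        counts[ch] -= 1
        if counts[ch] == 0:
            del counts[ch]
    # Step 4: an empty map means every letter was matched exactly.
    return not counts


print(is_anagram_hashmap("abcde", "abced"))
print(is_anagram_hashmap("abcde", "abcfed"))
```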

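Answer 3 (score: 8)

The steps are as follows:

  1. Check whether the two words/strings have equal length; only then test for an anagram, otherwise do nothing further.
  2. Sort both words/strings and then compare them.
  3. Java code for the same: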

    package anagram;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.Arrays;

    /**
     * Reads two strings and reports whether they are anagrams of each other.
     *
     * @author Sunshine
     */
    public class Anagram {

        /**
         * @param args the command line arguments (unused)
         */
        public static void main(String[] args) throws IOException {
            // Use a single reader for both lines: a second BufferedReader on
            // System.in could miss input already buffered by the first one.
            BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
            System.out.println("Enter the first string");
            String s1 = br.readLine().toLowerCase();
            System.out.println("Enter the second string");
            String s2 = br.readLine().toLowerCase();

            // Strings of different lengths cannot be anagrams.
            if (s1.length() == s2.length()) {
                char[] c1 = s1.toCharArray();
                char[] c2 = s2.toCharArray();

                // After sorting, anagrams have identical character arrays.
                Arrays.sort(c1);
                Arrays.sort(c2);

                if (Arrays.equals(c1, c2)) {
                    System.out.println("The strings are anagrams of each other");
                } else {
                    System.out.println("Sorry, the strings are not anagrams");
                }
            } else {
                System.out.println("Sorry, the strings are not anagrams");
            }
        }
    }
    

Answer 4 (score: 3)

C#

public static bool AreAnagrams(string s1, string s2)
{
  if (s1 == null) throw new ArgumentNullException("s1");
  if (s2 == null) throw new ArgumentNullException("s2");

  var chars = new Dictionary<char, int>();
  foreach (char c in s1)
  {
      if (!chars.ContainsKey(c))
          chars[c] = 0;
      chars[c]++;
  }
  foreach (char c in s2)
  {
      if (!chars.ContainsKey(c))
          return false;
      chars[c]--;
  }

  return chars.Values.All(i => i == 0);
}

Some tests:

[TestMethod]
public void TestAnagrams()
{
  Assert.IsTrue(StringUtil.AreAnagrams("anagramm", "nagaramm"));
  Assert.IsTrue(StringUtil.AreAnagrams("anzagramm", "nagarzamm"));
  Assert.IsTrue(StringUtil.AreAnagrams("anz121agramm", "nag12arz1amm"));
  Assert.IsFalse(StringUtil.AreAnagrams("anagram", "nagaramm"));
  Assert.IsFalse(StringUtil.AreAnagrams("nzagramm", "nagarzamm"));
  Assert.IsFalse(StringUtil.AreAnagrams("anzagramm", "nag12arz1amm"));
}

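Answer 5 (score: 2)

Use an ASCII hash map, which allows O(1) lookup for each character.

The Java example listed above converts to lowercase, which seems incomplete. I have an example in C that simply initializes a hash-map array indexed by ASCII values to '-1'.

If the length of string2 differs from that of string1, the strings are not anagrams.

Otherwise, we set the hash-map value for every char in string1 and string2 to 0.

Then, for each char in string1, we increment the count in the hash map. Similarly, we decrement the count for each char in string2.

If the strings are anagrams, every resulting value should be 0. If not, some positive value set by string1 remains.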

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAYMAX 128

#define True        1
#define False       0

int isAnagram(const char *string1, 
            const char *string2) {

    int str1len = strlen(string1);
    int str2len = strlen(string2);

    if (str1len != str2len) /* Simple string length test */
        return False;

    int * ascii_hashtbl = (int * ) malloc((sizeof(int) * ARRAYMAX));
    if (ascii_hashtbl == NULL) {
        fprintf(stderr, "Memory allocation failed\n");
        return -1;
    }
    memset((void *)ascii_hashtbl, -1, sizeof(int) * ARRAYMAX);
    int index = 0;
    while (index < str1len) { /* Populate hash_table for each ASCII value 
                                in string1*/
        ascii_hashtbl[(int)string1[index]] = 0;
        ascii_hashtbl[(int)string2[index]] = 0;
        index++;
    }
    index = index - 1;
    while (index >= 0) {
        ascii_hashtbl[(int)string1[index]]++; /* Increment something */
        ascii_hashtbl[(int)string2[index]]--; /* Decrement something */
        index--;
    }
    /* Use hash_table to compare string2 */
    index = 0;
    while (index < str1len) {
        if (ascii_hashtbl[(int)string1[index]] != 0) {
            /* some char is missing in string2 from string1 */
            free(ascii_hashtbl);
            ascii_hashtbl = NULL;
            return False;
        }
        index++;
    }
    free(ascii_hashtbl);
    ascii_hashtbl = NULL;
    return True;
}

int main () {
    char array1[ARRAYMAX], array2[ARRAYMAX];
    int flag;

    printf("Enter the string\n");
    fgets(array1, ARRAYMAX, stdin);
    printf("Enter another string\n");
    fgets(array2, ARRAYMAX, stdin);

    array1[strcspn(array1, "\r\n")] = 0;
    array2[strcspn(array2, "\r\n")] = 0;
    flag = isAnagram(array1, array2);
    if (flag == 1)
        printf("%s and %s are anagrams.\n", array1, array2);
    else if (flag == 0)
        printf("%s and %s are not anagrams.\n", array1, array2);

    return 0;
}

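Answer 6 (score: 2)

Code to find whether two words are anagrams:

The logic has already been explained in several answers, and a few people asked for code. This solution produces the result in O(n) time.

This approach counts the number of occurrences of each character and stores it at the character's ASCII position in a separate array for each string. The two count arrays are then compared. If they are not equal, the given strings are not anagrams.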

public boolean isAnagram(String str1, String str2)
{
    //To get the no of occurrences of each character and store it in their ASCII location
    int[] strCountArr1=getASCIICountArr(str1);
    int[] strCountArr2=getASCIICountArr(str2);

    //To Test whether the two arrays have the same count of characters. Array size 256 since ASCII 256 unique values
    for(int i=0;i<256;i++)
    {
        if(strCountArr1[i]!=strCountArr2[i])
            return false;
    }
    return true;
}

public int[] getASCIICountArr(String str)
{
    char c;
    //Array size 256 for ASCII
    int[] strCountArr=new int[256];
    for(int i=0;i<str.length();i++)
    {
        c=str.charAt(i); 
        c=Character.toUpperCase(c);// If both the cases are considered to be the same
        strCountArr[(int)c]++; //To increment the count in the character's ASCII location
    }
    return strCountArr;
}

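Answer 7 (score: 1)

Well, you can probably improve the best case and average case by checking the length first, then doing a quick checksum on the characters (nothing complex, since that would probably be of worse order than the sort; just a sum of the ordinal values), then sorting, then comparing.

If the strings are very short, the checksum overhead will not differ much from the cost of the sort in many languages.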
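
To make the order of the checks concrete, here is a small Python sketch of that pipeline. The name `are_anagrams_tiered` is invented for the example. The cheap tests only rule strings out; the sort and compare at the end still makes the final decision, because different strings can share the same length and checksum.

```python
def are_anagrams_tiered(s1: str, s2: str) -> bool:
    # Tier 1: different lengths can be rejected in O(1).
    if len(s1) != len(s2):
        return False
    # Tier 2: a plain sum of ordinal values, O(n), rejects many mismatches.
    if sum(map(ord, s1)) != sum(map(ord, s2)):
        return False
    # Tier 3: sorted() returns new lists, so the inputs stay unchanged.
    return sorted(s1) == sorted(s2)


print(are_anagrams_tiered("abcde", "abced"))
print(are_anagrams_tiered("abcde", "abcfed"))
```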

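Answer 8 (score: 1)

Let's pose the problem: given two strings s and t, write a function to determine whether t is an anagram of s.

For example, s = "anagram", t = "nagaram" returns true; s = "rat", t = "car" returns false.

Method 1 (using a HashMap):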

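import java.util.HashMap;
import java.util.Map;

public class Method1 {

    public static void main(String[] args) {
        String a = "protijayi";
        String b = "jayiproti";
        System.out.println(isAnagram(a, b ));// output => true

    }

    private static boolean isAnagram(String a, String b) {
        // Without this check a shorter b built from a's letters would pass.
        if (a.length() != b.length()) { return false; }
        Map<Character, Integer> map = new HashMap<>();
        // Count every character of a.
        for (char c : a.toCharArray()) {
            map.put(c, map.getOrDefault(c, 0) + 1);
        }
        // Use up one count per character of b; a missing or exhausted
        // character means the strings differ.
        for (char c : b.toCharArray()) {
            int count = map.getOrDefault(c, 0);
            if (count == 0) { return false; }
            else { map.put(c, count - 1); }
        }

        return true;
    }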

}

Method 2:

public class Method2 {
public static void main(String[] args) {
    String a = "protijayi";
    String b = "jayiproti";


    System.out.println(isAnagram(a, b));// output=> true
}

private static boolean isAnagram(String a, String b) {


    int[] alphabet = new int[26];
    for(int i = 0 ; i < a.length() ;i++) {
         alphabet[a.charAt(i) - 'a']++ ;
    }
    for (int i = 0; i < b.length(); i++) {
         alphabet[b.charAt(i) - 'a']-- ;
    }

    for(  int w :  alphabet ) {
         if(w != 0 ) {return false;}
    }
    return true;

}
}

Method 3:

import java.util.Arrays;

public class Method3 {
public static void main(String[] args) {
    String a = "protijayi";
    String b = "jayiproti";


    System.out.println(isAnagram(a, b ));// output => true
}

private static boolean isAnagram(String a, String b) {
    char[] ca = a.toCharArray() ;
    char[] cb = b.toCharArray();
    Arrays.sort(   ca     );

    Arrays.sort(   cb        );
    return Arrays.equals(ca , cb );
}
}

Method 4:

import java.util.LinkedHashMap;
import java.util.Map;

public class AnagramsOrNot {
    public static void main(String[] args) {
        String a = "Protijayi";
        String b = "jayiProti";
        isAnagram(a, b);
    }

    private static void isAnagram(String a, String b) {
        Map<Integer, Integer> map = new LinkedHashMap<>();

        // Add one per code point of a, then subtract one per code point of b.
        a.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) + 1));
        System.out.println(map);
        b.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) - 1));
        System.out.println(map);
        // The strings are anagrams only if every count has returned to zero.
        if (map.values().stream().allMatch(count -> count == 0)) {
            System.out.println("Anagrams");
        } else {
            System.out.println("Not Anagrams");
        }
    }
}

In Python:

def areAnagram(a, b):
    if len(a) != len(b): return False
    count1 = [0] * 256
    count2 = [0] * 256
    for i in a:count1[ord(i)] += 1
    for i in b:count2[ord(i)] += 1

    for i in range(256):
        if(count1[i] != count2[i]):return False    

    return True


str1 = "Giniiii"
str2 = "Protijayi"
print(areAnagram(str1, str2))

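Let's take another famous interview question: group the anagrams in a given string:

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupAnagrams {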
    public static void main(String[] args) {
        String a = "Gini Gina Protijayi iGin aGin jayiProti Soudipta";
        Map<String, List<String>> map = Arrays.stream(a.split(" ")).collect(Collectors.groupingBy(GroupAnagrams::sortedString));
        System.out.println("MAP => " + map);
        map.forEach((k,v) -> System.out.println(k +" and the anagrams are =>" + v ));
        /*
         Look at the Map output:
        MAP => {Giin=[Gini, iGin], Paiijorty=[Protijayi, jayiProti], Sadioptu=[Soudipta], Gain=[Gina, aGin]}
        As we can see, there are multiple Lists. Hence, we have to use a flatMap(List::stream)
        Now, Look at the output:
        Paiijorty and the anagrams are =>[Protijayi, jayiProti]

        Now, look at this output:
        Sadioptu and the anagrams are =>[Soudipta]
        List contains only word. No anagrams.
        That means we have to work with map.values(). List contains all the anagrams.


        */
        String stringFromMapHavingListofLists = map.values().stream().flatMap(List::stream).collect(Collectors.joining(" "));
        System.out.println(stringFromMapHavingListofLists);
    }

    public static String sortedString(String a) {
        String sortedString = a.chars().sorted()
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append).toString();

        return sortedString;

    }

    /*
     * The output : Gini iGin Protijayi jayiProti Soudipta Gina aGin
     * All the anagrams are side by side.
     */
}

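Answer 9 (score: 1)

If the strings contain only ASCII characters:

  1. Create an array of length 256.
  2. Traverse the first string and increment the counter in the array at index = ASCII value of each character. Keep counting characters as you go so that you know the length when you reach the end of the string.
  3. Traverse the second string and decrement the counter in the array at index = ASCII value. If the value is 0 before decrementing, return false, because the strings are not anagrams. Also keep track of the length of the second string.
  4. At the end of the traversal, return true if the two lengths are equal, otherwise false.
  5. If the strings can contain Unicode characters, use a hash map instead of an array to track the frequencies. The rest of the algorithm stays the same.

    Notes:

    1. Computing the length while adding characters to the array ensures that each string is traversed only once.
    2. Using an array for ASCII-only strings optimizes space, as requested.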
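
Steps 1 to 4 could be sketched in Python as below. The function name `is_anagram_ascii` is hypothetical. The manual length counters mirror the description above, where the length is only known once the end of the string is reached; in Python itself `len()` is already O(1), so they are here purely for illustration.

```python
def is_anagram_ascii(first: str, second: str) -> bool:
    # Assumption: every character has a code below 256, as in step 1.
    counts = [0] * 256

    # Step 2: count the characters of the first string and its length.
    first_len = 0
    for ch in first:
        counts[ord(ch)] += 1
        first_len += 1

    # Step 3: consume counts with the second string, stopping early
    # when a character has no remaining count.
    second_len = 0
    for ch in second:
        idx = ord(ch)
        if counts[idx] == 0:
            return False
        counts[idx] -= 1
        second_len += 1

    # Step 4: equal lengths mean nothing from the first string is left over.
    return first_len == second_len


print(is_anagram_ascii("abcde", "abced"))
print(is_anagram_ascii("abcde", "abcfed"))
```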

Answer 10 (score: 1)

static bool IsAnagram(string s1, string s2)
        {

            if (s1.Length != s2.Length)
                return false;
            else
            {
                int sum1 = 0;
                for (int i = 0; i < s1.Length; i++)
                sum1 += (int)s1[i]-(int)s2[i];
                if (sum1 == 0)
                    return true;
                else
                    return false;
            }
        }

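Answer 11 (score: 1)

For a known (and small) set of valid letters (such as ASCII), use a table of counts, one per valid letter. The first string increments the counts and the second string decrements them. Finally, walk through the table to see whether all counts are zero (the strings are anagrams) or whether any non-zero value remains (the strings are not anagrams). Make sure to convert all characters to upper case (or lower case, it makes no difference) and to ignore whitespace.

For a large set of valid letters (such as Unicode), do not use a table; use a hash table instead. It has O(1) time for insertion, lookup and removal, and O(n) space. Letters from the first string increment their counts and letters from the second string decrement them. Remove counts that become zero from the hash table. If the hash table is empty at the end, the strings are anagrams. Alternatively, the search terminates with a negative result as soon as any count becomes negative.

A detailed explanation and a C# implementation are available here: Testing If Two Strings are Anagrams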

Answer 12 (score: 1)

How about XOR'ing the two strings? It is definitely O(n).

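#include <stdio.h>
#include <string.h>

int main(void) {
    const char *arr1 = "ab cde";
    const char *arr2 = "edcb a";
    size_t n1 = strlen(arr1);
    size_t n2 = strlen(arr2);
    int c = 0;  /* running XOR of every character of both strings */

    /* to check for anagram */
    if (n1 != n2) {
        printf("\nNot anagram");
        return 0;  /* stop here so that only one verdict is printed */
    }
    /* Both strings have the same length here, so one index is enough. */
    for (size_t i = 0; i < n1; i++) {
        c ^= ((int)arr1[i] ^ (int)arr2[i]);
    }

    /* Note: c == 0 only shows that the XOR checksums agree. */
    if (c == 0) {
        printf("\nAnagram");
    } else {
        printf("\nNot anagram");
    }
    return 0;
}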
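
One caveat worth checking before relying on the XOR idea: a zero XOR is necessary for an anagram, but it does not seem to be sufficient. The Python sketch below (helper names `xor_signature` and `xor_says_anagram` are made up here) reproduces the check and shows a pair that passes it without being an anagram.

```python
from functools import reduce
from operator import xor


def xor_signature(s: str) -> int:
    """XOR of the code points of all characters in s."""
    return reduce(xor, map(ord, s), 0)


def xor_says_anagram(s1: str, s2: str) -> bool:
    # Same test as the XOR approach: equal length and combined XOR of zero.
    return len(s1) == len(s2) and (xor_signature(s1) ^ xor_signature(s2)) == 0


print(xor_says_anagram("ab cde", "edcb a"))  # a genuine anagram pair
print(xor_says_anagram("aa", "bb"))          # passes the XOR test as well
print(sorted("aa") == sorted("bb"))          # but sorting shows they differ
```

Any character that appears an even number of times cancels itself out, so the XOR can only serve as a quick pre-filter before a real comparison.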

Answer 13 (score: 1)

How about this?

a = "lai d"
b = "di al"

# Drop spaces and lower-case every character, then sort; the two strings
# are anagrams exactly when the sorted lists are equal.
sorteda = sorted(ch.lower() for ch in a if ch != " ")
sortedb = sorted(ch.lower() for ch in b if ch != " ")

print(sortedb)
print(sorteda)

print(sortedb == sorteda)

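Answer 14 (score: 0)

If the two strings have the same length, continue; otherwise the strings are not anagrams.

Iterate over each string while summing the ordinal value of each character. If the sums are equal, the strings are anagrams.

Example: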

    public Boolean AreAnagrams(String inOne, String inTwo) {

        bool result = false;

        if(inOne.Length == inTwo.Length) {

            int sumOne = 0;
            int sumTwo = 0;

            for(int i = 0; i < inOne.Length; i++) {

                sumOne += (int)inOne[i];
                sumTwo += (int)inTwo[i];
            }

            result = sumOne == sumTwo;
        }

        return result;
    }
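
It may be worth testing the sum idea against a few non-anagrams before using it. In the Python sketch below (the helper name `sum_says_anagram` is invented for the example), equal sums turn out to be a necessary condition only: "ac" and "bb" have the same length and the same ordinal sum, yet they are not anagrams.

```python
def sum_says_anagram(s1: str, s2: str) -> bool:
    # Same test as the sum approach: equal length and equal sum of ordinals.
    return len(s1) == len(s2) and sum(map(ord, s1)) == sum(map(ord, s2))


print(sum_says_anagram("james", "amesj"))  # a genuine anagram pair
print(sum_says_anagram("ac", "bb"))        # 97 + 99 == 98 + 98, so it passes
print(sorted("ac") == sorted("bb"))        # but the letters are different
```

So a matching sum is best treated as a cheap first filter, followed by a frequency count or a sort for the actual decision.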

Answer 15 (score: 0)

/* Program to find the strings are anagram or not*/
/* Author Senthilkumar M*/

Eg. 
    Anagram:
    str1 = stackoverflow
    str2 = overflowstack

    Not anagram:
    str1 = stackforflow
    str2 = stacknotflow

int is_anagram(char *str1, char *str2)
{
        int l1 = strlen(str1);
        int l2 = strlen(str2);
        int s1 = 0, s2 = 0;
        int i = 0;

        /* if both the string are not equal it is not anagram*/
        if(l1 != l2) {
                return 0;
        }
        /* sum up the character in the strings 
           if the total sum of the two strings is not equal
           it is not anagram */
        for( i = 0; i < l1; i++) {
                s1 += str1[i];
                s2 += str2[i];
        }
        if(s1 != s2) {
                return 0;
        }
        return 1;
}

Answer 16 (score: 0)

The following implementation seems to work too; could you check it?

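#include <string.h>

int is_anagram(const char *str1, const char *str2) {
    int histo[256] = {0};
    size_t len = strlen(str1);

    /* Different lengths: not an anagram, and str2[i] below would run
     * past the end of the shorter string. */
    if (len != strlen(str2))
        return 0;

    for (size_t i = 0; i < len; ++i) {
        /* Just inc and dec every char count and
         * check the histogram against 0 in the 2nd loop */
        ++histo[(unsigned char)str1[i]];
        --histo[(unsigned char)str2[i]];
    }

    for (int i = 0; i < 256; ++i) {
        if (histo[i] != 0)
            return 0; /* not an anagram */
    }

    return 1; /* an anagram */
}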

Answer 17 (score: 0)

An implementation in Swift 3:

func areAnagrams(_ str1: String, _ str2: String) -> Bool {
    return dictionaryMap(forString: str1) == dictionaryMap(forString: str2)
}

func dictionaryMap(forString str: String) -> [String : Int] {
    var dict : [String : Int] = [:]
    for i in 0..<str.characters.count {
        if let count = dict[str[i]] {
            dict[str[i]] = count + 1
        }else {
            dict[str[i]] = 1
        }
    }        
    return dict
}
//To easily subscript characters
extension String {
    subscript(i: Int) -> String {
        return String(self[index(startIndex, offsetBy: i)])
    }
}

Answer 18 (score: 0)

import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

/**
 * --------------------------------------------------------------------------
 * Finding Anagrams in the given dictionary. Anagrams are words that can be
 * formed from other words Ex :The word "words" can be formed using the word
 * "sword"
 * --------------------------------------------------------------------------
 * Input : if choose option 2 first enter no of word want to compare second
 * enter word ex:
 * 
 * Enter choice : 1:To use Test Cases 2: To give input 2 Enter the number of
 * words in dictionary 
 * 6
 * viq 
 * khan
 * zee 
 * khan
 * am
 *    
 * Dictionary : [ viq khan zee khan am]
 * Anagrams 1:[khan, khan]
 * 
 */
public class Anagrams {

    public static void main(String args[]) {
        // User Input or just use the testCases
        int choice;
        @SuppressWarnings("resource")
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter choice : \n1:To use Test Cases 2: To give input");
        choice = scan.nextInt();
        switch (choice) {
        case 1:
            testCaseRunner();
            break;
        case 2:
            userInput();
        default:
            break;
        }
    }

    private static void userInput() {
        @SuppressWarnings("resource")
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter the number of words in dictionary");
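        int number = scan.nextInt();
        // Consume the rest of the line after the number; otherwise the first
        // nextLine() below returns an empty string.
        scan.nextLine();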
        String dictionary[] = new String[number];
        //
        for (int i = 0; i < number; i++) {
            dictionary[i] = scan.nextLine();
        }
        printAnagramsIn(dictionary);

    }

    /**
     * provides a some number of dictionary of words
     */
    private static void testCaseRunner() {

        String dictionary[][] = { { "abc", "cde", "asfs", "cba", "edcs", "name" },
                { "name", "mane", "string", "trings", "embe" } };
        for (int i = 0; i < dictionary.length; i++) {
            printAnagramsIn(dictionary[i]);
        }
    }

    /**
     * Prints the set of anagrams found the give dictionary
     * 
     * logic is sorting the characters in the given word and hashing them to the
     * word. Data Structure: Hash[sortedChars] = word
     */
    private static void printAnagramsIn(String[] dictionary) {
        System.out.print("Dictionary : [");// + dictionary);
        for (String each : dictionary) {
            System.out.print(each + " ");
        }
        System.out.println("]");
        //

        Map<String, ArrayList<String>> map = new LinkedHashMap<String, ArrayList<String>>();
        // review comment: naming convention: dictionary contains 'word' not
        // 'each'
        for (String each : dictionary) {
            char[] sortedWord = each.toCharArray();
            // sort dic value
            Arrays.sort(sortedWord);
            //input word
            String sortedString = new String(sortedWord);
            //
            ArrayList<String> list = new ArrayList<String>();
            if (map.keySet().contains(sortedString)) {
                list = map.get(sortedString);
            }
            list.add(each);
            map.put(sortedString, list);
        }
        // print anagram
        int i = 1;
        for (String each : map.keySet()) {
            if (map.get(each).size() != 1) {
                System.out.println("Anagrams " + i + ":" + map.get(each));
                i++;
            }
        }
    }
}

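Answer 19 (score: 0)

I guess your sorting algorithm is not really O(log n), is it?

The best algorithm you can get is O(n), because you have to examine every character.

You can use two tables to store the count of each letter in each word, fill them in O(n) and compare them in O(1).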

Answer 20 (score: -1)

In Java we can also do it like this; the logic is very simple

import java.util.*;

class Anagram
{
 public static void main(String args[]) throws Exception
 {
  Boolean FLAG=true;

  Scanner sc= new Scanner(System.in);

  System.out.println("Enter 1st string");

  String s1=sc.nextLine();

  System.out.println("Enter 2nd string");

  String s2=sc.nextLine();

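  int i,j;
  i=s1.length();
  j=s2.length();

  if(i==j)
  {
   // used[l] marks characters of s2 that are already matched,
   // so repeated letters are counted correctly
   boolean[] used=new boolean[j];
   for(int k=0;k<i && FLAG;k++)
   {
    // every character of s1 must find its own unused partner in s2
    FLAG=false;
    for(int l=0;l<j;l++)
    {
     if(!used[l] && s1.charAt(k)==s2.charAt(l))
     {
      used[l]=true;
      FLAG=true;
      break;
     }
    }
   }
  }
  else
  FLAG=false;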
  if(FLAG)
  System.out.println("Given Strings are anagrams");
  else
  System.out.println("Given Strings are not anagrams");
 }
}

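Answer 21 (score: -1)

How about converting the characters to their int values and summing them up:

If the sums are equal, then the strings are anagrams of each other.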

def are_anagram1(s1, s2):
    return [False, True][sum([ord(x) for x in s1]) == sum([ord(x) for x in s2])]

s1 = 'james'
s2 = 'amesj'
print(are_anagram1(s1,s2))

This solution only works for 'A' to 'Z' and 'a' to 'z'.