Explain using a bit vector to determine whether all characters are unique

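Date: 2012-02-04 15:10:40

Tags: java string bit-manipulation bitvector

I am confused about how a bit vector would work here (I am not too familiar with bit vectors). This is the code given. Could someone please walk me through it?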

public static boolean isUniqueChars(String str) {
    int checker = 0;
    for (int i = 0; i < str.length(); ++i) {
        int val = str.charAt(i) - 'a';
        if ((checker & (1 << val)) > 0) return false;
        checker |= (1 << val);
    }
    return true;
}

In particular, what is checker doing?

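12 answers:

Answer 0 (score: 176)

I have a sneaking suspicion you got this code from the same book I'm reading... The code itself isn't nearly as cryptic as the operators |=, & and <<, which most of us don't normally use. The author didn't take the extra time to explain the process or the actual mechanics involved. I was content with the previous answer on this thread at first, but only on an abstract level. I came back to it because I felt a more concrete explanation was needed; the lack of one always leaves me with an uneasy feeling.

The operator << is a bitwise left shift. It takes the binary representation of the left operand and shifts it left by the number of places given by the right operand, much like shifting digits in a decimal number, only in binary. Each place moved multiplies by the base, which is 2 here rather than 10, so the number on the right is the exponent and the number on the left is multiplied by that power of 2.

The operator |= takes the operand on the left and ORs it with the operand on the right, storing the result back in the left operand. The operator & ANDs the bits of the operands on its left and right.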
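To make the three operators concrete, here is a minimal Java sketch. The class name BitOperatorDemo and the helper bits8 are made up for illustration and are not part of the original code; the operand values are arbitrary.

```java
final class BitOperatorDemo {
    // Pads the binary form of a small non-negative int to 8 digits so values line up.
    static String bits8(int value) {
        return String.format("%8s", Integer.toBinaryString(value)).replace(' ', '0');
    }

    public static void main(String[] args) {
        int shifted = 1 << 3;            // 1 moved three places left: 2^3 = 8
        int combined = 0b0101 | 0b0011;  // OR keeps a bit that is set in either operand
        int common = 0b0101 & 0b0011;    // AND keeps a bit that is set in both operands

        System.out.println(bits8(shifted));   // 00001000
        System.out.println(bits8(combined));  // 00000111
        System.out.println(bits8(common));    // 00000001
    }
}
```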

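So what we have here is a hash table of sorts, stored in a 32-bit binary number. Every time checker gets OR'd (checker |= (1 << val)) with the designated binary value of a letter, that letter's corresponding bit is set to true. The character's value is AND'd with checker ((checker & (1 << val)) > 0). If the result is greater than 0, we know we have a dupe, because two identical bits set to true AND'd together return true, or '1'.

There are 26 binary places, each of which corresponds to a lowercase letter. The author did say to assume the string contains only lowercase letters; that is because a 32-bit integer leaves only 6 more places to use, and after that we get collisions.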

00000000000000000000000000000001 a 2^0

00000000000000000000000000000010 b 2^1

00000000000000000000000000000100 c 2^2

00000000000000000000000000001000 d 2^3

00000000000000000000000000010000 e 2^4

00000000000000000000000000100000 f 2^5

00000000000000000000000001000000 g 2^6

00000000000000000000000010000000 h 2^7

00000000000000000000000100000000 i 2^8

00000000000000000000001000000000 j 2^9

00000000000000000000010000000000 k 2^10

00000000000000000000100000000000 l 2^11

00000000000000000001000000000000 m 2^12

00000000000000000010000000000000 n 2^13

00000000000000000100000000000000 o 2^14

00000000000000001000000000000000 p 2^15

00000000000000010000000000000000 q 2^16

00000000000000100000000000000000 r 2^17

00000000000001000000000000000000 s 2^18

00000000000010000000000000000000 t 2^19

00000000000100000000000000000000 u 2^20

00000000001000000000000000000000 v 2^21

00000000010000000000000000000000 w 2^22

00000000100000000000000000000000 x 2^23

00000001000000000000000000000000 y 2^24

00000010000000000000000000000000 z 2^25
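The table above can be regenerated with a few lines of Java. This is only a sketch under the same lowercase-only assumption; the class name LetterMasks and the method maskFor are hypothetical names chosen for this example.

```java
final class LetterMasks {
    // Returns the 32-character binary form of the single-bit mask for a lowercase letter.
    static String maskFor(char letter) {
        int mask = 1 << (letter - 'a');  // 'a' -> bit 0, 'z' -> bit 25
        return String.format("%32s", Integer.toBinaryString(mask)).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Prints one row per letter in the same layout as the table: bits, letter, power of two.
        for (char c = 'a'; c <= 'z'; c++) {
            System.out.println(maskFor(c) + " " + c + " 2^" + (c - 'a'));
        }
    }
}
```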

So, for an input string 'azya', as we move step by step:

string 'a'

a      =00000000000000000000000000000001
checker=00000000000000000000000000000000

a and checker=0 no dupes condition

checker='a' or checker;
// checker now becomes = 00000000000000000000000000000001
checker=00000000000000000000000000000001

string 'az'

checker=00000000000000000000000000000001
z      =00000010000000000000000000000000

z and checker=0 no dupes

checker=z or checker;
// checker now becomes 00000010000000000000000000000001

string 'azy'

checker= 00000010000000000000000000000001
y      = 00000001000000000000000000000000

checker and y=0 no dupes condition

checker= checker or y;
// checker now becomes = 00000011000000000000000000000001

string 'azya'

checker= 00000011000000000000000000000001
a      = 00000000000000000000000000000001

a and checker=1 we have a dupe

Now, it declares a duplicate.
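The same trace can be reproduced in code. The sketch below follows the logic of isUniqueChars but prints checker after each accepted character and reports where the first repeat occurs. The names CheckerTrace, firstDuplicateIndex and pad32 are invented for this illustration, and the input is assumed to contain only 'a' to 'z'.

```java
final class CheckerTrace {
    // Formats an int as 32 binary digits.
    static String pad32(int value) {
        return String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
    }

    // Returns the index of the first repeated character, or -1 if all are unique.
    static int firstDuplicateIndex(String str) {
        int checker = 0;
        for (int i = 0; i < str.length(); i++) {
            int mask = 1 << (str.charAt(i) - 'a');
            if ((checker & mask) != 0) {
                return i;  // this letter's bit was already set
            }
            checker |= mask;
            System.out.println(str.charAt(i) + " -> checker=" + pad32(checker));
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicateIndex("azya"));  // 3: the second 'a'
    }
}
```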

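Answer 1 (score: 85)

int checker is used here as storage for bits. Every bit in the integer value can be treated as a flag, so in the end the int is an array of bits (flags). Each bit in your code states whether the character with that bit's index was found in the string or not. You could use a bit vector instead of int for the same purpose. There are two differences between them:

  • Size. int has a fixed size, usually 4 bytes (always 4 in Java), which means 8 * 4 = 32 bits (flags). A bit vector can usually be of different sizes, or you have to specify the size in the constructor.

  • API. With a bit vector you will have easier-to-read code, probably something like this: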

    vector.SetFlag(4, true); // set flag at index 4 as true

    For int you will have lower-level bit logic code:

    checker |= (1 << 5); // set flag at index 5 to true

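Also, int will probably be a little faster, because bit operations are very low level and can be executed as-is by the CPU. BitVector lets you write slightly less cryptic code instead, and it can store more flags.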
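As a rough side-by-side of the two styles in Java: the sketch below sets and reads one flag with raw int bit logic and with java.util.BitSet (a real JDK class; vector.SetFlag above is the answer's own placeholder API, not a Java method). The class and method names here are hypothetical.

```java
import java.util.BitSet;

final class FlagStorageDemo {
    // Sets one flag in an int and reads it back; index is assumed to be 0..31.
    static boolean intFlagRoundTrip(int index) {
        int flags = 0;
        flags |= (1 << index);               // set the flag
        return (flags & (1 << index)) != 0;  // test the flag
    }

    // Same round trip with BitSet, which grows as needed and is not limited to 32 flags.
    static boolean bitSetFlagRoundTrip(int index) {
        BitSet flags = new BitSet();
        flags.set(index);
        return flags.get(index);
    }

    public static void main(String[] args) {
        System.out.println(intFlagRoundTrip(5));      // true
        System.out.println(bitSetFlagRoundTrip(5));   // true
        System.out.println(bitSetFlagRoundTrip(100)); // true, beyond what one int can hold
    }
}
```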

For future reference: a bit vector is also known as a bitSet or bitArray.

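Answer 2 (score: 28)

I also assume that your example comes from the book Cracking the Coding Interview, and my answer relates to that context.

In order to use this algorithm to solve the problem, we have to accept that we are only going to pass characters from a to z (lowercase).

As there are only 26 letters, and these are properly sorted in the encoding table we use, this guarantees that all the potential differences str.charAt(i) - 'a' will be less than 32 (the size of the int variable checker).

As explained by Snowbear, we are going to use the checker variable as an array of bits. Let's work through an example: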

Let's say str equals "test"

  • First pass (i = t)

checker == 0 (00000000000000000000000000000000)

In ASCII, val = str.charAt(i) - 'a' = 116 - 97 = 19
What about 1 << val ?
1          == 00000000000000000000000000000001
1 << 19    == 00000000000010000000000000000000
checker |= (1 << val) means checker = checker | (1 << val)
so checker = 00000000000000000000000000000000 | 00000000000010000000000000000000
checker == 524288 (00000000000010000000000000000000)

  • Second pass (i = e)

checker == 524288 (00000000000010000000000000000000)

val = 101 - 97 = 4
1          == 00000000000000000000000000000001
1 << 4     == 00000000000000000000000000010000
checker |= (1 << val)
so checker = 00000000000010000000000000000000 | 00000000000000000000000000010000
checker == 524304 (00000000000010000000000000010000)

and so on, until we find a bit already set in checker for a specific character, via the condition

(checker & (1 << val)) > 0
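The decimal values in this walkthrough can be checked with a short sketch. The class name TestWalkthrough and the method checkerAfter are hypothetical; the method simply ORs in the bit of every character of a lowercase prefix.

```java
final class TestWalkthrough {
    // Returns the checker value after all characters of the prefix have been recorded.
    static int checkerAfter(String prefix) {
        int checker = 0;
        for (int i = 0; i < prefix.length(); i++) {
            checker |= 1 << (prefix.charAt(i) - 'a');
        }
        return checker;
    }

    public static void main(String[] args) {
        System.out.println(checkerAfter("t"));   // 524288 (bit 19)
        System.out.println(checkerAfter("te"));  // 524304 (bits 19 and 4)

        // By the time the second 't' of "test" arrives, bit 19 is already set.
        int checker = checkerAfter("tes");
        System.out.println((checker & (1 << ('t' - 'a'))) > 0);  // true
    }
}
```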

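Hope it helps

Answer 3 (score: 21)

I think all these answers explain how this works, but I felt like sharing how I came to see it more clearly, by renaming some variables, adding others, and adding comments: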

public static boolean isUniqueChars(String str) {

    /*
    checker is the bit array, it will have a 1 on the character index that
    has appeared before and a 0 if the character has not appeared, you
    can see this number initialized as 32 0 bits:
    00000000 00000000 00000000 00000000
     */
    int checker = 0;

    //loop through each String character
    for (int i = 0; i < str.length(); ++i) {
        /*
        a through z in ASCII are characters numbered 97 through 122, 26 characters total
        with this, you get a number between 0 and 25 to represent each character index
        0 for 'a' and 25 for 'z'

        renamed 'val' as 'characterIndex' to be more descriptive
         */
        int characterIndex = str.charAt(i) - 'a'; //char 'a' would get 0 and char 'z' would get 25

        /*
        created a new variable to make things clearer 'singleBitOnPosition'

        It is used to calculate a number that represents the bit value of having that 
        character index as a 1 and the rest as a 0, this is achieved
        by getting the single digit 1 and shifting it to the left as many
        times as the character index requires
        e.g. character 'd'
        00000000 00000000 00000000 00000001
        Shift 3 spaces to the left (<<) because 'd' index is number 3
        1 shift: 00000000 00000000 00000000 00000010
        2 shift: 00000000 00000000 00000000 00000100
        3 shift: 00000000 00000000 00000000 00001000

        Therefore the number representing 'd' is
        00000000 00000000 00000000 00001000

         */
        int singleBitOnPosition = 1 << characterIndex;

        /*
        This performs an AND between the checker, which is the bit array
        containing everything that has been found before and the number
        representing the bit that will be turned on for this particular
        character. e.g.
        if we have already seen 'a', 'b' and 'd', checker will have:
        checker = 00000000 00000000 00000000 00001011
        And if we see 'b' again:
        'b' = 00000000 00000000 00000000 00000010

        it will do the following:
        00000000 00000000 00000000 00001011
        & (AND)
        00000000 00000000 00000000 00000010
        -----------------------------------
        00000000 00000000 00000000 00000010

        Since this number is different than '0' it means that the character
        was seen before, because on that character index we already have a 
        1 bit value
         */
        if ((checker & singleBitOnPosition) > 0) {
            return false;
        }

        /* 
        Remember that 
        checker |= singleBitOnPosition is the same as  
        checker = checker | singleBitOnPosition
        Sometimes it is easier to see it expanded like that.

        What this achieves is that it builds the checker to have the new 
        value it hasn't seen, by doing an OR between checker and the value 
        representing this character index as a 1. e.g.
        If the character is 'f' and the checker has seen 'g' and 'a', the 
        following will happen

        'f' = 00000000 00000000 00000000 00100000
        checker(seen 'a' and 'g' so far) = 00000000 00000000 00000000 01000001

        00000000 00000000 00000000 00100000
        | (OR)
        00000000 00000000 00000000 01000001
        -----------------------------------
        00000000 00000000 00000000 01100001

        Therefore getting a new checker as 00000000 00000000 00000000 01100001

         */
        checker |= singleBitOnPosition;
    }
    return true;
}

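Answer 4 (score: 6)

Reading Ivan's answer above really helped me, although I would phrase it somewhat differently.

The << in (1 << val) is a bit shifting operator. It takes 1 (which in binary is represented as 000000001, with as many preceding zeroes as you like or as memory allocates) and shifts it to the left by val places. Since we are assuming a-z only and subtracting a each time, each letter gets a value of 0-25, which is that letter's index from the right in checker's boolean representation, because the 1 is moved left val times.

At the end of each check, we see the |= operator. It merges two binary numbers, replacing a 0 with a 1 wherever a 1 exists in either operand at that index. Here, that means wherever a 1 exists in (1 << val), that 1 is copied into checker, while all of checker's existing 1s are preserved.

As you can probably guess, a 1 functions here as a boolean flag for true. When we check whether a character is already represented in the string, we compare checker, which at this point is essentially an array of boolean flags (1 values) at the indexes of the characters already seen, with what is essentially an array of boolean values holding a single 1 flag at the index of the current character.

The & operator accomplishes this check. Similar to |=, the & operator copies over a 1 only if both operands have a 1 at that index. So, essentially, only flags already present in checker that are also represented in (1 << val) are copied over. In this case, that means a 1 appears anywhere in the result of checker & (1 << val) only if the current character has already been seen. And if a 1 is present anywhere in the result of that operation, the result is > 0, and the method returns false.

This is, I'm guessing, why bit vectors are also called bit arrays. Even though they are not of the array data type, they can be used much the way arrays are used to store boolean flags.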
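To illustrate the "array of boolean flags" analogy, here is the same check written with an actual boolean[] in place of the int. This is a sketch for comparison only, not code from the book; the class name BooleanArrayVersion is invented, and the input is assumed to contain only 'a' to 'z' (anything else would throw ArrayIndexOutOfBoundsException).

```java
final class BooleanArrayVersion {
    static boolean isUniqueChars(String str) {
        boolean[] seen = new boolean[26];     // one flag per lowercase letter
        for (int i = 0; i < str.length(); i++) {
            int index = str.charAt(i) - 'a';  // same index the bit version shifts by
            if (seen[index]) {                // plays the role of (checker & (1 << val)) > 0
                return false;
            }
            seen[index] = true;               // plays the role of checker |= (1 << val)
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUniqueChars("abc"));   // true
        System.out.println(isUniqueChars("abca"));  // false
    }
}
```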

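Answer 5 (score: 5)

There are a couple of excellent answers above, so I don't want to repeat everything already said. But I did want to add a couple of things to help with the program above, since I just worked through it myself and had a few questions; after spending some time on it, I understand it better.

First of all, "checker" is used to track the characters that have already been traversed in the String, in order to see whether any character is repeated.

Now, "checker" is an int, so it has only 32 bits or 4 bytes (in Java an int is always 32 bits), which means this program can only work correctly for a character set within a range of 32 characters. That is why the program subtracts 'a' from each character, so that it handles only lowercase characters. If you mix lowercase and uppercase characters, it will not work.

By the way, if you do not subtract 'a' from each character (see the statement below), the program will still work correctly for Strings with only uppercase characters or Strings with only lowercase characters. (For an int, Java uses only the low 5 bits of the shift distance, so 'a' (97) and 'A' (65) both shift by 1.) So the scope of the program grows from lowercase only to uppercase as well, but the two cannot be mixed together.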

int val = str.charAt(i) - 'a'; 
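The behaviour without the subtraction comes from how Java shifts an int: only the five lowest-order bits of the shift distance are used, so shifting by 97 ('a') or by 65 ('A') both behave like shifting by 1. A minimal sketch, with the hypothetical names ShiftMaskDemo and isUniqueNoOffset:

```java
final class ShiftMaskDemo {
    // Variant of the book's method that does NOT subtract 'a'.
    static boolean isUniqueNoOffset(String str) {
        int checker = 0;
        for (int i = 0; i < str.length(); i++) {
            int mask = 1 << str.charAt(i);  // effective shift is charAt(i) & 31
            if ((checker & mask) != 0) {
                return false;
            }
            checker |= mask;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println((1 << 'a') == (1 << 1));  // true: 97 & 31 == 1
        System.out.println((1 << 'A') == (1 << 1));  // true: 65 & 31 == 1
        System.out.println(isUniqueNoOffset("abc")); // true
        System.out.println(isUniqueNoOffset("ABC")); // true
        System.out.println(isUniqueNoOffset("aA"));  // false: 'a' and 'A' collide on bit 1
    }
}
```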

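However, I wanted to write a generic program using bitwise operations that works for any ASCII character, without worrying about uppercase, lowercase, digits or special characters. To do this, our "checker" has to be large enough to store 256 flags (the size of the extended ASCII character set). An int in Java will not do, since it can only store 32 bits. Hence, in the program below, I use the BitSet class available in the JDK, which accepts a user-defined size when a BitSet object is instantiated.

Here is a program that does the same thing as the program above written with bitwise operators, but works for a String with any character from the ASCII character set.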

// Requires: import java.util.BitSet;
public static boolean isUniqueStringUsingBitVectorClass(String s) {

    final int ASCII_CHARACTER_SET_SIZE = 256;

    final BitSet tracker = new BitSet(ASCII_CHARACTER_SET_SIZE);

    // if the string is longer than the character set, some character must repeat
    if(s.length() > ASCII_CHARACTER_SET_SIZE) {
        return false;
    }

    //this will be used to keep the location of each character in String
    final BitSet charBitLocation = new BitSet(ASCII_CHARACTER_SET_SIZE);

    for(int i = 0; i < s.length(); i++) {

        int charVal = s.charAt(i);
        charBitLocation.set(charVal); //set the char location in BitSet

        //check if tracker has already bit set with the bit present in charBitLocation
        if(tracker.intersects(charBitLocation)) {
            return false;
        }

        //set the tracker with new bit from charBitLocation
        tracker.or(charBitLocation);

        charBitLocation.clear(); //clear charBitLocation to store bit for character in the next iteration of the loop

    }

    return true;

}

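Answer 6 (score: 4)

Simple explanation (with JS code below)

  • At the machine level, an integer variable is an array of 32 bits
  • All bitwise operations are 32-bit
  • They are agnostic of the OS / CPU architecture and of the number system chosen by the language, e.g. the 64-bit floating point numbers of JS
  • This duplicate-finding approach is similar to storing characters in an array of size 32, where we set one index if we find a in the string, the next index for b, and so on.
  • A duplicate character in the string will find its corresponding bit already occupied, or, in this case, set to 1.
  • Ivan has already explained how this index calculation works in his earlier answer.

Summary of operations:

  • Perform an AND operation between checker and the bit for the character's index
  • Internally both are Int-32 arrays
  • It is a bitwise operation between these two.
  • Check whether the output of the operation is nonzero
  • If the output is nonzero
    • The bit at that particular index is set in both arrays
    • Thus it is a duplicate.
  • If the output is 0
    • This character has not been found so far
    • Perform an OR operation between checker and the bit for the character's index
    • Thereby updating the bit at that index to 1
    • Assign the output to checker

Assumptions:

  • We have assumed that we will get only lowercase characters
  • And that size 32 is enough
  • Hence, considering that the ASCII code for a is 97, we started counting the index from 96 as the reference point, so a gets index 1

The JavaScript source code follows.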

function checkIfUniqueChars (str) {

    var checker = 0; // 32 or 64 bit integer variable 

    for (var i = 0; i< str.length; i++) {
        var index = str[i].charCodeAt(0) - 96;
        var bitRepresentationOfIndex = 1 << index;

        if ( (checker & bitRepresentationOfIndex) !== 0) {
            console.log(str, false);
            return false;
        } else {
            checker = (checker | bitRepresentationOfIndex);
        }
    }
    console.log(str, true);
    return true;
}

checkIfUniqueChars("abcdefghi");  // true
checkIfUniqueChars("aabcdefghi"); // false
checkIfUniqueChars("abbcdefghi"); // false
checkIfUniqueChars("abcdefghii"); // false
checkIfUniqueChars("abcdefghii"); // false

Note that in JS, although numbers are 64-bit, bitwise operations are always done on 32 bits.

Example: if the string is aa, then:

// checker is initialized to 32-bit-Int(0)
// therefore, checker is
checker= 00000000000000000000000000000000

i = 0

str[0] is 'a'
str[i].charCodeAt(0) - 96 = 1
1 << 1 = 00000000000000000000000000000010

checker 'AND' (1 << 1) = 00000000000000000000000000000000
Boolean(0) == false

// So, we go for the 'OR' operation.

checker = checker OR (1 << 1)
checker = 00000000000000000000000000000010

i = 1

str[1] is 'a'
str[i].charCodeAt(0) - 96 = 1

checker= 00000000000000000000000000000010
a      = 00000000000000000000000000000010

checker 'AND' (1 << 1) = 00000000000000000000000000000010
Boolean(2) == true
// We have our duplicate now

Answer 7 (score: 3)

Let's break down the code line by line.

int checker = 0; We are initializing a checker that will help us find duplicate values.

int val = str.charAt(i) - 'a'; We take the ASCII value of the character at the i-th position of the string and subtract the ASCII value of 'a' from it. Since the assumption is that the string contains lowercase characters only, the number of characters is limited to 26. Hence, 'val' will always be >= 0 and at most 25.

if ((checker & (1 << val)) > 0) return false;

checker |= (1 << val);

Now this is the tricky part. Let us consider an example with the string "abcda". This should return false.

For loop iteration 1:

checker: 00000000000000000000000000000000

val: 97-97 = 0

1 << 0: 00000000000000000000000000000001

checker & (1 << val): 00000000000000000000000000000000, which is not > 0

Hence checker: 00000000000000000000000000000001

For loop iteration 2:

checker: 00000000000000000000000000000001

val: 98-97 = 1

1 << 1: 00000000000000000000000000000010

checker & (1 << val): 00000000000000000000000000000000, which is not > 0

Hence checker: 00000000000000000000000000000011

For loop iteration 3:

checker: 00000000000000000000000000000011

val: 99-97 = 2

1 << 2: 00000000000000000000000000000100

checker & (1 << val): 00000000000000000000000000000000, which is not > 0

Hence checker: 00000000000000000000000000000111

For loop iteration 4:

checker: 00000000000000000000000000000111

val: 100-97 = 3

1 << 3: 00000000000000000000000000001000

checker & (1 << val): 00000000000000000000000000000000, which is not > 0

Hence checker: 00000000000000000000000000001111

For loop iteration 5:

checker: 00000000000000000000000000001111

val: 97-97 = 0

1 << 0: 00000000000000000000000000000001

checker & (1 << val): 00000000000000000000000000000001, which is > 0

Hence return false.

Answer 8 (score: 2)

public static void main (String[] args)
{
    //In order to understand this algorithm, it is necessary to understand the following:

    //int checker = 0;
    //Here we are using the primitive int almost like an array of size 32 where the only values can be 1 or 0
    //Since in Java, we have 4 bytes per int, 8 bits per byte, we have a total of 4x8=32 bits to work with

    //int val = str.charAt(i) - 'a';
    //In order to understand what is going on here, we must realize that all characters have a numeric value
    for (int i = 0; i < 256; i++)
    {
        char val = (char)i;
        System.out.print(val);
    }

    //The output is something like:
    //             !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
    //There seems to be ~15 leading spaces that do not copy paste well, so I had to use real spaces instead

    //To only print the characters from 'a' on forward:
    System.out.println();
    System.out.println();

    for (int i=0; i < 256; i++)
    {
        char val = (char)i;
        //char val2 = val + 'a'; //incompatible types. required: char found: int
        int val2 = val + 'a';  //shift to the 'a', we must use an int here otherwise the compiler will complain
        char val3 = (char)val2;  //convert back to char. there should be a more elegant way of doing this.
        System.out.print(val3);
    }

    //Notice how the following does not work:
    System.out.println();
    System.out.println();

    for (int i=0; i < 256; i++)
    {
        char val = (char)i;
        int val2 = val - 'a';
        char val3 = (char)val2;
        System.out.print(val3);
    }
    //I'm not sure why this spills out into 2 lines:
    //EDIT I cant seem to copy this into stackoverflow!

    System.out.println();
    System.out.println();

    //So back to our original algorithm:
    //int val = str.charAt(i) - 'a';
    //We convert the i'th character of the String to a character, and shift it to the right, since adding shifts to the right and subtracting shifts to the left it seems

    //if ((checker & (1 << val)) > 0) return false;
    //This line is quite a mouthful, lets break it down:
    System.out.println(0<<0);
    //00000000000000000000000000000000
    System.out.println(0<<1);
    //00000000000000000000000000000000
    System.out.println(0<<2);
    //00000000000000000000000000000000
    System.out.println(0<<3);
    //00000000000000000000000000000000
    System.out.println(1<<0);
    //00000000000000000000000000000001
    System.out.println(1<<1);
    //00000000000000000000000000000010 == 2
    System.out.println(1<<2);
    //00000000000000000000000000000100 == 4
    System.out.println(1<<3);
    //00000000000000000000000000001000 == 8
    System.out.println(2<<0);
    //00000000000000000000000000000010 == 2
    System.out.println(2<<1);
    //00000000000000000000000000000100 == 4
    System.out.println(2<<2);
    // == 8
    System.out.println(2<<3);
    // == 16
    System.out.println("3<<0 == "+(3<<0));
    // != 4 why 3???
    System.out.println(3<<1);
    //00000000000000000000000000000011 == 3
    //shift left by 1
    //00000000000000000000000000000110 == 6
    System.out.println(3<<2);
    //00000000000000000000000000000011 == 3
    //shift left by 2
    //00000000000000000000000000001100 == 12
    System.out.println(3<<3);
    // 24

    //It seems that the -  'a' is not necessary
    //Back to if ((checker & (1 << val)) > 0) return false;
    //(1 << val means we simply shift 1 by the numeric representation of the current character
    //the bitwise & works as such:
    System.out.println();
    System.out.println();
    System.out.println(0&0);    //0
    System.out.println(0&1);       //0
    System.out.println(0&2);          //0
    System.out.println();
    System.out.println();
    System.out.println(1&0);    //0
    System.out.println(1&1);       //1
    System.out.println(1&2);          //0
    System.out.println(1&3);             //1
    System.out.println();
    System.out.println();
    System.out.println(2&0);    //0
    System.out.println(2&1);       //0   0010 & 0001 == 0000 = 0
    System.out.println(2&2);          //2  0010 & 0010 == 2
    System.out.println(2&3);             //2  0010 & 0011 = 0010 == 2
    System.out.println();
    System.out.println();
    System.out.println(3&0);    //0    0011 & 0000 == 0
    System.out.println(3&1);       //1  0011 & 0001 == 0001 == 1
    System.out.println(3&2);          //2  0011 & 0010 == 0010 == 2, 0&1 = 0 1&1 = 1
    System.out.println(3&3);             //3 why?? 3 == 0011 & 0011 == 3???
    System.out.println(9&11);   // should be... 1001 & 1011 == 1001 == 8+1 == 9?? yay!

    //so when we do (1 << val), we take 0001 and shift it by say, 97 for 'a', since any 'a' is also 97

    //why is it that the result of bitwise & is > 0 means its a dupe?
    //lets see..

    //0011 & 0011 is 0011 means its a dupe
    //0000 & 0011 is 0000 means no dupe
    //0010 & 0001 is 0000 means its no dupe
    //hmm
    //only when it is all 0000 means its no dupe

    //so moving on:
    //checker |= (1 << val)
    //the |= needs exploring:

    int x = 0;
    int y = 1;
    int z = 2;
    int a = 3;
    int b = 4;
    System.out.println("x|=1 "+(x|=1));  //1
    System.out.println(x|=1);     //1
    System.out.println(x|=1);      //1
    System.out.println(x|=1);       //1
    System.out.println(x|=1);       //1
    System.out.println(y|=1); // 0001 |= 0001 == ?? 1????
    System.out.println(y|=2); // ??? == 3 why??? 0001 |= 0010 == 3... hmm
    System.out.println(y);  //should be 3?? 
    System.out.println(y|=1); //already 3 so... 0011 |= 0001... maybe 0011 again? 3?
    System.out.println(y|=2); //0011 |= 0010..... hmm maybe.. 0011??? still 3? yup!
    System.out.println(y|=3); //0011 |= 0011, still 3
    System.out.println(y|=4);  //0011 |= 0100.. should be... 0111? so... 11? no its 7
    System.out.println(y|=5);  //so we're at 7 which is 0111, 0111 |= 0101 means 0111 still 7
    System.out.println(b|=9); //so 0100 |= 1001 is... seems like xor?? or just or i think, just or... so its 1101 so its 13? YAY!

    //so the |= is just a bitwise OR!
}

public static boolean isUniqueChars(String str) {
    int checker = 0;
    for (int i = 0; i < str.length(); ++i) {
        int val = str.charAt(i) - 'a';  //for all-lowercase input the - 'a' is not strictly necessary: Java uses only the low 5 bits of the shift distance
        if ((checker & (1 << val)) > 0) return false;
        checker |= (1 << val);
    }
    return true;
}

public static boolean is_unique(String input)
{
    int using_int_as_32_flags = 0;
    for (int i=0; i < input.length(); i++)
    {
        int numeric_representation_of_char_at_i = input.charAt(i);
        int using_0001_and_shifting_it_by_the_numeric_representation = 1 << numeric_representation_of_char_at_i; //here we shift the bitwise representation of 1 by the numeric val of the character
        int result_of_bitwise_and = using_int_as_32_flags & using_0001_and_shifting_it_by_the_numeric_representation;
        boolean already_bit_flagged = result_of_bitwise_and != 0;              //nonzero means this bit was already set, so the character was seen before; != 0 also covers bit 31, where the int is negative
        if (already_bit_flagged)
            return false;
        using_int_as_32_flags |= using_0001_and_shifting_it_by_the_numeric_representation;
    }
    return true;
}
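One detail worth noting when the - 'a' offset is dropped: a character such as '_' (code 95, and 95 & 31 == 31) maps to bit 31, the sign bit of an int. A mask with only that bit set is negative, so a "> 0" comparison would miss it, while "!= 0" would not. The sketch below shows the difference; the class and method names are made up for this example.

```java
final class SignBitPitfall {
    // Duplicate test as written with "> 0".
    static boolean seenWithGreaterThan(int checker, int val) {
        return (checker & (1 << val)) > 0;
    }

    // Duplicate test written with "!= 0".
    static boolean seenWithNotEqual(int checker, int val) {
        return (checker & (1 << val)) != 0;
    }

    public static void main(String[] args) {
        int checker = 1 << 31;  // only bit 31 set: Integer.MIN_VALUE, a negative number
        System.out.println(checker == Integer.MIN_VALUE);      // true
        System.out.println(seenWithGreaterThan(checker, 31));  // false: the set bit is missed
        System.out.println(seenWithNotEqual(checker, 31));     // true
    }
}
```

With the original offset and lowercase-only input, only bits 0 to 25 are used, so the two comparisons agree there.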

Answer 9 (score: 0)

The previous posts explain well what the code block does; I want to add my simple solution using the Java BitSet data structure:

// Requires: import java.util.BitSet;
private static String isUniqueCharsUsingBitSet(String string) {
    BitSet bitSet = new BitSet();
    for (int i = 0; i < string.length(); ++i) {
        int val = string.charAt(i);
        if (bitSet.get(val)) return "NO";
        bitSet.set(val);
    }
    return "YES";
}

Answer 10 (score: 0)

Line 1:   public static boolean isUniqueChars(String str) {
Line 2:      int checker = 0;
Line 3:      for (int i = 0; i < str.length(); ++i) {
Line 4:          int val = str.charAt(i) - 'a';
Line 5:          if ((checker & (1 << val)) > 0) return false;
Line 6:         checker |= (1 << val);
Line 7:      }
Line 8:      return true;
Line 9:   }

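Here is how I understood it, using Javascript. Assume the input var inputChar = "abca"; //find if inputChar has all unique characters

Let's start

Line 4: int val = str.charAt(i) - 'a';

Without the subtraction, the line above would take the binary value of the first character in inputChar, which is a. In ASCII a = 97, and 97 converted to binary is 1100001.

In Javascript, for example: "a".charCodeAt().toString(2) returns 1100001

checker = 0 // binary 32-bit representation = 00000000000000000000000000000000

checker = 1100001 | checker; // checker becomes 1100001 (in 32-bit representation it becomes 000000000.....00001100001)

But I want my bitmask (int checker) to have only one bit set per character, and here checker is 1100001

Line 4:          int val = str.charAt(i) - 'a';

This is where the code above comes in handy. I always subtract 97 (the ASCII value of a)

val = 0; // 97 - 97  Which is  a - a
val = 1; // 98 - 97 Which is b - a
val = 2;  // 99 - 97 Which is c - a

Let's use this adjusted val

Lines 5 and 6 are well explained in @Ivan's answer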

Answer 11 (score: 0)

Just in case anyone is looking for the Kotlin equivalent of checking for unique characters in a string using a bit vector

fun isUnique(str: String): Boolean {
    var checker = 0
    for (i in str.indices) {
        val bit = str.get(i) - 'a'
        if (checker.and(1 shl bit) > 0) return false
        checker = checker.or(1 shl bit)
    }
    return true
}

Reference: https://www.programiz.com/kotlin-programming/bitwise