Question

I just realized that the following algorithm for computing the hash code of a stream cannot be implemented with Stream.reduce(...). The problem is that the initial seed of the hash code is 1, which is not an identity for the accumulator.

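The algorithm of List.hashCode():

int hashCode = 1;
for (E e : list)
  hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());

You might be tempted to think that the following is correct, but it is not, although it will work as long as the stream processing is not split up:

List<Object> list = Arrays.asList(1, null, new Object(), 4, 5, 6);
int hashCode = list.stream().map(Objects::hashCode).reduce(1, (a, b) -> 31 * a + b);

It seems that the only sensible way of doing it is to get the Iterator of the Stream and do normal sequential processing, or to collect it into a List first.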
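To see why the seed matters, here is a minimal sketch that imitates a split reduction by hand. The class name `SeedIsNotIdentity` and the helper `reduceInTwoHalves` are made up for illustration; a real parallel stream may split differently, but every chunk starts from the same seed in the same way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.BinaryOperator;

class SeedIsNotIdentity {
    // The accumulator from the reduce(...) call, written as a BinaryOperator.
    static final BinaryOperator<Integer> STEP = (a, b) -> 31 * a + b;

    // Hypothetical helper: reduces each half separately, both starting from
    // the same seed, then combines the two partial results with STEP.
    static int reduceInTwoHalves(List<?> list, int seed) {
        int mid = list.size() / 2;
        int left = list.subList(0, mid).stream()
                .map(Objects::hashCode).reduce(seed, STEP);
        int right = list.subList(mid, list.size()).stream()
                .map(Objects::hashCode).reduce(seed, STEP);
        return STEP.apply(left, right);
    }

    public static void main(String[] args) {
        List<Object> list = Arrays.asList(1, null, "x", 4, 5, 6);
        // An identity would give back 7; the seed 1 gives 31 * 1 + 7.
        System.out.println(STEP.apply(1, 7)); // prints 38
        // The split result does not match the sequential algorithm.
        System.out.println(list.hashCode() == reduceInTwoHalves(list, 1)); // prints false
    }
}
```

The seed is applied once per chunk instead of once overall, and the partial results are combined without accounting for the length of the right-hand chunk.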

Answer 1

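While the hash code algorithm seems at first glance to be non-parallelizable due to its non-associativity, it is possible if we transform the function:

((a * 31 + b) * 31 + c) * 31 + d

to

a * 31 * 31 * 31 + b * 31 * 31 + c * 31 + d

which basically is

a * 31³ + b * 31² + c * 31¹ + d * 31⁰

or, for an arbitrary List of size n:

1 * 31ⁿ + e₀ * 31ⁿ⁻¹ + e₁ * 31ⁿ⁻² + e₂ * 31ⁿ⁻³ + … + eₙ₋₃ * 31² + eₙ₋₂ * 31¹ + eₙ₋₁ * 31⁰

with the first 1 being the initial value of the original algorithm and eₓ being the hash code of the list element at index x. While the summands are now independent of evaluation order, there is obviously a dependency on each element's position. We can solve this by streaming over the indices in the first place, which works for random access lists and arrays, or solve it generally with a collector that tracks the number of encountered objects. The collector can use repeated multiplication for the accumulation and has to resort to the power function only for combining results:

static <T> Collector<T,?,Integer> hashing() {
    return Collector.of(() -> new int[2],
        (a,o)    -> { a[0]=a[0]*31+Objects.hashCode(o); a[1]++; },
        (a1, a2) -> { a1[0]=a1[0]*iPow(31,a2[1])+a2[0]; a1[1]+=a2[1]; return a1; },
        a -> iPow(31,a[1])+a[0]);
}
// derived from http://stackoverflow.com/questions/101439
private static int iPow(int base, int exp) {
    int result = 1;
    for(; exp>0; exp >>= 1, base *= base)
        if((exp & 1)!=0) result *= base;
    return result;
}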
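The combining step of such a collector can be justified with a short derivation. The symbols S, h, p and q are introduced here only for illustration: h(x) is the hash code of an element, S(A) is the seedless sum for a chunk A of length p, and B is the chunk of length q that follows it. All arithmetic wraps modulo 2³², as Java int arithmetic does.

```latex
\begin{align*}
S(A) &= \sum_{i=0}^{p-1} h(a_i)\,31^{\,p-1-i} \\
S(A \mathbin{+\!\!+} B)
  &= \sum_{i=0}^{p-1} h(a_i)\,31^{\,p+q-1-i} + \sum_{j=0}^{q-1} h(b_j)\,31^{\,q-1-j} \\
  &= S(A)\cdot 31^{q} + S(B) \\
\operatorname{hashCode}(A \mathbin{+\!\!+} B)
  &= 31^{\,p+q} + S(A \mathbin{+\!\!+} B) \pmod{2^{32}}
\end{align*}
```

In a two-element int array holding S in slot 0 and the element count in slot 1, the third line is the merge of two partial results, and the last line is the final step that adds the contribution of the seed.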

List<Object> list = Arrays.asList(1,null, new Object(),4,5,6);
int expected = list.hashCode();

int hashCode = list.stream().collect(hashing());
if(hashCode != expected)
    throw new AssertionError();

// works in parallel
hashCode = list.parallelStream().collect(hashing());
if(hashCode != expected)
    throw new AssertionError();

// a method avoiding auto-boxing is more complicated:
int[] result=list.parallelStream().mapToInt(Objects::hashCode)
    .collect(() -> new int[2],
    (a,o)    -> { a[0]=a[0]*31+o; a[1]++; }, // o is already an int hash code here
    (a1, a2) -> { a1[0]=a1[0]*iPow(31,a2[1])+a2[0]; a1[1]+=a2[1]; });
hashCode = iPow(31,result[1])+result[0];

if(hashCode != expected)
    throw new AssertionError();

// random access lists allow a better solution:
hashCode = IntStream.range(0, list.size()).parallel()
    .map(ix -> Objects.hashCode(list.get(ix))*iPow(31, list.size()-ix-1))
    .sum() + iPow(31, list.size());

if(hashCode != expected)
    throw new AssertionError();
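The index-based variant carries over to arrays as well. The following is a minimal sketch under that assumption; the class name `ArrayHashing` and the method `parallelHashCode` are hypothetical, and the result is compared against `Arrays.hashCode(Object[])`, which uses the same algorithm as `List.hashCode()`.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.stream.IntStream;

class ArrayHashing {
    // Integer power by squaring; int overflow wraps, as in the hash algorithm.
    static int iPow(int base, int exp) {
        int result = 1;
        for (; exp > 0; exp >>= 1, base *= base)
            if ((exp & 1) != 0) result *= base;
        return result;
    }

    // Each element contributes hash * 31^(n - index - 1); the seed adds 31^n.
    static int parallelHashCode(Object[] array) {
        int n = array.length;
        return IntStream.range(0, n).parallel()
                .map(ix -> Objects.hashCode(array[ix]) * iPow(31, n - ix - 1))
                .sum() + iPow(31, n);
    }

    public static void main(String[] args) {
        Object[] data = {1, null, "x", 4, 5, 6};
        System.out.println(parallelHashCode(data) == Arrays.hashCode(data));
    }
}
```

Because every summand depends only on its own index, the order in which the parallel stream adds them up does not affect the result.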


Answer 2

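As a first approach, I would use the collect-to-a-list solution as long as you don't have performance concerns. That way you avoid reinventing the wheel, you benefit if the hash algorithm changes one day, and you are also safe if the stream is parallelized (even if I'm not sure that is a real concern).

The way I would implement it can vary depending on how and when you need to compare your different data structures (let's call it Foo).

If you do it by hand and the case is simple, a simple static function is enough: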

public static int computeHash(Foo origin, Collection<Function<Foo, ?>> selectors) {
    return selectors.stream()
            .map(f -> f.apply(origin))
            .collect(Collectors.toList())
            .hashCode();
}

and use it like this:

if(computeHash(foo1, selectors) == computeHash(foo2, selectors)) { ... }

However, if instances of Foo are themselves stored in a Collection and you need both hashCode() and equals() (from Object) to be implemented, I would wrap it inside a FooEqualable:

public final class FooEqualable {
    private final Foo origin;
    private final Collection<Function<Foo, ?>> selectors;

    public FooEqualable(Foo origin, Collection<Function<Foo, ?>> selectors) {
        this.origin = origin;
        this.selectors = selectors;
    }

    @Override
    public int hashCode() {
        return selectors.stream()
                .map(f -> f.apply(origin))
                .collect(Collectors.toList())
                .hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof FooEqualable) {
            FooEqualable that = (FooEqualable) obj;

            Object[] a1 = selectors.stream().map(f -> f.apply(this.origin)).toArray();
            Object[] a2 = selectors.stream().map(f -> f.apply(that.origin)).toArray();

            return Arrays.equals(a1, a2);
        }
        return false;
    }
}

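I'm fully aware that this solution is not optimized (performance-wise) if hashCode() and equals() are called many times, but I tend not to optimize unless it becomes a problem.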
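If repeated calls did become a problem, one possible direction is to evaluate the selectors once and keep the results. This is only a sketch and not part of the answer's code: the generic class `CachedEqualable` is hypothetical, and it assumes the wrapped object does not change after wrapping, since otherwise the cached values would go stale.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class CachedEqualable<T> {
    private final List<Object> values; // selector results, computed once
    private final int hash;            // List.hashCode() of those results

    CachedEqualable(T origin, Collection<Function<T, ?>> selectors) {
        this.values = selectors.stream()
                .map(f -> (Object) f.apply(origin))
                .collect(Collectors.toList());
        this.hash = values.hashCode();
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof CachedEqualable
                && values.equals(((CachedEqualable<?>) obj).values);
    }

    public static void main(String[] args) {
        List<Function<String, ?>> selectors =
                Arrays.asList(String::length, s -> s.charAt(0));
        // Same length and same first character, so the wrappers are equal.
        System.out.println(new CachedEqualable<>("abc", selectors)
                .equals(new CachedEqualable<>("abd", selectors)));
    }
}
```

The trade-off is extra memory per wrapper in exchange for constant-time hashCode() and no repeated selector evaluation in equals().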

Answer 3

Holger wrote the right solution. If you want a simple way of doing it, there are two additional possibilities:

1. Collect to a `List` and call `hashCode()`

Stream<? extends Object> stream;
int hashCode = stream.collect(toList()).hashCode();

2. Use `Stream.iterator()`

Stream<? extends Object> stream;
Iterator<? extends Object> iter = stream.iterator();
int hashCode = 1;
while(iter.hasNext()) {
  hashCode = 31 *hashCode + Objects.hashCode(iter.next());
}

As a reminder, this is the algorithm that List.hashCode() uses:

int hashCode = 1;
for (E e : list)
  hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
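The iterator variant can be wrapped in a small reusable helper. This is a minimal sketch; the names `StreamHashing` and `hashCodeOf` are made up for illustration. It consumes the stream, and for an ordered stream the iterator yields the elements in encounter order, so the loop matches the reminder above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;

class StreamHashing {
    // Sequentially folds the stream's elements exactly like List.hashCode().
    static int hashCodeOf(Stream<?> stream) {
        int hashCode = 1;
        for (Iterator<?> it = stream.iterator(); it.hasNext(); )
            hashCode = 31 * hashCode + Objects.hashCode(it.next());
        return hashCode;
    }

    public static void main(String[] args) {
        List<Object> list = Arrays.asList(1, null, "x", 4, 5, 6);
        System.out.println(hashCodeOf(list.stream()) == list.hashCode());
    }
}
```

Unlike collecting to a List first, this avoids an intermediate collection, at the cost of repeating the hash algorithm in your own code.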

Answer 4

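The easiest and shortest way I found is to implement a Collector using Collectors.reducing. Note that the seed 1 is not an identity for this function, so, like the reduce call in the question, it only matches List.hashCode() when the stream is not split up: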

/**
 * Creates a new Collector that collects the hash code of the elements.
 * @param <T> the type of the input elements
 * @return the hash code
 * @see Arrays#hashCode(java.lang.Object[])
 * @see AbstractList#hashCode()
 */
public static <T> Collector<T, ?, Integer> toHashCode() {
    return Collectors.reducing(1, Objects::hashCode, (i, j) -> 31 *  i + j);
}

@Test
public void testHashCode() {
    List<?> list = Arrays.asList(Math.PI, 42, "stackoverflow.com");
    int expected = list.hashCode();
    int actual = list.stream().collect(StreamUtils.toHashCode());
    assertEquals(expected, actual);
}

How to compute the hash code for a stream in the same way as List.hashCode()
