Question

I have a typical pattern-searching problem: I need to identify where multiple patterns appear within an array and single them out.

Example input:

['horse', 'camel', 'horse', 'camel', 'tiger', 'horse', 'camel', 'horse', 'camel']

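That is, find the patterns that repeat within the array and can be pulled out as subarrays.

Put another way: find all subarrays that occur more than once in the main array.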

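That is, the result should contain ['horse', 'camel'], ['horse', 'camel', 'horse'], ['camel', 'horse', 'camel'], ['horse', 'camel', 'horse', 'camel'].

Resulting subarrays should have length > 1. For example, in [1, 2, 3, 1, 2, 1, 4, 5], both [1, 2, 3] and [1, 4, 5] are subarrays, but [1, 4, 5] occurs only once and so is not a recurring subarray.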

I'm looking for a suitably efficient algorithm rather than a brute-force looping solution.

Answer 1

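This may not be what you want, but I don't know what you've tried, so maybe it will be useful. It's my straightforward approach, which probably falls under your "brute-force looping solutions", but I wanted to give it a try since nobody had posted a full answer.

In Java: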

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// use this to avoid adding duplicates to the list
static boolean contains (List<String[]> patterns, String[] pattern){
    for(String[] s: patterns)
        if (Arrays.equals(pattern,s)) return true;
    return false;
}


/**
 *
 * @param str String array containing all elements in your set
 * @param start index of the first element of the subarray (inclusive)
 * @param end index of the last element of the subarray (inclusive)
 * @return true if the subarray occurs again somewhere after index end
 */
static boolean search (String[] str,int start,int end) {
    // length of pattern
    int len = end - start + 1;

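    // how many further occurrences are required
    // after the original one at [start..end]
    int n = 1;

    // incremented each time the pattern is matched
    int m = 0;

    // slide the pattern along the array, starting just past its end
    // (occurrences overlapping the original are not counted)
    for (int i = end+1; i <= str.length - len; i++) {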
        int j;
        for (j = 0; j < len; j++) {
            if (!str[i + j].equals(str[start + j]))
                break;
        }

        // if pattern is matched at [i to i+len]
        if (j == len) {
            m++;
            if (m == n) return true;
        }
    }
    return false;
}


/**
 *
 * @param str String array containing all elements in your set
 * @return a list of the subarrays of the input that recur later in it
 */
static List<String[]> g (String[] str) {
    // put patterns in here
    List<String[]> patterns = new ArrayList<>();

    // iterate through all possible subarrays in str
    for(int i = 0; i < str.length-1; i++){
        for(int j = i + 1; j < str.length; j++){

            // if a pattern is found
            if (search(str,i,j)) {
                int len = j-i+1;
                String[] subarray = new String[len];
                System.arraycopy(str,i,subarray,0,len);
                if (!contains(patterns,subarray))
                    patterns.add(subarray);

            }
        }
    }
    return patterns;
}

public static void main(String[] args) {

    String[] str = {"horse", "camel", "horse", "camel", "tiger",
                    "horse", "camel", "horse", "camel"};
    // print out
    List<String[]> patterns = g(str);
    for (String[] s: patterns)
        System.out.println(Arrays.toString(s));
}

Output:

[horse, camel]
[horse, camel, horse]
[horse, camel, horse, camel]
[camel, horse]
[camel, horse, camel]

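As I mentioned in the comment I posted:

"Would the output include [camel, horse]?"

My output includes it because there are two instances of [camel, horse], at indices [1-2] and [6-7]. But maybe I have completely misunderstood your question and don't understand the constraints.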
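The two instances can be checked with a small helper. This is only a sketch: the class name OccurrenceCheck and the method occurrences are names made up for illustration and are not part of the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OccurrenceCheck {
    // Hypothetical helper: returns every start index at which pattern
    // occurs in text (overlapping occurrences are included).
    public static List<Integer> occurrences(String[] text, String[] pattern) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + pattern.length <= text.length; i++) {
            // compare the window text[i .. i+pattern.length-1] with the pattern
            if (Arrays.equals(Arrays.copyOfRange(text, i, i + pattern.length), pattern)) {
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        String[] str = {"horse", "camel", "horse", "camel", "tiger",
                        "horse", "camel", "horse", "camel"};
        // prints [1, 6]
        System.out.println(occurrences(str, new String[]{"camel", "horse"}));
    }
}
```

It is a plain window scan, so it is no faster than search(...); it is only meant to confirm the indices.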

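As for optimization: the search(...) method, for example, is just a naive substring search, and there are more optimized ways to do that, such as Knuth–Morris–Pratt. Sorry if this is exactly what you didn't want, but maybe it's of some use.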
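Here is roughly what a Knuth–Morris–Pratt version of search(...) could look like. Treat it as a sketch, not a tuned implementation: the class name KmpSearch and the helper failureTable are my own names, and it assumes start <= end so the pattern is never empty. It keeps the same contract as the method above (only occurrences starting after end count).

```java
import java.util.Arrays;

public class KmpSearch {
    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i]
    public static int[] failureTable(String[] pattern) {
        int[] fail = new int[pattern.length];
        int k = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (k > 0 && !pattern[i].equals(pattern[k])) k = fail[k - 1];
            if (pattern[i].equals(pattern[k])) k++;
            fail[i] = k;
        }
        return fail;
    }

    // true if str[start..end] occurs again starting at some index > end
    // (assumes start <= end, i.e. a non-empty pattern)
    public static boolean search(String[] str, int start, int end) {
        String[] pattern = Arrays.copyOfRange(str, start, end + 1);
        int[] fail = failureTable(pattern);
        int k = 0; // number of pattern elements matched so far
        for (int i = end + 1; i < str.length; i++) {
            // on a mismatch, fall back in the pattern instead of rescanning str
            while (k > 0 && !str[i].equals(pattern[k])) k = fail[k - 1];
            if (str[i].equals(pattern[k])) k++;
            if (k == pattern.length) return true;
        }
        return false;
    }
}
```

Each call scans the rest of the array once, in time linear in the array length plus the pattern length. The two outer loops in g(...) still enumerate every subarray, so this speeds up only the inner search, not the overall approach.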
