JavaScript fuzzy search

Time: 2012-02-09 05:49:51

Tags: javascript fuzzy-search

I'm working on a filtering feature for a list of about 50-100 items. Each item has markup like this:

<li>
  <input type="checkbox" name="services[]" value="service_id" />
  <span class="name">Restaurant in NY</span>
  <span class="filters"><!-- hidden area -->
    <span class="city">@city: new york</span>
    <span class="region">@reg: ny</span>
    <span class="date">@start: 02/05/2012</span>
    <span class="price">@price: 100</span>
  </span>
</li>

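I created the markup this way because I initially used List.js.

So, as you have probably guessed, what I want is to run searches like this: @region: LA @price: 124 and so on. The catch is that I also want to display multiple matching items, so that more than one can be selected :)

I assume this calls for a fuzzy search, but the problem is that I couldn't find anything functional.

Any ideas or starting points?

// Edit: because I have a fairly small number of items, I would like a client-side solution.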

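7 answers:

Answer 0 (score: 26)

I was looking for "fuzzy search" in JavaScript but couldn't find a solution here, so I wrote my own function that does what I need.

The algorithm is simple: loop through the letters of the needle and check whether they occur in the haystack in the same order: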

String.prototype.fuzzy = function (s) {
    var hay = this.toLowerCase(), i = 0, n = -1, l;
    s = s.toLowerCase();
    for (; l = s[i++] ;) if (!~(n = hay.indexOf(l, n + 1))) return false;
    return true;
};

For example:

('a haystack with a needle').fuzzy('hay sucks');    // false
('a haystack with a needle').fuzzy('sack hand');    // true
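
A minimal sketch of how the same in-order check might be applied to the @field: value queries from the question, without extending String.prototype. The names fuzzyMatch, parseQuery and filterItems, and the plain-object shape of the items, are assumptions made for illustration; they are not part of this answer or of List.js.

```javascript
// Standalone version of the in-order letter check (hypothetical helper name).
function fuzzyMatch(haystack, needle) {
  var hay = haystack.toLowerCase();
  var pos = -1;
  needle = needle.toLowerCase();
  for (var i = 0; i < needle.length; i++) {
    pos = hay.indexOf(needle[i], pos + 1);
    if (pos === -1) return false;
  }
  return true;
}

// Split "@reg: ny @price: 100" into { reg: "ny", price: "100" }.
function parseQuery(query) {
  var filters = {};
  var re = /@(\w+):\s*([^@]*)/g;
  var m;
  while ((m = re.exec(query)) !== null) {
    filters[m[1]] = m[2].trim();
  }
  return filters;
}

// Keep the items whose fields fuzzily match every filter in the query.
function filterItems(items, query) {
  var filters = parseQuery(query);
  return items.filter(function (item) {
    return Object.keys(filters).every(function (key) {
      return key in item && fuzzyMatch(String(item[key]), filters[key]);
    });
  });
}

// Assumed data shape: one object per <li>, keyed like the hidden filter spans.
var items = [
  { name: 'Restaurant in NY', city: 'new york', reg: 'ny', price: '100' },
  { name: 'Cafe in LA', city: 'los angeles', reg: 'la', price: '124' }
];

// Logs only the name of the first item.
console.log(filterItems(items, '@reg: ny @price: 100').map(function (item) {
  return item.name;
}));
```

Because filterItems returns every item that passes all filters, it naturally yields more than one result when several items match.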

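Answer 1 (score: 6)

One year later, List.js got a nice plugin for fuzzy search that works pretty well.

Answer 2 (score: 2)

Another (simple) solution. It is case-insensitive and ignores the order of the letters.

It checks every letter of the search term. If the original string contains that letter, it counts up; if not, it counts down. Based on the ratio of matches to string length, it returns true or false.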

String.prototype.fuzzy = function(term, ratio) {
    var string = this.toLowerCase();
    var compare = term.toLowerCase();
    var matches = 0;
    if (string.indexOf(compare) > -1) return true; // covers basic partial matches
    for (var i = 0; i < compare.length; i++) {
        string.indexOf(compare[i]) > -1 ? matches += 1 : matches -=1;
    }
    return (matches/this.length >= ratio || term == "")
};

Examples:

("Test").fuzzy("st", 0.5) // returns true
("Test").fuzzy("tes", 0.8) // returns true: "tes" is a substring, so the ratio is never checked
("Test").fuzzy("set", 0.8) // returns false cause ratio is too low (0.75)
("Test").fuzzy("stet", 1) // returns true
("Test").fuzzy("zzzzzest", 0.75) // returns false cause too many alien characters ("z")
("Test").fuzzy("es", 1) // returns true cause partial match (despite ratio being only 0.5)
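
A small sketch of how the same letter counting could be used to rank several candidates instead of returning only true or false. The names fuzzyRatio and rank are hypothetical, and treating a substring hit as a ratio of 1 is an assumption carried over from the shortcut in the function above.

```javascript
// Hypothetical helper: same letter counting as above, but returns the ratio itself.
function fuzzyRatio(text, term) {
  var string = text.toLowerCase();
  var compare = term.toLowerCase();
  // An empty term or a plain substring hit counts as a full match (assumption).
  if (compare === '' || string.indexOf(compare) > -1) return 1;
  var matches = 0;
  for (var i = 0; i < compare.length; i++) {
    matches += string.indexOf(compare[i]) > -1 ? 1 : -1;
  }
  return matches / string.length;
}

// Sort candidates from the best ratio to the worst for a given term.
function rank(candidates, term) {
  return candidates.slice().sort(function (a, b) {
    return fuzzyRatio(b, term) - fuzzyRatio(a, term);
  });
}

// "Test" scores 0.75, "Toast" 0.2 and "Zebra" -0.2 for the term "set".
console.log(rank(['Toast', 'Test', 'Zebra'], 'set'));
```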

Answer 3 (score: 1)

I have a small function that searches for a string in an array (at least for me, it produces better results than Levenshtein):

function fuzzy(item,arr) {
  function oc(a) {
    var o = {}; for (var i=0; i<a.length; i++) o[a[i]] = ""; return o;
  }
  var test = [];
  for (var n=1; n<=item.length; n++)
    test.push(item.substr(0,n) + "*" + item.substr(n+1,item.length-n));
  var result = [];
  for (var r=0; r<test.length; r++) for (var i=0; i<arr.length; i++) {
    if (arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[0]) != -1)
    if (arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[1]) != -1)
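    // NOTE: "< 2" sits inside the indexOf() argument below, so once the two checks above
    // pass this condition is almost always true; the intended distance check is unclear.
    if (0 < arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[1]) 
          - arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[0] < 2 ) )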
    if (!(arr[i] in oc(result)))  result.push(arr[i]);
  }
  return result;
}
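
For comparison with the Levenshtein distance mentioned above, here is a minimal sketch of the classic dynamic-programming version. It is not part of this answer; the helper name closest and the idea of sorting the array by distance are assumptions for illustration.

```javascript
// Classic dynamic-programming Levenshtein distance, shown for comparison only.
function levenshtein(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a[i - 1] === b[k - 1] ? 0 : 1;
      curr[k] = Math.min(
        prev[k] + 1,        // deletion
        curr[k - 1] + 1,    // insertion
        prev[k - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: order the array from the closest to the farthest string.
function closest(item, arr) {
  var query = item.toLowerCase();
  return arr.slice().sort(function (x, y) {
    return levenshtein(query, x.toLowerCase()) - levenshtein(query, y.toLowerCase());
  });
}

console.log(levenshtein('kitten', 'sitting')); // 3
console.log(closest('aple', ['banana', 'apple', 'maple']));
```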

Answer 4 (score: 1)

I wrote my own. It is more of a proof of concept, as it has not been stress-tested at all.

Enjoy: JavaScript fuzzy search / fuzzy match http://unamatasanatarai.github.io/FuzzyMatch/test/index.html

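Answer 5 (score: 1)

I wasn't satisfied with list.js, so I created my own. This is probably not really fuzzy search, but I don't know what else to call it. I simply wanted it to match a query regardless of the order of the words in the query.

Consider the following scenario:

  • A collection of articles in memory
  • The order in which the query words appear doesn't matter (e.g. "hello world" vs "world hello")
  • The code should be easy to read

Here is an example: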

var articles = [{
  title: '2014 Javascript MVC Frameworks Comparison',
  author: 'Guybrush Treepwood'
}, {
  title: 'Javascript in the year 2014',
  author: 'Herman Toothrot'
},
{
  title: 'Javascript in the year 2013',
  author: 'Rapp Scallion'
}];

var fuzzy = function(items, key) {
  // Returns a method that you can use to create your own reusable fuzzy search.

  return function(query) {
    var words  = query.toLowerCase().split(' ');

    return items.filter(function(item) {
      var normalizedTerm = item[key].toLowerCase();

      return words.every(function(word) {
        return (normalizedTerm.indexOf(word) > -1);
      });
    });
  };
};


var searchByTitle = fuzzy(articles, 'title');

searchByTitle('javascript 2014') // returns the 1st and 2nd items

Well, I hope this helps someone out there.
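
A possible extension of the same idea, sketched under assumptions: the hypothetical fuzzyAcross below checks each query word against several fields at once and ignores repeated whitespace in the query. It is a variation for illustration, not part of the original code.

```javascript
// Hypothetical variant: match each query word against any of several fields.
function fuzzyAcross(items, keys) {
  return function (query) {
    // Split on any run of whitespace and drop empty words.
    var words = query.toLowerCase().split(/\s+/).filter(Boolean);

    return items.filter(function (item) {
      // Join the selected fields into one normalized string to search in.
      var text = keys.map(function (key) {
        return String(item[key]);
      }).join(' ').toLowerCase();

      return words.every(function (word) {
        return text.indexOf(word) > -1;
      });
    });
  };
}

var articles = [
  { title: '2014 Javascript MVC Frameworks Comparison', author: 'Guybrush Treepwood' },
  { title: 'Javascript in the year 2014', author: 'Herman Toothrot' },
  { title: 'Javascript in the year 2013', author: 'Rapp Scallion' }
];

var search = fuzzyAcross(articles, ['title', 'author']);

console.log(search('2014   herman').length); // 1 (only the second article)
```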

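Answer 6 (score: 0)

The solutions provided here return true/false and give no information about which part matched and which did not.

In some cases you may need that information, e.g. to make the parts of your input bold in the search results.

I've created my own solution in TypeScript (if you want to use it, I've published it here: https://github.com/pie6k/fuzzystring), with a demo here: https://pie6k.github.io/fuzzystring/

It works like this: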

fuzzyString('liolor', 'lorem ipsum dolor sit');

// returns
{
  parts: [
    { content: 'l', type: 'input' },
    { content: 'orem ', type: 'fuzzy' },
    { content: 'i', type: 'input' },
    { content: 'psum d', type: 'fuzzy' },
    { content: 'olor', type: 'input' },
    { content: ' sit', type: 'suggestion' },
  ],
  score: 0.82, // (6 + 11 * 0.7 + 4 * 0.9) / 21, rounded
}

Here is the full implementation (TypeScript):

type MatchRoleType = 'input' | 'fuzzy' | 'suggestion';

interface FuzzyMatchPart {
  content: string;
  type: MatchRoleType;
}

interface FuzzyMatchData {
  parts: FuzzyMatchPart[];
  score: number;
}

interface FuzzyMatchOptions {
  truncateTooLongInput?: boolean;
  isCaseSesitive?: boolean;
}

function calculateFuzzyMatchPartsScore(fuzzyMatchParts: FuzzyMatchPart[]) {
  const getRoleLength = (role: MatchRoleType) =>
    fuzzyMatchParts
      .filter((part) => part.type === role)
      .map((part) => part.content)
      .join('').length;

  const fullLength = fuzzyMatchParts.map((part) => part.content).join('')
    .length;
  const fuzzyLength = getRoleLength('fuzzy');
  const inputLength = getRoleLength('input');
  const suggestionLength = getRoleLength('suggestion');

  return (
    (inputLength + fuzzyLength * 0.7 + suggestionLength * 0.9) / fullLength
  );
}

function compareLetters(a: string, b: string, isCaseSensitive = false) {
  if (isCaseSensitive) {
    return a === b;
  }
  return a.toLowerCase() === b.toLowerCase();
}

function fuzzyString(
  input: string,
  stringToBeFound: string,
  { truncateTooLongInput, isCaseSesitive }: FuzzyMatchOptions = {},
): FuzzyMatchData | false {
  // validate the input first

  // if the input is longer than the string to find and we don't truncate it, it cannot match
  if (input.length > stringToBeFound.length && !truncateTooLongInput) {
    return false;
  }

  // if truncate is enabled - do it
  if (input.length > stringToBeFound.length && truncateTooLongInput) {
    input = input.substr(0, stringToBeFound.length);
  }

  // if the input equals the string to be found, no fuzzy matching is needed: return it as a full match
  if (input === stringToBeFound) {
    return {
      parts: [{ content: input, type: 'input' }],
      score: 1,
    };
  }

  const matchParts: FuzzyMatchPart[] = [];

  const remainingInputLetters = input.split('');

  // create letter buffers:
  // matching is done letter by letter, but consecutive matching (or non-matching) letters
  // should be grouped together into a single part of the match
  let ommitedLettersBuffer: string[] = [];
  let matchedLettersBuffer: string[] = [];

  // helper functions to clear the buffers and add them to match
  function addOmmitedLettersAsFuzzy() {
    if (ommitedLettersBuffer.length > 0) {
      matchParts.push({
        content: ommitedLettersBuffer.join(''),
        type: 'fuzzy',
      });
      ommitedLettersBuffer = [];
    }
  }

  function addMatchedLettersAsInput() {
    if (matchedLettersBuffer.length > 0) {
      matchParts.push({
        content: matchedLettersBuffer.join(''),
        type: 'input',
      });
      matchedLettersBuffer = [];
    }
  }

  for (let anotherStringToBeFoundLetter of stringToBeFound) {
    const inputLetterToMatch = remainingInputLetters[0];

    // no more input - finish fuzzy matching
    if (!inputLetterToMatch) {
      break;
    }

    const isMatching = compareLetters(
      anotherStringToBeFoundLetter,
      inputLetterToMatch,
      isCaseSesitive,
    );

    // if the input letter doesn't match, try it again against the next letter of the string
    if (!isMatching) {
      // add this letter to the buffer of omitted letters
      ommitedLettersBuffer.push(anotherStringToBeFoundLetter);
      // if the matched letters buffer holds anything, flush it, as the run of matching letters has ended
      addMatchedLettersAsInput();
      // move on to the next letter of the string to be found
      continue;
    }
    }

    // we have input letter matching!

    // remove it from remaining input letters
    remainingInputLetters.shift();

    // add it to matched letters buffer
    matchedLettersBuffer.push(anotherStringToBeFoundLetter);
    // in case we had something in ommited letters buffer - add it to the match now
    addOmmitedLettersAsFuzzy();

    // if there are no more letters in the input, flush the matched letters buffer now
    if (!remainingInputLetters.length) {
      addMatchedLettersAsInput();
    }
  }

  // if letters are still left in the input, not all of it was found in the string, so there is no match
  if (remainingInputLetters.length > 0) {
    return false;
  }

  // lets get entire matched part (from start to last letter of input)
  const matchedPart = matchParts.map((match) => match.content).join('');

  // get remaining part of string to be found
  const suggestionPart = stringToBeFound.replace(matchedPart, '');

  // if we have remaining part - add it as suggestion
  if (suggestionPart) {
    matchParts.push({ content: suggestionPart, type: 'suggestion' });
  }
  const score = calculateFuzzyMatchPartsScore(matchParts);

  return {
    score,
    parts: matchParts,
  };
}
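
A minimal sketch of how the returned parts might be turned into highlighted markup, which is the use case this answer describes. The helper names escapeHtml and renderMatch and the choice of the <b> tag are assumptions; only the shape of the parts array is taken from the example result above.

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  var entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

// Wrap the typed ('input') parts in <b> and leave the other parts as plain text.
// A failed match (false) renders as an empty string.
function renderMatch(match) {
  if (!match) return '';
  return match.parts.map(function (part) {
    var text = escapeHtml(part.content);
    return part.type === 'input' ? '<b>' + text + '</b>' : text;
  }).join('');
}

// Parts copied from the example result for fuzzyString('liolor', 'lorem ipsum dolor sit').
var sample = {
  parts: [
    { content: 'l', type: 'input' },
    { content: 'orem ', type: 'fuzzy' },
    { content: 'i', type: 'input' },
    { content: 'psum d', type: 'fuzzy' },
    { content: 'olor', type: 'input' },
    { content: ' sit', type: 'suggestion' }
  ]
};

console.log(renderMatch(sample));
// <b>l</b>orem <b>i</b>psum d<b>olor</b> sit
```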