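Sorting an array - non-English letters + two-character letters in PHP

Date: 2016-10-30 14:58:41

Tags: php arrays string sorting

I want to sort an array of words alphabetically. Unfortunately, my language (Croatian) has two-character letters (e.g. lj, nj, dž) as well as letters that PHP's sort function does not order correctly (e.g. č, ć, ž, š, đ).

Here is the Croatian alphabet in the correct order (plus a few English letters):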

$alphabet = array(
            'a', 'b', 'c',
            'č', 'ć', 'd',
            'dž', 'đ', 'e',
            'f', 'g', 'h',
            'i', 'j', 'k',
            'l', 'lj', 'm',
            'n', 'nj', 'o',
            'p', 'q', 'r',
            's', 'š', 't',
            'u', 'v', 'w',
            'x', 'y', 'z', 'ž'
          );

Here is a list of words, also in the correct order:

$words = array(
            'alfa', 'beta', 'car', 'čvarci', 'ćup', 'drvo', 'džem', 'đak', 'endem', 'fićo', 'grah', 'hrana', 'idealan', 'jabuka', 'koza', 'lijep', 'ljestve', 'mango',
            'nebo', 'njezin', 'obrva', 'pivnica', 'qwerty', 'riba', 'sir', 'šaran', 'tikva', 'umanjenica', 'večera', 'wind', 'x-ray', 'yellow', 'zakaj', 'žena'
          );

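I have been thinking about ways to sort it. One approach is to split each word into letters. Because of the multi-character letters I did not know how to do that, so I asked a question and got a good answer that solves it (see here). I now loop through the array and split each word into letters using the code from the best answer. When the loop finishes, I have a new array (let's call it $words_splitted). Its elements are also arrays, each representing one word.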

Array
(
    [0] => Array
        (
            [0] => a
            [1] => l
            [2] => f
            [3] => a
        )

    [1] => Array
        (
            [0] => b
            [1] => e
            [2] => t
            [3] => a
        )

    [2] => Array
        (
            [0] => c
            [1] => a
            [2] => r
        )...
 ...[16] => Array
        (
            [0] => lj
            [1] => e
            [2] => s
            [3] => t
            [4] => v
            [5] => e
        )

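The idea is to compare the letters of each array by their index in the $alphabet variable. For example, $words_splitted[0][0] would be compared with $words_splitted[1][0], then with $words_splitted[2][0], and so on. If we compare the letters 'a' and 'b', the letter 'a' has the smaller index in $alphabet, so it comes before 'b'.

Unfortunately, I'm stuck... I don't know how to do this. Any ideas?

Note: PHP extensions should not be used.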

2 answers:

Answer 0: (score: 0)

Here is a class that can help you sort an array of strings according to a specific alphabet table:

<?php

/**
 * This class can be used to compare unicode strings.
 * It can be used for easy array sorting.
 * 
 * You can set your own alphabet characters table to be used.
 */
class UnicodeStringComperator {
    private $alphabet = [];

    public function __construct() {
        // We set the default alphabet characters table to a-z.
        $this->alphabet = range('a', 'z');
    }

    /**
     * Set the characters table to use for sorting
     * 
     * @param array $alphabet The characters table for the sorting
     */
    public function setAlphabet($alphabet) {
        $this->alphabet = $alphabet;
    }

    /**
     * Split the string into an array of the characters
     * 
     * @param string $str The string to split
     * @return array The array of the characters in the string
     */
    public function splitter($str){
        return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    }

    /**
     * Find the place of the char in the alphabet table
     * 
     * @param string $chr The character to find
     * @return mixed the place of the char in the table or FALSE if not found
     */
    public function place($chr) {
        return array_search($chr, $this->alphabet);
    }

    /**
     * Do the comparison between the 2 strings
     * 
     * @param string $str1 The first
     * @param string $str2 The second
     * @return int The values -1, 0, 1 if $str1 < $str2, $str1 == $str2 or $str1 > $str2 accordingly
     */
    public function compare($str1, $str2) {
        $chars1 = $this->splitter($str1);
        $chars2 = $this->splitter($str2);
        for ($i = 0; $i < count($chars1) && $i < count($chars2); $i++) {
            $p1 = $this->place($chars1[$i]);
            $p2 = $this->place($chars2[$i]);
            if ($p1 < $p2) {
                return -1;
            } elseif ($p1 > $p2) {
                return 1;
            }
        }
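        // All shared positions are equal, so the shorter string sorts first.
        if (count($chars1) < count($chars2)) {
            return -1;
        } elseif (count($chars1) > count($chars2)) {
            return 1;
        }
        return 0;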
    }

    /**
     * Sort an array of strings based on the alphabet table
     * 
     * @param Array $ar The array of strings to sort
     * @return Array The sorted array.
     */
    public function sort_array($ar) {
        // compare() is an instance method, so pass $this rather than 'self'.
        usort($ar, array($this, 'compare'));
        return $ar;
    }
}
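
Note that splitter() above splits by Unicode character, so 'lj', 'nj' and 'dž' each come out as two letters. That happens to work for the sample list, but it would place 'ljeto' before 'lopta'. Below is a minimal sketch of a splitter that tries multi-character letters first. The function name splitLetters is hypothetical and not part of the class, and it assumes lowercase input and PHP 7 or later. It also cannot recognise the rare words in which the two characters are really separate letters.

```php
<?php
// Hypothetical helper, not part of the class above: split a word into
// alphabet letters, preferring multi-character letters such as 'lj' or 'dž'.
function splitLetters(string $word, array $alphabet): array
{
    // Longest letters first (by byte length), so 'lj' is tried before 'l'.
    usort($alphabet, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $alternatives = array_map(function ($letter) {
        return preg_quote($letter, '/');
    }, $alphabet);
    // The trailing '.' catches any character that is not in the alphabet.
    $pattern = '/' . implode('|', $alternatives) . '|./su';
    preg_match_all($pattern, $word, $matches);
    return $matches[0];
}

$alphabet = array(
    'a', 'b', 'c', 'č', 'ć', 'd', 'dž', 'đ', 'e', 'f', 'g', 'h',
    'i', 'j', 'k', 'l', 'lj', 'm', 'n', 'nj', 'o', 'p', 'q', 'r',
    's', 'š', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ž'
);

$letters = splitLetters('ljestve', $alphabet); // 'lj', 'e', 's', 't', 'v', 'e'
```

If you want the class to use this, one option is to have splitter() return the result of such a function called with $this->alphabet.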

To use it with your specific alphabet, you can configure your own character ordering table with the setAlphabet function:

<?php
$alphabet = array(
            'a', 'b', 'c',
            'č', 'ć', 'd',
            'dž', 'đ', 'e',
            'f', 'g', 'h',
            'i', 'j', 'k',
            'l', 'lj', 'm',
            'n', 'nj', 'o',
            'p', 'q', 'r',
            's', 'š', 't',
            'u', 'v', 'w',
            'x', 'y', 'z', 'ž'
    );
$comperator = new UnicodeStringComperator();
$comperator->setAlphabet($alphabet);
$sorted_words = $comperator->sort_array($words);
var_dump($sorted_words);

The output is the same as the original array, which was already in the correct order:

array(34) {
  [0] =>
  string(4) "alfa"
  [1] =>
  string(4) "beta"
  [2] =>
  string(3) "car"
  [3] =>
  string(7) "čvarci"
  [4] =>
  string(4) "ćup"
  [5] =>
  string(4) "drvo"
  [6] =>
  string(5) "džem"
  [7] =>
  string(4) "đak"
  [8] =>
  string(5) "endem"
  [9] =>
  string(5) "fićo"
  [10] =>
  string(4) "grah"
  [11] =>
  string(5) "hrana"
  [12] =>
  string(7) "idealan"
  [13] =>
  string(6) "jabuka"
  [14] =>
  string(4) "koza"
  [15] =>
  string(5) "lijep"
  [16] =>
  string(7) "ljestve"
  [17] =>
  string(5) "mango"
  [18] =>
  string(4) "nebo"
  [19] =>
  string(6) "njezin"
  [20] =>
  string(5) "obrva"
  [21] =>
  string(7) "pivnica"
  [22] =>
  string(6) "qwerty"
  [23] =>
  string(4) "riba"
  [24] =>
  string(3) "sir"
  [25] =>
  string(6) "šaran"
  [26] =>
  string(5) "tikva"
  [27] =>
  string(10) "umanjenica"
  [28] =>
  string(7) "večera"
  [29] =>
  string(4) "wind"
  [30] =>
  string(5) "x-ray"
  [31] =>
  string(6) "yellow"
  [32] =>
  string(5) "zakaj"
  [33] =>
  string(5) "žena"
}
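
The comparison described in the question, where words are already split into letters and compared by their index in $alphabet, can also be written as a plain function. This is a minimal sketch, not a definitive implementation: the name compareLetters is hypothetical, PHP 7 or later is assumed, and letters missing from the alphabet are assumed to sort before all known letters.

```php
<?php
// Hypothetical helper: compare two words that are already split into letters.
// $rank maps each letter to its position in the alphabet.
function compareLetters(array $a, array $b, array $rank): int
{
    $n = min(count($a), count($b));
    for ($i = 0; $i < $n; $i++) {
        // Assumption: letters not in the alphabet get rank -1.
        $ra = $rank[$a[$i]] ?? -1;
        $rb = $rank[$b[$i]] ?? -1;
        if ($ra !== $rb) {
            return $ra <=> $rb;
        }
    }
    // All shared positions are equal: the shorter word comes first.
    return count($a) <=> count($b);
}

$alphabet = array(
    'a', 'b', 'c', 'č', 'ć', 'd', 'dž', 'đ', 'e', 'f', 'g', 'h',
    'i', 'j', 'k', 'l', 'lj', 'm', 'n', 'nj', 'o', 'p', 'q', 'r',
    's', 'š', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ž'
);
$rank = array_flip($alphabet); // letter => position

// Three words, already split into letters ('ljeto', 'lopta', 'lijep').
$split = array(
    array('lj', 'e', 't', 'o'),
    array('l', 'o', 'p', 't', 'a'),
    array('l', 'i', 'j', 'e', 'p'),
);
usort($split, function ($x, $y) use ($rank) {
    return compareLetters($x, $y, $rank);
});
$sorted = array_map(function ($letters) {
    return implode('', $letters);
}, $split);
// $sorted is now 'lijep', 'lopta', 'ljeto'
```

Using array_flip gives a direct letter-to-position lookup, which avoids calling array_search for every letter.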

Answer 1: (score: 0)

You can try Collator.

$words = array( 'alfa', 'beta', 'car', 'čvarci', 'ćup', 'drvo', 'džem', 'đak', 'endem', 'fićo', 'grah', 'hrana', 'idealan', 'jabuka', 'koza', 'lijep', 'ljestve', 'mango', 'nebo', 'njezin', 'obrva', 'pivnica', 'qwerty', 'riba', 'sir', 'šaran', 'tikva', 'umanjenica', 'večera', 'wind', 'x-ray', 'yellow', 'zakaj', 'žena' );
$collator = new Collator('hr_HR');
// or $collator = new Collator( 'hr' );
$collator->sort($words);
print_r($words);

I'm not sure what the locale code for Croatian is; you should check there. The code is based on an answer to a similar question there.
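
Keep in mind that Collator is provided by PHP's intl extension, which the question asks to avoid, so it may not be available on every installation. Below is a minimal sketch that uses it when present and otherwise falls back to the plain sort(). The function name sortCroatian is hypothetical, and the fallback does not order Croatian letters correctly.

```php
<?php
// Hypothetical wrapper: locale-aware sort when the intl extension is loaded.
function sortCroatian(array $words): array
{
    if (class_exists('Collator')) {
        $collator = new Collator('hr_HR');
        $collator->sort($words); // sorts the array in place
        return $words;
    }
    // Fallback without intl: byte-wise order, which misplaces č, ć, đ, š, ž.
    sort($words);
    return $words;
}

$result = sortCroatian(array('beta', 'alfa'));
print_r($result);
```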