SQL获取以%和计数表示的姓氏和名字字母行的出现

时间:2018-12-08 19:47:59

标签: mysql sql

我有一个数据库表,其中包含first_namelast_name列。我需要获取有关每个字母在first/last name中存在的次数以及百分比的统计信息。我对sql非常了解,有人可以帮助我实现这一目标吗?

例如,如果只有两列,并且名称分别为“ John Smith”和“ John Doe”,那么我将需要这样的内容:

statistics example

1 个答案:

答案 0 :(得分:0)

您只需跟踪总字符数和单个字符数。我使用ord()chr()ASCII和字符之间进行转换。

<?php
    //The position of the letter 'A' in the ASCII table
    // (start of the alphabet)
    $code_A = ord('A'); //65

    //Total number of character (all names)
    $totalCharacters = 0;
    //An array containing a number (count) for each number of the alphabet
    //[0 => 0, 1 => 0, 2 => 0, ..., 25 => 0]
    // where 0 represents A, 1 - B, 2 - C, etc.
    $characterCounts = []; //Here it is an empty array
    for($i = 0; $i < 26; $i++) //Here we loop from $i = 0 to $i = 25
        $characterCounts[$i] = 0; //An set the count of the current letter to 0
    //Prepared statement (A statement which has no arguments yet)
    //Make all characters uppercase (a becomes A, etc.)
    // so they have the same ASCII position
    // Now they are called "UPPER(first_name)" and "UPPER(last_name)"
    // so we rename them back to "first_name" and "last_name" using AS
    $pstmt = $mysqli->prepare('SELECT UPPER(first_name) AS first_name, UPPER(last_name) AS last_name FROM users');
    //We dont need to bind, since we dont have any arguments
    $pstmt->execute(); //Now we excecute the prepared statement
    $result = $pstmt->get_result(); //We get the result set...
    $rows = $result->fetch_all(MYSQLI_ASSOC); //...and convert it to an array (not necessary in this case)
    foreach($rows as $row) { //Loop over all rows
        //Get the length of first and last name...
        $firstNameLength = strlen($row['first_name']);
        $lastNameLength = strlen($row['last_name']);
        //...and add the number of characters to the total character count
        $totalCharacters += $firstNameLength + $lastNameLength;
        //Since PHP Strings work like arrays of characters, we can use a 
        // simple for loop to access their charaters (foreach doesnt work tho)
        for($i = 0; $i < $firstNameLength; $i++) {
            //Get the column 'first_name' of the row
            // get charater at position $i of the first name
            // get the ASCII position of that character
            // substract the start of the alphabet
            // to get a nice index starting at 0
            // aka the position in the alphabet
            $index = ord($row['first_name'][$i]) - $code_A;
            //If the index exists aka. is between 0 and 25
            // aka is alphabetical (A-Z)
            if(isset($characterCounts[$index]))
                ++$characterCounts[$index]; //Increse the counter for the charater by 1
        }
        //Do the same for the last name
        for($i = 0; $i < $lastNameLength; $i++) {
            $index = ord($row['last_name'][$i]) - $code_A;
            if(isset($characterCounts[$index]))
                ++$characterCounts[$index];
        }
    }
    //Output the data we gathered
    //Loop over each letter in our array / of the alphabet
    // to get its index in the array (and therefore the alphabet)
    // and also its value - the number of occurances
    foreach($characterCounts as $index => $characterCount) {
        //If it was found at least once...
        if($characterCount > 0) {
            //Output the character in at the position in the ASCII table
            // of the start of the alphabet plus its index in the alphabet
            //Add a space
            //Add the number of occurances
            //Add a space
            //Add the ratio of this character in all names times 100 to get the percentage
            //Add a percent sign
            echo(chr($code_A + $index).' '.$characterCount.' '.(100 * $characterCount / $totalCharacters).'%');
            echo('<br />');
        }
    }
?>

希望这会有所帮助。 -思维