Finding hashtags and tagged users in a string

Date: 2018-03-29 02:43:35

Tags: php

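I'm working on a project and trying to add a feature that detects hashtags and tagged users.

The problem is that I don't know how to make it stop reading when it reaches a symbol or emoji (other than an underscore), and how to keep the length from exceeding 20 characters.

For hashtags: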

  

#HelloWorld -> helloworld, #Hello_W0rld -> hello_w0rld, #Hello(World -> hello

The same applies to tagged users (only A-Z, a-z, 0-9 and _ are allowed):

  

@HelloWorld -> helloworld, @Hello_W0rl.d -> hello_w0rl

The code I tried is below (it is basically the same for users and hashtags):

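$words = explode(" ", $body);
$users = ""; // initialise before appending

foreach ($words as $word) {
    if (substr($word, 0, 1) == "@") {
        // Look up the tagged user by the name that follows the "@"
        $tagged_user = DB::query('SELECT id FROM users WHERE username=:username', array(':username' => ltrim($word, '@')))[0];
        // Concatenate with "." (a "," here is a syntax error) and append the id column
        $users .= $tagged_user['id'] . ",";
    }
}

$users = rtrim($users, ',');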

I would also like to know how to avoid saving something like #% as an empty tag.

Edit: I updated it to the following. Is this correct?

$postid = "test_id";
        $matches = [];
        preg_replace_callback("/#([a-z_0-9]+)/i", function($res) use(&$matches) {
            $matches[] = strtolower($res[1]);
        }, $body);

        $matches2 = [];

        $tagholder = array_fill(0, count($matches), "?");
        $tagholderString = implode(", ", $tagholder);

        foreach($matches as $tagstring){
            if(DB::query('SELECT * FROM tags WHERE tag=:tag', array(':tag' => $tagstring))){
                $tag = DB::query('SELECT * FROM tags WHERE tag=:tag', array(':tag' => $tagstring))[0];
                DB::query ( "INSERT INTO post_tags VALUES(:tagid, :postid)", array (':tagid' => $tag['id'], ':postid' => $postid) );
            }else{
                $id = hash('sha256', $tagstring);
                DB::query ( "INSERT INTO tags VALUES(:id, :tag, :mode)", array (':id' => $id, ':tag' => $tagstring, ':mode' => 0) );
                DB::query ( "INSERT INTO post_tags VALUES(:tagid, :postid)", array (':tagid' => $id, ':postid' => $postid) );
            }
        }

        preg_replace_callback("/@([a-z_0-9]+)/i", function($res) use(&$matches2) {
            $matches2[] = strtolower($res[1]);
        }, $body);

        $userholder = array_fill(0, count($matches2), "?");
        $userholderString = implode(", ", $userholder);
        $user_query = DB::query("SELECT * FROM users WHERE username IN (".$userholderString.")", $matches2);

        $users_result = "";
        foreach($user_query as $result){
            $users_result .= $result['id'].",";
        }
        $users_result = rtrim($users_result, ',');

        //User string result
        $users_result;

2 answers:

Answer 0 (score: 1)

You can use preg_replace_callback() to pass each match through strtolower(). You need one pattern per requirement. For hashtags:

/#([a-z_0-9]+)/i


For tagged users:

/@([a-z_0-9]+)/i


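Each pattern requires the match to start with # or @, followed by one or more letters, digits, or underscores, matched case-insensitively.

The resulting code looks like this: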

$matches = [];
$string = "#HelloWorld -> helloworld, #Hello_W0rld -> hello_w0rld, #Hello(World -> hello,";

preg_replace_callback("/#([a-z_0-9]+)/i", function($res) use(&$matches) {
    $matches[] = strtolower($res[1]);
}, $string);

var_dump($matches);

$matches2 = [];
$string2 = "@HelloWorld -> helloworld, @Hello_W0rl.d -> hello_w0rl,";

preg_replace_callback("/@([a-z_0-9]+)/i", function($res) use(&$matches2) {
    $matches2[] = strtolower($res[1]);
}, $string2);

var_dump($matches2);


Result:

  

array (size=3)
  0 => string 'helloworld' (length=10)
  1 => string 'hello_w0rld' (length=11)
  2 => string 'hello' (length=5)

array (size=2)
  0 => string 'helloworld' (length=10)
  1 => string 'hello_w0rl' (length=10)
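
The question also asks for a 20-character limit, which the two patterns above do not enforce. A minimal sketch of one way to add it, assuming that a longer name should be truncated to its first 20 characters rather than rejected (the question does not say which). The variable names and the sample text are illustrative only:

```php
<?php
// Sample input; the last tag is 26 characters long.
$body = '#HelloWorld, #Hello_W0rld, #Hello(World, #%, #abcdefghijklmnopqrstuvwxyz';

// {1,20} matches at least 1 and at most 20 allowed characters,
// so "#%" yields no match and long names are cut after 20 characters.
preg_match_all('/#([a-z0-9_]{1,20})/i', $body, $found);

// $found[1] holds the captured names without the leading "#".
$hashtags = array_map('strtolower', $found[1]);

var_dump($hashtags);
// helloworld, hello_w0rld, hello, abcdefghijklmnopqrst
```

Because the quantifier needs at least one allowed character, a bare "#%" never produces an empty tag. To reject over-long names instead of truncating them, the pattern would need an additional check on the character that follows the match.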

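As a side note, you should not run a query for every tag you find. That will quickly get out of hand and can seriously hurt your database performance. Since you already have all the tags in an array, run a single query with a WHERE IN clause, like this: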

$placeholders = array_fill(0, count($matches), "?"); // get a ? for each match
$placeholdersString = implode(", ", $placeholders); // make it a string
DB::query("SELECT id FROM users WHERE username IN (".$placeholdersString.")", $matches); // bind each value
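
One case the snippet above does not cover is an empty match list: the placeholder string is then empty, and an empty IN () is a syntax error in MySQL, among others. A minimal sketch of a guard, using a hypothetical helper named buildUserLookup() that only builds the SQL and the parameter list. How the asker's DB::query() binds positional parameters is an assumption, since that class is not shown:

```php
<?php
// Hypothetical helper: build one "WHERE ... IN (...)" lookup for a list of usernames.
// Returns [sql, params], or null when there is nothing to look up.
function buildUserLookup(array $usernames): ?array
{
    // Drop duplicates and reindex so the parameters are a plain list.
    $usernames = array_values(array_unique($usernames));

    if ($usernames === []) {
        return null; // nothing matched, so skip the query entirely
    }

    // One "?" per username, joined as "?, ?, ?".
    $placeholders = implode(', ', array_fill(0, count($usernames), '?'));

    return ["SELECT id FROM users WHERE username IN ($placeholders)", $usernames];
}

$lookup = buildUserLookup(['helloworld', 'hello_w0rl', 'helloworld']);

if ($lookup !== null) {
    [$sql, $params] = $lookup;
    // Hand $sql and $params to the database layer here, e.g. DB::query($sql, $params).
    echo $sql, PHP_EOL; // SELECT id FROM users WHERE username IN (?, ?)
}
```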

Answer 1 (score: 0)

<?php

function hashtag($in) {
    // $found[1] holds the captured names without the leading "#"
    preg_match_all('/#(\w+)/', $in, $found);

    // Always an array, even when nothing matches
    return $found[1];
}

function username($in) {
    // $found[1] holds the captured names without the leading "@"
    preg_match_all('/@(\w+)/', $in, $found);

    // Always an array, even when nothing matches
    return $found[1];
}

$string = "#hash1 #hash2 @user1 @user2 #hash3";
var_dump(hashtag($string));
var_dump(username($string));
?>

Two functions I just wrote; I hope they help. They use regular expressions to extract hashtags and usernames.
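
The two functions differ only in the prefix character, and they return names in their original case with any duplicates kept. A possible single-function variant is sketched below. The name extractNames() is hypothetical, and both the lowercasing (taken from the question's examples) and the lookbehind that skips a prefix glued to a preceding word are design assumptions, not requirements stated in the thread:

```php
<?php
// Hypothetical helper, not part of either answer: one function for both prefixes.
function extractNames(string $text, string $prefix): array
{
    // (?<![a-z0-9_]) skips a prefix that directly follows a word character,
    // e.g. the "@" inside an email address.
    $pattern = '/(?<![a-z0-9_])' . preg_quote($prefix, '/') . '([a-z0-9_]+)/i';
    preg_match_all($pattern, $text, $found);

    // Lowercase first so that "Alice" and "alice" collapse into one entry.
    return array_values(array_unique(array_map('strtolower', $found[1])));
}

$text = 'Mail bob@example.com, then ping @Alice and @alice about #PHP.';

var_dump(extractNames($text, '@')); // one entry: 'alice'
var_dump(extractNames($text, '#')); // one entry: 'php'
```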