Question

$cont=htmlspecialchars(file_get_contents("https://myanimelist.net/anime/30276/One_Punch_Man"));
function getBetween($string, $start = "", $end = ""){
    if (strpos($string, $start)) { // required if $start not exist in $string
        $startCharCount = strpos($string, $start) + strlen($start);
        $firstSubStr = substr($string, $startCharCount, strlen($string));
        $endCharCount = strpos($firstSubStr, $end);
        if ($endCharCount == 0) {
            $endCharCount = strlen($firstSubStr);
        }
        return substr($firstSubStr, 0, $endCharCount);
    } else {
        return '';
    }
}
$name=getBetween($cont,'title',' - MyAnimeList.net');
//$name=preg_replace('/[^a-zA-Z0-9 \p{L}]/m', '', $name);
preg_replace('/(*UTF8)[\>\<]/m', '', $name);
trim($name," ");
//$name=str_replace("gt", "", $name);
echo $name;

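I want to get the text between the title tags. How can I do this? For example, the title of this page is "One Punch Man - MyAnimeList.net" and I want to get "One Punch Man".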

Answer 1

Just use the string replace function:

$string = '<BoomBox>';
$string = str_replace('<', '', $string);
$string = str_replace('>', '', $string);
echo $string; // output: BoomBox

http://php.net/manual/en/function.str-replace.php

Answer 2

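You have edited your question, and now we can see that you are dealing with XML/HTML. It is always better to use the DOM classes for that. Never use regular expressions! There is a famous Stack Overflow post explaining why you should never parse HTML with regular expressions. Try the following solution: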

<?php

$dom = new DOMDocument();
$dom->loadHTML('<title>BoomBox</title>');
echo $dom->getElementsByTagName('title')->item(0)->textContent;

http://php.net/manual/en/class.domdocument.php

http://php.net/manual/en/class.domnode.php

See it in action here: https://3v4l.org/EjPQd
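Applied to the page from the question, the same idea could look roughly like the sketch below. It assumes the raw HTML is passed to the parser (not run through htmlspecialchars first, since that turns <title> into &lt;title&gt; and leaves nothing for the DOM to find), and it uses a hard-coded $html string as a hypothetical stand-in for the downloaded page. The $suffix value is taken from the question.

```php
<?php
// Hypothetical stand-in for the downloaded page; the real markup may differ.
$html = '<html><head><title>One Punch Man - MyAnimeList.net</title></head><body></body></html>';

// Real-world pages often contain markup that libxml warns about;
// collect those warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// item(0) is null when the page has no <title> element.
$titleNode = $dom->getElementsByTagName('title')->item(0);
$title = $titleNode !== null ? trim($titleNode->textContent) : '';

// Assumed site suffix (from the question); remove it only if it really is at the end.
$suffix = ' - MyAnimeList.net';
if (substr($title, -strlen($suffix)) === $suffix) {
    $title = substr($title, 0, -strlen($suffix));
}

echo $title, PHP_EOL; // One Punch Man
```

To run it against the live page, $html would come from file_get_contents() on the URL, without the htmlspecialchars() wrapper.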

Answer 3

You can use preg_replace() or strip_tags().

Example with preg_replace():

$str = '> One Punch Man';

$new = preg_replace('/[^a-zA-Z0-9 \p{L}]/m', '', $str);
echo $new;

Output: One Punch Man

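The example above keeps only a-z, A-Z, 0-9, spaces, and letters matched by \p{L}; everything else is removed. You can extend the character class as needed.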
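As one possible way to extend it, the sketch below also keeps colons and hyphens and adds the u modifier so that the pattern and subject are treated as UTF-8. The input string is a made-up example, not something from the page in the question.

```php
<?php
// Hypothetical input: a leftover bracket followed by a title with an accented letter.
$str = '> Pokémon: The First Movie';

// Keep letters (\p{L}), digits (\p{N}), spaces, colons and hyphens; drop everything else.
// The trailing "-" inside the class is a literal hyphen; "u" enables UTF-8 mode.
$clean = preg_replace('/[^\p{L}\p{N} :-]/u', '', $str);

// preg_replace() and trim() return new strings, so the results must be assigned.
$clean = trim($clean);

echo $clean, PHP_EOL; // Pokémon: The First Movie
```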

Example with strip_tags():

$str = '<title> BoomBox </title>';

$another = strip_tags($str);
echo $another;

Output: BoomBox

Documentation:

http://php.net/manual/en/function.preg-replace.php // preg_replace()
http://php.net/manual/en/function.strip-tags.php // strip_tags()

Answer 4

You can also pass ['<', '>'] as the search argument and make a single call to str_replace:

$string = '<BoomBox>';
echo str_replace(['<', '>'], '', $string) . PHP_EOL;
// => BoomBox

Alternatively, you can use a regular expression with preg_replace (especially if you plan to add further restrictions for context-dependent matching):

echo preg_replace('~[<>]~', '', $string);
// => BoomBox

See the PHP demo.
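To illustrate what a context-dependent restriction might look like, here is a small sketch that removes a bracket only when it is the first or last character of the string and leaves brackets in the middle alone. The input value is a hypothetical example.

```php
<?php
// Hypothetical input with brackets both around and inside the text.
$string = '<a < b>';

// "^<" matches a "<" only at the very start, ">$" matches a ">" only at the very end.
$result = preg_replace('~^<|>$~', '', $string);

echo $result, PHP_EOL; // a < b
```

Note that the return value is assigned to $result; calling preg_replace() without using its result, as in the question, leaves the original string unchanged.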

How to remove "<>" brackets from a string in PHP?

4 answers: