Question

I use file_get_contents() to pull data from another website.

This is part of the source code:

<font style="font-size:10px;color:#123333;font-weight:BOLD;">1,22 €</font>

I use the following split_on_title code to pull 1,22 € out of the string:

$split_on_title = preg_split("<font style=\"font-size:10px;color:#123333;font-weight:BOLD;\">", $source);
$split_on_endtitle = preg_split("</font>", $split_on_title[1]);
$title = $split_on_endtitle[0];

When I echo $title, Firefox shows:

>1,22 â‚¬<

I then used preg_replace on the string:

preg_replace('> â‚¬<', '', $title);

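Then PHP shows this error: Warning: preg_replace(): No ending delimiter '>' found in ....

How can I extract the clean value 1,22 €? Or at least just 1,22. Thanks in advance.

Edit:

The data I gave is hard to understand, so here is a larger sample: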

<tr>
    <td width="80" align="left" valign="top">
        <b> Price:</b>
    </td>
    <td align="left"  valign="top">
        <font style="font-size:10px;color:#123333;font-weight:BOLD;">1,22 €</font>
    </td>
</tr>

I need help getting 1,22 € out of this source.

Answer 1

Add the required UTF-8 declaration to the <head> section of your HTML page:

<meta charset="UTF-8" />

It is missing, which is why the euro sign is not rendered correctly.

More details on how to write this and other meta tags: http://www.w3schools.com/tags/tag_meta.asp
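
For context, â‚¬ is what the three UTF-8 bytes of € look like when the browser decodes them as Windows-1252, so the fix is to tell the browser the page is UTF-8. A minimal sketch of that, assuming the extracted string is already valid UTF-8; the function name renderPricePage is made up for illustration:

```php
<?php
// Hypothetical helper: wraps a UTF-8 string in a page that declares its charset.
function renderPricePage(string $price): string
{
    // Escape the value for HTML; the third argument names the input encoding.
    $safe = htmlspecialchars($price, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n<html>\n<head>\n"
        . "<meta charset=\"UTF-8\" />\n"
        . "</head>\n<body>" . $safe . "</body>\n</html>\n";
}

// An HTTP Content-Type header, when present, takes precedence over the meta tag,
// so it is worth sending a matching one before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$page = renderPricePage("1,22 €");
echo $page;
```

If the remote site serves its page in another encoding, the string would need converting to UTF-8 first; that case is not covered here.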

Answer 2

Why not use preg_match and grab everything between the font tags?

$re = "/<font.*>(.*)<\\/font>/i"; 
$str = "<font style=\"font-size:10px;color:#123333;font-weight:BOLD;\">1,22 €</font>"; 

preg_match($re, $str, $matches);
echo $matches[1];

Here is how the pattern breaks down.

<font matches the characters <font literally (case insensitive)
.* matches any character (except newline)
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
> matches the characters > literally
1st Capturing group (.*)
.* matches any character (except newline)
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
< matches the characters < literally
\/ matches the character / literally
font> matches the characters font> literally (case insensitive)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
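
Because both .* parts in that pattern are greedy, it can capture too much when the input contains more than one tag on a line. A hedged variant for the larger sample from the question: [^>]* stops at the end of the opening tag, (.*?) is lazy, and the s modifier lets the dot cross newlines. The function names extractPrice and extractAmount are illustrative only, and the sketch assumes the first font tag in the input is the price.

```php
<?php
$source = <<<'HTML'
<tr>
    <td width="80" align="left" valign="top">
        <b> Price:</b>
    </td>
    <td align="left"  valign="top">
        <font style="font-size:10px;color:#123333;font-weight:BOLD;">1,22 €</font>
    </td>
</tr>
HTML;

// Returns the text inside the first <font>...</font> pair, or null if none is found.
function extractPrice(string $html): ?string
{
    if (preg_match('/<font\b[^>]*>(.*?)<\/font>/is', $html, $m) === 1) {
        return trim($m[1]);
    }
    return null;
}

// Returns only the numeric part (digits with an optional , or . fraction), or null.
function extractAmount(string $price): ?string
{
    if (preg_match('/\d+(?:[.,]\d+)?/', $price, $m) === 1) {
        return $m[0];
    }
    return null;
}

$price = extractPrice($source);
echo $price, "\n";                      // 1,22 €
echo extractAmount($price ?? ''), "\n"; // 1,22
```

For anything beyond a one-off scrape, an HTML parser is usually more robust than a regex, but for a single known tag this keeps the approach above.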

Answer 3

@pavlovich's answer gave me > 1,22€< as output. I used:

$title = ltrim($title, '>'); 
$title = rtrim($title, '<');

to remove the leftover tag characters.

I know this is not the right way to do it, but it solved my problem.
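
A note on where the stray > and < in the question's own output come from: PHP's preg_* functions expect the pattern to be wrapped in delimiters, and < ... > is accepted as a bracket-style delimiter pair. So preg_split("<font style=...>", ...) splits on the text font style=... and leaves the > behind, and preg_split("</font>", ...) splits on /font and leaves the <. The same rule explains the preg_replace warning: '> â‚¬<' opens with > and never closes it. A minimal sketch that avoids the problem with plain string functions; the helper name between and the variable $open are made up for illustration:

```php
<?php
$source = '<td><font style="font-size:10px;color:#123333;font-weight:BOLD;">1,22 €</font></td>';

// Hypothetical helper: returns the text between the first $open and the next $close.
function between(string $haystack, string $open, string $close): ?string
{
    $start = strpos($haystack, $open);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);

    $end = strpos($haystack, $close, $start);
    if ($end === false) {
        return null;
    }
    return substr($haystack, $start, $end - $start);
}

$open  = '<font style="font-size:10px;color:#123333;font-weight:BOLD;">';
$title = between($source, $open, '</font>');
echo $title, "\n"; // 1,22 €

// If a regex is still wanted, add explicit delimiters and let preg_quote()
// escape the literal markup, including the chosen delimiter.
$parts = preg_split('/' . preg_quote($open, '/') . '/', $source);
```

With this, no ltrim/rtrim cleanup is needed because the tag characters are never left in the result.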

Split_on_title() and preg_replace() on data pulled from another website with file_get_contents()
