Question

I need to get a list of all file URLs stored in one of my database fields.

MySQL database, table `article`

`id` | `subject` | `content`

The value of content is HTML text containing one or more file URLs, for example:

<p>this is the answer for ..., you can refer to below screenshot:</p>
<img src="http://the_url_of_image_here/imagename.jpg"/>

<p>or refer to below document</p>

<a href="http://the_url_of_doc_here/guide.ppt">guide</a>
<a href="http://the_url_of_doc_here/sample.docx">sample</a>

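There are two types of files:

Images, with the extensions jpg, jpeg, png, bmp, gif
Documents, with the extensions doc, docx, ppt, pptx, xls, xlsx, pdf, xps

I googled a lot, and it looks hard to do with MySQL alone. PHP should make it easier, but the code I wrote does not work properly.

Thanks to cars10, I solved it.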

function export_articles_link()
{
    global $date_from, $date_to;
    $filename = "kb_articles_link_".$date_from."_".$date_to.".xlsx";
    header('Content-disposition: attachment; filename="'.XLSXWriter::sanitize_filename($filename).'"');
    header("Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
    header('Content-Transfer-Encoding: binary');
    header('Cache-Control: must-revalidate');
    header('Pragma: public');
    $query = 'SELECT `content` FROM `kb_articles` WHERE ((DATE(`dt`) BETWEEN \'' . $date_from . '\' AND \'' . $date_to . '\') AND (`content` LIKE \'%<img src=%\' or `content` LIKE \'%<a href="http:%\')) order by id asc';
    $result = mysql_query($query);
    $writer = new XLSXWriter(); 
    $img_list = array();
    while ($row=mysql_fetch_array($result))
    {
        $text = $row['content'];
        preg_match_all('!http://.+\.(?:jpe?g|png|gif|ppt?|xls?|doc?|pdf|xdw)!Ui', $text, $matches);
        $img_list = $matches[0];
        foreach ($img_list as $url)
        {
            $writer->writeSheetRow('Sheet1', array($url)); // one URL per row, always in the first column
        }
    }
    $writer->writeToStdOut();
    exit(0);
}

Sharing this for anyone else who needs a working sample. I hope it saves you some time.

Answer 1

You should change the central loop to

$image_list = array(); // prepare an empty array for collection
while ($row = mysql_fetch_array($result))
{
    $text = $row['content'];
    // search the current row's content (the original passed an undefined $s here)
    preg_match_all('!http://.+?\.(?:jpe?g|png|gif|pptx?|xlsx?|docx?|pdf|xdw)!i', $text, $matches);
    $image_list = array_merge($image_list, $matches[0]); // append to the same array that is written out below
}
$writer->writeSheetRow('Sheet1', $image_list);

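Since you did not say exactly what went wrong, I am guessing here: the regular expression differs slightly from your original, and so does the way I built the loop (yes, only one loop is needed). preg_match_all only needs to be called once for each $text, and the results in $matches[0] are then merged into the $image_list array.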
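To see the merge step on its own, here is a minimal sketch that runs without a database. The `$rows` array is made-up stand-in data for what the query would return; the pattern is the one from the loop above.

```php
<?php
// Made-up stand-in for the rows a database query would return.
$rows = [
    ['content' => '<img src="http://example.com/a.jpg"/>'],
    ['content' => '<p>no links here</p>'],
    ['content' => '<a href="http://example.com/b.pptx">b</a> <a href="http://example.com/c.pdf">c</a>'],
];

$pattern = '!http://.+?\.(?:jpe?g|png|gif|pptx?|xlsx?|docx?|pdf|xdw)!i';

$image_list = array();
foreach ($rows as $row) {
    // One call per row; $matches[0] holds every full match found in that row.
    preg_match_all($pattern, $row['content'], $matches);
    // Rows without links contribute an empty array, so nothing is appended.
    $image_list = array_merge($image_list, $matches[0]);
}

print_r($image_list);
```

The result is one flat list with three URLs (a.jpg, b.pptx, c.pdf), which can then be written as a single row or looped over to write one URL per row.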

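I also removed your U modifier, which inverts the "greediness" of every quantifier in the regular expression. Instead, I added a ? after the + to make just that one quantifier "non-greedy".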
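The effect of the U modifier can be checked with a small sketch. The URL below is a made-up example. With U, the optional `t?` in `ppt?` also becomes non-greedy, so the match stops as early as it can.

```php
<?php
$html = '<a href="http://example.com/guide.pptx">guide</a>';  // made-up sample

// Pattern from the question: U flips every quantifier, including the trailing "t?".
preg_match('!http://.+\.(?:jpe?g|png|gif|ppt?|xls?|doc?|pdf|xdw)!Ui', $html, $m);
$withU = $m[0];

// Pattern from this answer: only ".+?" is non-greedy, "x?" stays greedy.
preg_match('!http://.+?\.(?:jpe?g|png|gif|pptx?|xlsx?|docx?|pdf|xdw)!i', $html, $m);
$lazy = $m[0];

echo $withU, "\n";  // http://example.com/guide.pp
echo $lazy, "\n";   // http://example.com/guide.pptx
```

The first pattern cuts the extension short, while the second one returns the whole URL.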

I have prepared a minimal demo here: http://rextester.com/JDVMS87065
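If the images and documents need to be kept apart, one possible variant builds the pattern from the two extension lists given in the question. This is only a sketch: the helper names `build_url_pattern` and `extract_file_urls` are hypothetical, and the pattern is stricter than the one above (it also accepts https and stops at quotes, whitespace and angle brackets).

```php
<?php
// Extension lists as given in the question.
$imageExt = ['jpg', 'jpeg', 'png', 'bmp', 'gif'];
$docExt   = ['doc', 'docx', 'ppt', 'pptx', 'xls', 'xlsx', 'pdf', 'xps'];

// Hypothetical helper: pattern for http(s) URLs ending in one of the extensions.
function build_url_pattern(array $extensions): string
{
    $quoted = array_map(function ($e) { return preg_quote($e, '~'); }, $extensions);
    // [^\s"'<>]+? keeps a match inside a single attribute value.
    // \b stops "doc" from matching only the first part of "docx".
    return '~https?://[^\s"\'<>]+?\.(?:' . implode('|', $quoted) . ')\b~i';
}

// Hypothetical helper: split the URLs found in $html into images and documents.
function extract_file_urls(string $html, array $imageExt, array $docExt): array
{
    preg_match_all(build_url_pattern($imageExt), $html, $img);
    preg_match_all(build_url_pattern($docExt), $html, $doc);
    return ['images' => $img[0], 'documents' => $doc[0]];
}

// Sample content modelled on the question, with the attribute quotes closed.
$sample = <<<'HTML'
<p>this is the answer for ..., you can refer to below screenshot:</p>
<img src="http://the_url_of_image_here/imagename.jpg"/>
<a href="http://the_url_of_doc_here/guide.ppt">guide</a>
<a href="http://the_url_of_doc_here/sample.docx">sample</a>
HTML;

$found = extract_file_urls($sample, $imageExt, $docExt);
print_r($found);
```

On this sample, `$found['images']` holds the one .jpg URL and `$found['documents']` holds the .ppt and .docx URLs. Running two passes over the text is a deliberate simplification; a single pass with a lookup on the matched extension would also work.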

Extract all file link URLs from a MySQL string field into a list
