How do I scrape a date from a website and store it in a database using PHP and MySQL?

Asked: 2018-03-24 20:52:41

Tags: php mysql database curl web-scraping

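I have been searching the internet for a way to scrape event dates from a website and then store them in a database, but I could not find much.

I am able to get the dates from the website, but I do not know how to store them.

I want to take only the date from the website and store it in Y-m-d format. If you know of a way to do this, please let me know.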

Link: https://www.brent.gov.uk/events-and-whats-on-calendar/?eventCat=Wembley+Stadium+events

<?php

$curl = curl_init();

$url = "https://www.brent.gov.uk/events-and-whats-on-calendar/?eventCat=Wembley+Stadium+events";

curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the page as a string

$result = curl_exec($curl);

$event = array();

// Capture the text of every matching <h3> heading
preg_match_all('/<h3 style="margin:0px!important;">(.*?)<\/h3>/si', $result, $match);
$event['title'] = $match[1];

print_r($event['title']);
?>

2 Answers:

Answer 0 (score: 1)

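Don't use regex to parse HTML; use a proper HTML parser, such as DOMDocument.

A quick inspection of the website shows that all the dates are in h3 elements that are children of the only article element on the page, and you can use that to identify them. Once you have extracted a date, you can use strtotime() to convert it to a Unix timestamp and then date() to convert that timestamp to Y-m-d format, for example: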

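// Assumes $curl was configured as in the question.
$result = curl_exec($curl);
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$domd = new DOMDocument();
@$domd->loadHTML($result); // @ silences warnings about malformed HTML
$dateElements = $domd->getElementsByTagName("article")->item(0)->getElementsByTagName("h3");
foreach ($dateElements as $ele) {
    // strtotime() parses the heading text; date() formats it as Y-m-d
    var_dump(date("Y-m-d", strtotime($ele->textContent)));
}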
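To try the idea without hitting the live site, here is a minimal sketch that runs against an inline HTML snippet. The markup and the date strings in `$html` are assumptions made up for illustration (the real page may differ), and `extractDates` is a hypothetical helper name. It skips any heading that strtotime() cannot parse, since passing a failed parse on to date() would produce a 1970 date.

```php
<?php
// Hypothetical sample markup; the real page's structure and date text may differ.
$html = <<<HTML
<html><body>
<article>
  <h3>24 March 2018</h3>
  <h3>Venue information</h3>
  <h3>7 April 2018</h3>
</article>
</body></html>
HTML;

/**
 * Return the text of every <h3> inside the first <article> as Y-m-d strings.
 * Headings that strtotime() cannot parse are skipped.
 */
function extractDates(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $article = $dom->getElementsByTagName('article')->item(0);
    if ($article === null) {
        return []; // no <article> on the page
    }

    $dates = [];
    foreach ($article->getElementsByTagName('h3') as $h3) {
        $timestamp = strtotime(trim($h3->textContent));
        if ($timestamp !== false) {
            $dates[] = date('Y-m-d', $timestamp);
        }
    }
    return $dates;
}

$dates = extractDates($html);
print_r($dates);
```

In a real script, the string returned by curl_exec() would be passed to the helper in place of `$html`.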

As for how to store the result in a MySQL database, try searching Google for php mysql tutorial -w3schools, or read the PDO section here: http://www.phptherightway.com/#pdo_extension
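As a rough illustration of the PDO route, the sketch below stores a Y-m-d string with prepared statements and skips dates that are already present. Everything named here is an assumption for illustration: the `storeEventDate` helper is hypothetical, and so are the `event_table` table and its `date` column. To stay runnable without a server it uses an in-memory SQLite database (which needs the pdo_sqlite extension); with MySQL only the DSN, the credentials, and the CREATE TABLE statement would change.

```php
<?php
/**
 * Insert a Y-m-d date into event_table unless it is already there.
 * Returns true when a new row was inserted, false when the date already existed.
 */
function storeEventDate(PDO $pdo, string $date): bool
{
    // Placeholders keep the scraped value out of the SQL text itself.
    $check = $pdo->prepare('SELECT COUNT(*) FROM event_table WHERE date = :date');
    $check->execute([':date' => $date]);
    if ((int) $check->fetchColumn() > 0) {
        return false;
    }

    $insert = $pdo->prepare('INSERT INTO event_table (date) VALUES (:date)');
    $insert->execute([':date' => $date]);
    return true;
}

// Assumption: in-memory SQLite so the sketch runs without a database server.
// A MySQL DSN would look like 'mysql:host=localhost;dbname=name;charset=utf8mb4',
// with the username and password passed as the second and third arguments.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE event_table (id INTEGER PRIMARY KEY, date TEXT NOT NULL)');

$first = storeEventDate($pdo, '2018-03-24');  // date is new
$second = storeEventDate($pdo, '2018-03-24'); // same date again
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM event_table')->fetchColumn();
var_dump($first, $second, $rowCount);
```

Checking before inserting is the simplest approach to follow; a unique index on the date column would be a sturdier guard against duplicates if several scripts write to the table at once.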

Answer 1 (score: 0)

<?php

$db_host = "localhost";
$db_username = "username";
$db_pass = "password";
$db_name = "name";

// Connect to MySQL
$con = mysqli_connect($db_host, $db_username, $db_pass, $db_name);
if (!$con) {
    die("Failed to connect to MySQL: (" . mysqli_connect_errno() . ") " . mysqli_connect_error());
}

// The website you want to get data from
$url = "https://www.brent.gov.uk/events-and-whats-on-calendar/?eventCat=Wembley+Stadium+events";

$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($curl);
if ($html === false) {
    die("Failed to fetch the page: " . curl_error($curl));
}

// Collect HTML parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$domd = new DOMDocument();
$domd->loadHTML($html);

// Prepared statements keep the scraped text out of the SQL string
$select = $con->prepare("SELECT id FROM event_table WHERE date = ?");
$insert = $con->prepare("INSERT INTO event_table (date) VALUES (?)");

// Get the dates from the site
$dateElements = $domd->getElementsByTagName("article")->item(0)->getElementsByTagName("h3");
foreach ($dateElements as $ele) {
    $timestamp = strtotime($ele->textContent);
    if ($timestamp === false) {
        continue; // heading text is not a parseable date
    }
    $data = date("Y-m-d", $timestamp);

    // Check whether the date is already in the database
    $select->bind_param("s", $data);
    $select->execute();
    $select->store_result();

    if ($select->num_rows > 0) {
        echo "Data is there";
    } else {
        // The date is not there yet, so insert it
        // (assumes id is an AUTO_INCREMENT column)
        $insert->bind_param("s", $data);
        $insert->execute();
        echo "data uploaded";
    }
}
?>