The URL below serves a tab-delimited data table for a specific date. This one is obviously for January 4, 2011:
http://lpo.dt.navy.mil/data/DM/2011/2011_01_04/Air_Temp
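I wrote a script that walks the entire date range from 2010-06-01 to the present, grabs the date and air temperature values, and inserts them into a database. The volume of data kept making it time out, but by some strange miracle I managed to get everything up to 2014-03-27. Here is the script I used (probably not the most elegant solution, but I'm just starting out):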
public function requestdata(){
    $year = 2014;
    $startDate = new DateTime('2014-03-28');
    $endDate = new DateTime('2014-05-05');
    $incrementDay = new DateInterval('P1D');
    while ($startDate <= $endDate)
    {
        $results = [];
        $startDateFormat = $startDate->format('Y-m-d'); //Format DateTime object
        $startDateUrl = str_replace( "-", "_", $startDateFormat); //Prep date for url entry
        $dayOfYear = date("z", strtotime($startDateFormat)); //Store day of the year
        if ($dayOfYear == '0') //If day of the year is 0, increment year by one
        {
            $year++;
        }
        $url = 'http://lpo.dt.navy.mil/data/DM/' . $year . '/' . $startDateUrl . '/Air_Temp';
        $url_header = @get_headers($url);
        //If URL is found, open URL, if not found set to empty
        $file = ($url_header[0] == 'HTTP/1.1 200 OK') ? fopen($url, 'r') : "empty";
        //If file is not found, skip the iteration of the loop, increment the day and return to top of loop
        if ($file == "empty")
        {
            $startDate->add($incrementDay);
            continue;
        }
        while (($line = fgetcsv($file, 500, "\t")) !== FALSE)
        {
            $dataArray = explode(" ", $line[0]);
            $date = substr($dataArray[0], 0, 10);
            $moddate = str_replace("_", "-", $date);
            // var_dump($dataArray);
            $results[$moddate]['air'][] = $dataArray[1];
        }
        fclose($file);
        foreach ($results as $date => $subdata)
        {
            DB::table('weatherdata')->insert(array('date' => $date));
            DB::table('weatherdata')->where('date', $date)->update(array('air_temp' => implode(",", $subdata['air'])));
        }
        $startDate->add($incrementDay);
    }
}
Previously I was able to upload the data, but now it gives me an undefined offset error and I don't know why. I var_dumped $dataArray, and it dumps each array like this:
array (size=2)
0 => string '2014_03_28 13:14:57' (length=19)
1 => string '38.4' (length=4)
array (size=2)
0 => string '2014_03_28 13:15:57' (length=19)
1 => string '38.4' (length=4)
...and so on and so forth
Everything appears to be defined, so I'm not sure why I'm getting this error. Any help would be greatly appreciated.
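
One thing that might be worth ruling out (a guess, not something confirmed against the actual files): a blank or truncated line somewhere in one of the newer files. fgetcsv() returns an array holding a single null field for a blank line, so explode() on it yields only one element and $dataArray[1] would then be an undefined offset, even though every row that shows up in the dump looks complete. Below is a minimal sketch of a guarded parse. The helper name parseAirTempLine is hypothetical, and the line layout it expects (date, time, then the reading, separated by whitespace or a tab) is assumed from the dump above.

```php
<?php
/**
 * Hypothetical helper (not part of the original script).
 * Parses one raw line such as "2014_03_28 13:14:57<TAB>38.4".
 * Returns array('date' => 'YYYY-MM-DD', 'air' => '38.4'),
 * or null when the line is blank or is missing a field.
 * The expected layout is an assumption based on the var_dump output.
 */
function parseAirTempLine($rawLine)
{
    // date, whitespace, time, whitespace (space or tab), reading
    $pattern = '/^(\d{4})_(\d{2})_(\d{2})\s+\d{2}:\d{2}:\d{2}\s+(\S+)$/';

    if (!is_string($rawLine) || !preg_match($pattern, trim($rawLine), $m)) {
        return null; // blank, truncated, or otherwise unexpected line
    }

    return array(
        'date' => $m[1] . '-' . $m[2] . '-' . $m[3],
        'air'  => $m[4],
    );
}

// Sample input: two good rows, one blank line, one row with no reading.
$lines = array(
    "2014_03_28 13:14:57\t38.4",
    "",
    "2014_03_28 13:15:57",
    "2014_03_28 13:16:57  38.4",
);

$results = array();
foreach ($lines as $rawLine) {
    $row = parseAirTempLine($rawLine);
    if ($row === null) {
        continue; // skip instead of reading an index that does not exist
    }
    $results[$row['date']]['air'][] = $row['air'];
}

echo implode(",", $results['2014-03-28']['air']), "\n"; // prints 38.4,38.4
```

In the script above, the same check could sit inside the inner while loop, for example by reading each line with fgets($file) and skipping the iteration whenever the helper returns null. Logging the skipped lines would also show which file and which row triggers the error.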