Some images downloaded with cURL in PHP are corrupted

Time: 2019-02-12 13:00:56

Tags: php curl

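I am using cURL in PHP to download images from URLs listed in a CSV file. When a row contains more than one image, some of the downloaded images are corrupted and have a size of 0 bytes. For example, with a CSV row like the following, the second file is always corrupted: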

Image 1, "https://d2qx4k6xgs9iri.cloudfront.net/ProductImages/ce363947-f23a-46d6-b106-1201cdca37f0.jpg, https://homepages.cae.wisc.edu/~ece533/images/airplane.png"

However, if I remove either the first or the second image, the remaining image is saved successfully. Example:

Image 2, https://homepages.cae.wisc.edu/~ece533/images/arctichare.png

Here is my code that reads the CSV file:

    $file = fopen($file, "r");
    while (!feof($file)) {
        $data = fgetcsv($file);
        $images = $data[1];
        $images = explode(',', $images); //exploding images by ,
        foreach ($images as $image) {
            $milliseconds = md5(round(microtime(true) * 1000)) . '.jpg';
            $imagename = saveImage($image, $milliseconds);
        }
    }

The saveImage function is below:

function saveImage($url, $image_name) {
    echo $url . '<br/>'; // The URL is correct and points to an image; I checked it manually
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
    $raw=curl_exec($ch);
    curl_close ($ch);

    $fp = fopen('assets/products/large/' . $image_name,'x');
    fwrite($fp, $raw);
    fclose($fp);
}

An exact sample of the CSV file I am using:

Images 1, "https://d2qx4k6xgs9iri.cloudfront.net/ProductImages/ce363947-f23a-46d6-b106-1201cdca37f0.jpg, https://homepages.cae.wisc.edu/~ece533/images/airplane.png"
Images 2, https://homepages.cae.wisc.edu/~ece533/images/arctichare.png
Images 3, "https://homepages.cae.wisc.edu/~ece533/images/fruits.png, https://homepages.cae.wisc.edu/~ece533/images/girl.png"
Images 4, "https://homepages.cae.wisc.edu/~ece533/images/goldhill.bmp, https://homepages.cae.wisc.edu/~ece533/images/tulips.png"

1 answer:

Answer 0 (score: 2)

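I think the problem is most likely whitespace in the URL before the protocol; removing it with trim should help. I did not test with curl at first. I used file_get_contents instead and downloaded all the files successfully.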

$dir = 'c:/temp/downloads/';

$file=__DIR__ . DIRECTORY_SEPARATOR . 'img.csv';
$file=fopen( $file, 'r' );


while( !feof( $file ) ){
    $line = fgetcsv( $file );
    if( !empty( $line[1] ) ){
        $urls = explode( ',', $line[1] );

        foreach( $urls as $url ){
            $url=trim( $url );

            $bytes = file_put_contents( $dir . basename( $url ), file_get_contents( $url ) );
            printf('Saved %s - size: %sKb<br />',basename( $url ),$bytes / 1024 );
        }
    }
}
fclose( $file );
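
To make the whitespace diagnosis concrete, here is a minimal sketch of what explode returns for a field holding two comma-separated URLs. The example.com URLs are placeholders chosen for illustration, not the URLs from the question.

```php
<?php
// Placeholder value shaped like the second CSV column in the question.
$field = 'https://example.com/a.jpg, https://example.com/b.png';

// Splitting on ',' alone keeps the space that follows the comma.
$raw = explode(',', $field);

// Trimming each piece removes the leading space before "https://".
$clean = array_map('trim', $raw);

var_dump($raw[1]);   // the second piece still starts with a space
var_dump($clean[1]); // the trimmed piece starts with "https://"
```

The first URL is unaffected because nothing precedes it, which would be consistent with only the second image failing.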

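The curl function also needs some adjustment: since the URLs are served over SSL, you really should add extra options to the curl request. I modified the function like this: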

function saveImage( $url, $image_path ){
    global $cacert;
    $fp = fopen( $image_path, 'w+' );

    $ch = curl_init( $url );
    curl_setopt($ch, CURLOPT_HEADER, 0 );
    curl_setopt($ch, CURLOPT_TIMEOUT, 10 );
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true );
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true );
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true );
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2 );
    curl_setopt($ch, CURLOPT_CAINFO, $cacert );
    curl_setopt($ch, CURLOPT_ENCODING, '' );
    curl_setopt($ch, CURLOPT_FILE, $fp );

    curl_exec($ch);
    curl_close ($ch);
    fclose($fp);
}

$cacert is defined elsewhere; on my system it is c:\wwwroot\cacert.pem. You can download a copy of cacert.pem from curl.haxx.se.

I then ran this code instead of the loop above:

while( !feof( $file ) ){
    $line = fgetcsv( $file );
    if( !empty( $line[1] ) ){
        $urls = explode( ',', $line[1] );

        foreach( $urls as $url ){
            $url=trim( $url );
            saveImage( $url, $dir . basename( $url ) );
        }
    }
}
fclose( $file );
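
A variation on the loop, offered as a sketch rather than a required change: testing the return value of fgetcsv directly avoids processing the false it returns at end of file, which a while( !feof() ) loop can hit. The function name readImageUrls and the example.com URLs are hypothetical, and an in-memory stream stands in for the CSV file.

```php
<?php
// Hypothetical helper: collect the trimmed image URLs of each CSV row.
function readImageUrls($handle): array
{
    $rows = [];
    // fgetcsv returns false at end of file, which ends the loop.
    while (($line = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if (empty($line[1])) {
            continue; // blank line or row without an image column
        }
        $rows[] = array_map('trim', explode(',', $line[1]));
    }
    return $rows;
}

// An in-memory stream with placeholder rows shaped like the question's CSV.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "Images 1, \"https://example.com/a.jpg, https://example.com/b.png\"\n");
fwrite($handle, "Images 2, https://example.com/c.png\n");
fwrite($handle, "\n");
rewind($handle);

$rows = readImageUrls($handle);
fclose($handle);
print_r($rows);
```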

[Screenshot: the images downloaded by the saveImage() loop]
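
The 0-byte files in the question can appear when a failed transfer is still written to disk. The following is a minimal sketch of a download function that writes a file only after curl_exec succeeds. The name saveImageChecked is hypothetical, and the sketch assumes PHP 7 or later with the curl extension and a CA bundle that libcurl can find by default (otherwise CURLOPT_CAINFO can be set as in the function above).

```php
<?php
// Hypothetical helper, not part of the original answer: download $url and
// write it to $path only if the transfer succeeded.
function saveImageChecked(string $url, string $path): bool
{
    $ch = curl_init(trim($url)); // trim guards against stray whitespace
    if ($ch === false) {
        return false;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as failure

    $raw = curl_exec($ch);
    $error = curl_error($ch);
    // The handle is released when $ch goes out of scope on return.

    if ($raw === false || $raw === '') {
        error_log("Download failed for {$url}: {$error}");
        return false; // nothing is written, so no 0-byte file is left behind
    }
    return file_put_contents($path, $raw) !== false;
}
```

A call such as saveImageChecked( $url, $dir . basename( $url ) ) could replace the saveImage call in the loop, with the boolean result used to report which URLs failed.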