Question

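When I submit a new image, extracted from a PDF, to the database, it should be a cropped version of the original image. If the image already exists in the database, it should not be inserted; if it has not been inserted yet, an identifier value must be generated for it.

The identifier value is inserted into the same table as the image.

Keys involved. The pages table has the following columns:

pid, qid, pidentifierval, image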

$record = array('pid' => "NULL",
        'qid' => $qid,
        'pidentifierval' => $pid,
        'image' => $crop,
        'rotation' => 0);

function newquestionnaire($filename, $desc = "", $type = "pngmono") {

global $db;

if ($desc == "") $desc = $filename;

//generate temp file
$tmp = tempnam(TEMPORARY_DIRECTORY, "FORM");

//print "Creating PNG files<br/>";

//use ghostscript to convert to PNG
exec(GS_BIN . " -sDEVICE=$type -r300 -sOutputFile=\"$tmp\"%d.png -dNOPAUSE -dBATCH \"$filename\"");

//add to questionnaire table
//
//create form entry in DB
//

$db->StartTrans();

$sql = "INSERT INTO questionnaires (qid,description,sheets)
    VALUES (NULL,'$desc',0)";

$db->Execute($sql);

$qid = $db->Insert_Id();

//Number of imported pages
$pages = 0;

//read pages from 1 to n - stop when n does not exist
$n = 1;
$file = $tmp . $n . ".png";
while (file_exists($file))
{       
    $data = file_get_contents($file);
    $images = split_scanning($data);
    unset($image);
    unset($data);

    foreach($images as $data)
    {
        //header Cropped Function
       // Original image
        $filename = $data;
      // Get dimensions of the original image
       list($current_width, $current_height) = getimagesize($filename);
     // The x and y coordinates on the original image where we
     // will begin cropping the image
      $left = 50;
      $top = 50;
    // This will be the final size of the image (e.g. how many pixels
   // left and down we will be going)
   $crop_width = 200;
   $crop_height = 200;

     // Resample the image
   $canvas = imagecreatetruecolor($crop_width, $crop_height);
   $current_image = imagecreatefromjpeg($filename);
   imagecopy($canvas, $current_image, 0, 0, $left, $top, $current_width, $current_height);
   $crop = imagepng($canvas, $filename, 100);

//check for header Cropped Image

What function should I use here?

        $pid = $pid.$pages;
        if ($pid)
        {
            $pages++;
                $record = defaultpage($qid,$pid,$crop);
                $db->AutoExecute('pages',$record,'INSERT'); 

        }
        else
            print T_("INVALID - IGNORING BLANK PAGE");

        unset($data);
        unset($crop);

I am confused here about how to check and compare whether the image already exists in the database. Please help.

Answer 1

For cropping the image

I found the solution:

$data = '1.png';

// Get dimensions of the original image
list($current_width, $current_height) = getimagesize($data);

// The x and y coordinates on the original image where cropping begins
$left = 1100;
$top = 30;

// Final size of the cropped image
$crop_width = 200;
$crop_height = 200;

// Copy the crop region onto a new truecolor canvas
$canvas = imagecreatetruecolor($crop_width, $crop_height);
$current_image = imagecreatefrompng($data);
imagecopy($canvas, $current_image, 0, 0, $left, $top, $crop_width, $crop_height);

// PNG compression level ranges from 0 to 9 (not 0-100 as with JPEG quality)
imagepng($canvas, $data, 9);

For comparing two images

array image_compare(mixed $image1, mixed $image2, int $RTolerance, int $GTolerance, int $BTolerance, int $WarningTolerance, int $ErrorTolerance)

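Parameters

image1 - A string containing the path to an image, or an image resource already created by the GD library. This is the first image to compare.
image2 - A string containing the path to the second image, or an image resource already created by the GD library. This is the second image to compare.
RTolerance, GTolerance, BTolerance (0-255) - The maximum deviation allowed in the red, green, or blue channel (respectively) before a flag is raised.
WarningTolerance (0-100) - The percentage of channel differences above which a warning is returned.
ErrorTolerance (0-100) - The percentage of channel differences above which an error (flag) is returned.

Return value

Returns an array containing the following information:

PixelsByColors - The number of pixels * 3 (one for each of the R, G, and B channels).
PixelsOutOfSpec - Incremented each time a channel of a pixel differs by more than its xTolerance (where x = R / G / B).
PercentDifference - The percentage obtained by comparing PixelsOutOfSpec with PixelsByColors.
WarningLevel and ErrorLevel - Set if the percentage is large enough to trigger the specified warning or error level.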
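To make the tolerance logic concrete, here is a minimal sketch of how such a comparison could be computed with GD. It is not the image_compare function itself: the name compare_with_tolerance is hypothetical, it assumes both arguments are truecolor GD images of identical size, and it assumes the per-channel counting suggested by the description of PixelsByColors.

```php
<?php
// Hypothetical helper illustrating the tolerance idea described above.
// Assumes both images are truecolor GD images with the same dimensions.
function compare_with_tolerance($img1, $img2, int $rTol, int $gTol, int $bTol,
                                float $warnPct, float $errPct): array
{
    $w = imagesx($img1);
    $h = imagesy($img1);
    if ($w !== imagesx($img2) || $h !== imagesy($img2)) {
        throw new InvalidArgumentException('Images must have the same dimensions');
    }

    $outOfSpec = 0;
    for ($y = 0; $y < $h; $y++) {
        for ($x = 0; $x < $w; $x++) {
            // For truecolor images imagecolorat() returns 0xRRGGBB
            // (plus alpha in the higher bits, which is masked out here).
            $c1 = imagecolorat($img1, $x, $y);
            $c2 = imagecolorat($img2, $x, $y);
            if (abs((($c1 >> 16) & 0xFF) - (($c2 >> 16) & 0xFF)) > $rTol) $outOfSpec++;
            if (abs((($c1 >> 8) & 0xFF) - (($c2 >> 8) & 0xFF)) > $gTol) $outOfSpec++;
            if (abs(($c1 & 0xFF) - ($c2 & 0xFF)) > $bTol) $outOfSpec++;
        }
    }

    $total = $w * $h * 3; // one count per channel per pixel
    $pct = $total > 0 ? 100.0 * $outOfSpec / $total : 0.0;

    return [
        'PixelsByColors'    => $total,
        'PixelsOutOfSpec'   => $outOfSpec,
        'PercentDifference' => $pct,
        'WarningLevel'      => $pct >= $warnPct,
        'ErrorLevel'        => $pct >= $errPct,
    ];
}
```

Images loaded with imagecreatefrompng() may be palette-based; calling imagepalettetotruecolor() on them first would satisfy the truecolor assumption.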

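I still have to work out the comparison itself. I am thinking of using an array: fetch the images from the database with a SELECT query, loop over the results with a while loop and collect them in an array, store the values under each array key in variables, and call the comparison function above inside an if/else condition. What do you guys think?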
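The lookup described here could be sketched roughly as follows. Everything in it is an assumption for illustration: find_existing_image is a hypothetical helper, the $rows array stands in for the result of a SELECT on the pages table, and the hash-based callback only detects byte-identical images.

```php
<?php
// Hypothetical helper: returns the pid of the first stored row whose image
// matches $candidate, or null when no stored image matches.
// $rows stands in for the rows fetched by a SELECT on the pages table;
// $isSame is any comparison callback returning true or false.
function find_existing_image(iterable $rows, string $candidate, callable $isSame)
{
    foreach ($rows as $row) {
        if ($isSame($row['image'], $candidate)) {
            return $row['pid'];
        }
    }
    return null;
}

// Exact byte-for-byte comparison via a hash. For scanned images a
// tolerance-based comparison callback could be passed instead.
$exactMatch = function (string $stored, string $candidate): bool {
    return hash('sha256', $stored) === hash('sha256', $candidate);
};

// Simulated query result (placeholder strings instead of real image data).
$rows = [
    ['pid' => 1, 'image' => 'AAAA'],
    ['pid' => 2, 'image' => 'BBBB'],
];

$existing = find_existing_image($rows, 'BBBB', $exactMatch);
if ($existing === null) {
    // Not in the database yet: insert the new page here.
} else {
    // Already stored under pid $existing: skip the insert.
}
```

Passing the comparison as a callback keeps the loop unchanged whether the match is exact or tolerance-based.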

Answer 2

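No two scanned images are ever 100% identical, so image comparison is not that easy. It is a very hectic task.

After a lot of research and work, I found that if the image comparison has to be 100% accurate, I would have to use the Php-OpenCV library. If some tolerance for error is acceptable, the class above works fine. My task, however, could be achieved with Php-tesseract, so I simply used tesseract-OCR. I used ghostscript to convert the PDF to PNG, cropped the image, and used the Php-tesseract OCR library hosted on the Google Code site to convert a specific part of the image to text. With that text stored in a variable, I used a regular expression to check whether a specific string is present in it, and used the result in the conditions I needed.

Consider this the end of the question.

For the convenience of visitors, I am pasting my code snippet so that it can be reused.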

// Cropping the image
        // Get dimensions of the original image


        list($current_width, $current_height) = getimagesize($file);

       // The x and y co-ordinates on the original image where we will begin cropping the image

       $left = 1100;

       $top = 30;

       // final size of image

       $crop_width = 700;

       $crop_height = 200;

       //Resample the Image

       $canvas = imagecreatetruecolor($crop_width,$crop_height);

       $current_image = imagecreatefrompng($file);

       imagecopy($canvas, $current_image, 0, 0, $left, $top, $current_width, $current_height);

       imagepng($canvas, $file, 1);

        // Note you will have to install Php tesseract Library before making the API Call.

        $api= new TessBaseAPI;

        $api->Init(".","eng",$mode_or_oem=OEM_DEFAULT);

        $api->SetPageSegMode(PSM_AUTO);

        $mImgFile = $file;

        $handle=fopen($mImgFile,"rb");

        $mBuffer=fread($handle,filesize($mImgFile));

        //print strlen($mBuffer);

        $result=ProcessPagesBuffer($mBuffer,strlen($mBuffer)*4,$api);

        //print $result;

        $result = ProcessPagesFileStream($mImgFile,$api);

        //print "result(ProcessPagesFileStream)=";

        print $result;

        $txtchk = 'FORM';

        if(preg_match("/$txtchk/i", $result)) {

        echo true;

        }
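The final regular-expression check in the snippet above could be wrapped in a small helper. This is only a sketch: ocr_text_contains is a hypothetical name, and the use of preg_quote() is an addition so that a search string containing regex metacharacters is matched literally.

```php
<?php
// Hypothetical helper wrapping the case-insensitive check on OCR output.
function ocr_text_contains(string $ocrText, string $needle): bool
{
    // preg_quote() escapes regex metacharacters (and the '/' delimiter)
    // so that $needle is treated as plain text.
    return preg_match('/' . preg_quote($needle, '/') . '/i', $ocrText) === 1;
}
```

With it, the condition becomes if (ocr_text_contains($result, 'FORM')) { ... }.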

I hope it will help a lot of people.

Image cropping and comparison
