I'm using the Alchemy API to filter out some data, but I've run into something strange. I store a string in a variable and pass it to the Alchemy API.
$dummytext = "Hello john Everything works here";
$file = $request->file('resume');
$extension = $file->getClientOriginalExtension();
$contents = \File::get($file);
$parser = new Parser;
$pdf = $parser->parseFile($file);
$text = $pdf->getText();
//echo $text;
$email_pattern = "/(?:[A-Za-z0-9!#$%&'*+=?^_`{|}~-]+(?:\.[A-Za-z0-9!#$%&'*+=?^_`{|}~-]+)*|\"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\")@(?:(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[A-Za-z0-9-]*[A-Za-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])/";
// preg_match_all() returns the match count; the loop then overwrites it with the last match found
$user_email = preg_match_all($email_pattern, $text, $matches);
foreach ($matches as $match) {
    foreach ($match as $email) {
        $user_email = $email;
    }
}
$phone_pattern = '/[(]*\d{3}[)]*\s*[.\-\s]*\d{3}[.\-\s]*\d{4}/';
// Same approach for the phone number: keep the last match found
$user_phone = preg_match_all($phone_pattern, $text, $matches);
foreach ($matches as $match) {
    foreach ($match as $phone) {
        $user_phone = $phone;
    }
}
echo "The user-email is : ".$user_email."<br>";
echo "The user-Phone is : ".$user_phone."<br>";
echo $text;
$response = $alchemyapi->entities('text',$dummytext, null);
if ($response['status'] == 'OK') {
    echo '## Entities ##', PHP_EOL."<br>";
    foreach ($response['entities'] as $entity) {
        // Only print entities the API classified as a person
        if ($entity['type'] == "Person") {
            echo $entity['text'];
        }
    }
    echo PHP_EOL;
}
else {
    echo 'Error in the entities call: ', $response['statusInfo'];
}
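Because the loop above only prints Person entities, "nothing printed" can mean several different things: an error status, an empty entity list, or entities of other types only. A small helper could tell these apart. This is only a sketch: summarizeEntities is a hypothetical name, the sample response is made up, and the array keys (status, statusInfo, entities, type) are assumed to be exactly those used in the code above. It needs PHP 7 or later.

```php
<?php
// Hypothetical helper: summarize a response array shaped like the one used above.
function summarizeEntities(array $response): string
{
    if (($response['status'] ?? null) !== 'OK') {
        return 'error: ' . ($response['statusInfo'] ?? 'no statusInfo');
    }

    $entities = $response['entities'] ?? [];
    if (count($entities) === 0) {
        return 'OK, but zero entities';
    }

    // Count entities per type so non-Person results become visible too.
    $byType = [];
    foreach ($entities as $entity) {
        $type = $entity['type'] ?? 'unknown';
        $byType[$type] = ($byType[$type] ?? 0) + 1;
    }
    ksort($byType);

    $parts = [];
    foreach ($byType as $type => $count) {
        $parts[] = "$type=$count";
    }
    return 'OK: ' . implode(', ', $parts);
}

// Made-up sample response, for illustration only.
$sample = [
    'status'   => 'OK',
    'entities' => [
        ['type' => 'Person', 'text' => 'john'],
        ['type' => 'City',   'text' => 'Paris'],
        ['type' => 'Person', 'text' => 'Jane'],
    ],
];

echo summarizeEntities($sample), PHP_EOL; // prints "OK: City=1, Person=2"
```

Calling summarizeEntities($response) once for $dummytext and once for $text would show which of the three cases the failing call falls into.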
In the code above, when I echo $text, it prints the text converted from the resume. The problem is here, where I pass the text to the Alchemy API:
$response = $alchemyapi->entities('text',$dummytext, null);
Here is the strange part: when I pass $dummytext, it works fine. But when I pass $text,
$response = $alchemyapi->entities('text',$text, null);
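it returns nothing. Another odd thing: if I store the text converted from the resume in a string myself and pass that to the API,
$resume_text = "This is text i received from the resume";
the following code works fine: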
$response = $alchemyapi->entities('text',$resume_text, null);
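But $text is the same variable that preg_match_all() uses to extract the email and phone number, and it works perfectly there. I don't know what is causing this problem.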
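One way to compare a string that works with one that does not is to look at their bytes rather than at what echo shows. The sketch below is only a diagnostic idea, not a confirmed cause. describeText is a hypothetical helper, $fromPdf is a made-up stand-in for parser output, and the mbstring extension is assumed to be installed. It also shows why preg_match_all() succeeding proves little: a pattern without the /u modifier matches on raw bytes, so it tolerates input that is not valid UTF-8.

```php
<?php
// Hypothetical helper: report properties of a string that a byte-oriented regex
// tolerates but that a consumer expecting clean UTF-8 text might not.
function describeText(string $s): array
{
    return [
        'bytes'         => strlen($s),
        'valid_utf8'    => mb_check_encoding($s, 'UTF-8'),
        // Control characters other than tab, LF and CR
        'control_chars' => preg_match_all('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $s),
    ];
}

$literal = "Hello john Everything works here";

// Made-up stand-in for PDF parser output: contains a lone Latin-1 byte (\xE9)
// and a form feed (\x0C), neither of which is visible in a browser echo.
$fromPdf = "Jos\xE9 Smith\x0C555-123-4567";

var_dump(describeText($literal));
var_dump(describeText($fromPdf));

// Without /u the pattern works on bytes, so it still finds the phone number.
$phoneFound = preg_match('/\d{3}-\d{3}-\d{4}/', $fromPdf);
var_dump($phoneFound); // int(1)
```

Running describeText($text) on the real parser output, next to describeText($resume_text), would show whether the two strings differ in encoding or hidden characters even though they look the same when echoed.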