RecordLinkage package and RLBigDataLinkage-class objects

Date: 2013-10-30 20:55:32

Tags: r

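I am trying to use the R package RecordLinkage, and I am using the two articles by the package authors as usage guides, in addition to the package documentation.

I am working with two large datasets (100k+ rows) that I want to link, so I am using the parts of the package built around the S4 class RLBigDataLinkage.

First, I run the following lines in R: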

>library('RecordLinkage')
>data1 <- as.data.frame(#source)
>data2 <- as.data.frame(#source)
>rpairs <- RLBigDataLinkage(data1, data2, strcmp = 2:8, exclude = 9:10)

This works fine (although it takes some time) and writes the .ff files needed to handle the large datasets.

If I then try:

>rpairs <- epiWeights(rpairs)

or:

>rpairs <- epiWeights(rpairs, e = 0.01, f = getFrequencies(rpairs))

and then run:

>summary(rpairs)

I get the error message:

Error in dbGetQuery(object@con, "select count(*) from data1") : 
    error in evaluating the argument 'conn' in selecting a method for function 'dbGetQuery': Error: no slot of name "con" for this object of class "RLBigDataLinkage"

On the other hand, if I run:

>result <- epiClassify(rpairs, 0.5)
>getTable(result)

I get the error message:

Error in table.ff(object@data@pairs$is_match, object@prediction, useNA = "ifany") : 
     Only vmodes integer currently allowed - are you sure ... contains only factors or integers?

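I am clearly missing something about how to handle these objects. Does anyone with experience of this package see my mistake? Thanks very much.

1 Answer:

Answer 0: (score: 0)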

When 'rpairs' is of type 'RLBigDataLinkage', use [command not preserved in the source], and you will get a summary of rpairs.
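As a side note on the first error in the question ("no slot of name "con""), it may help to check which slots an S4 object actually has before a method tries to read one with `@`. The following is only a minimal sketch using base R's methods package. `ToyLinkage` is a hypothetical stand-in class made up for illustration, not the real RLBigDataLinkage class, and its single `pairs` slot is an assumption.

```r
library(methods)

# Hypothetical toy S4 class standing in for a linkage object.
# It is NOT the real RLBigDataLinkage class from RecordLinkage.
setClass("ToyLinkage", representation(pairs = "data.frame"))

obj <- new("ToyLinkage", pairs = data.frame(id1 = 1:2, id2 = 3:4))

# List the slots that actually exist on the object.
slots <- slotNames(obj)
print(slots)

# Check for a slot by name before accessing it with @.
has_con <- "con" %in% slots
print(has_con)

# Accessing a missing slot raises the same kind of error as in the question;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch(obj@con, error = function(e) conditionMessage(e))
print(msg)
```

On the real object, `slotNames(rpairs)` and `showMethods("summary")` would be the corresponding checks. If `con` is not among the slots, the `summary` method being dispatched presumably expects a different object layout than the one `RLBigDataLinkage()` created, though that is a guess based only on the error text.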