R:将数据帧的元素与邻居相乘

时间:2015-05-17 15:28:31

标签: r

我有一个300x300元素的数据框。它们中的每一个都是-1或+1:

     [,1]   [,2]   [,3]  
[1,]   1     -1     -1   
[2,]   1      1      1  
[3,]  -1     -1      1  
[4,]   1      1     -1

我想要的是迭代我的数据框,并将每个值与每个相邻值相乘 即:
对于原始数据框中的元素[1,1],我想要[1,1],[1,2]和[2,1]的乘积 对于我原始数据框中的元素[2,2],我想要[2,2],[1,2],[2,1],[2,3]和[3,2]的乘积。

我试图创建4个新数据帧,每个数据帧分别向右,向左,向上和向下移动1个元素:

x_up <- shift(x, 1, dir='up')
x_up <- as.array(x_up)
dim(x_up) <- dims
x_down <- shift(x, 1, dir='down')
x_down <- as.array(x_down)
dim(x_down) <- dims
x_left <- shift(x, 1, dir='left')
x_left <- as.array(x_left)
dim(x_left) <- dims
x_right <- shift(x, 1, dir='right')
x_right <- as.array(x_right)
dim(x_right) <- dims

其中x是我的原始数据框 我可以看到,当我使用这种方法时,新的数据帧不是正确的;其中更多是相同的。我用相同的()检查了这个。

我的问题有另一种方法吗?

编辑:
shift()属于&#39; binhf&#39;文库

2 个答案:

答案 0 :(得分:4)

我认为这可能是一种更聪明的方法,但标准方法是迭代每个元素并使其周围环境倍增。

从:

开始
<?php

// ***** MagicQuoteFix ***** //

if (get_magic_quotes_gpc())  
{  
  function stripslashes_deep($value)  
  {  
    $value = is_array($value) ?  
        array_map('stripslashes_deep', $value) :  
        stripslashes($value);  
    return $value;  
  }  
  $_POST = array_map('stripslashes_deep', $_POST);  
  $_GET = array_map('stripslashes_deep', $_GET);  
  $_COOKIE = array_map('stripslashes_deep', $_COOKIE);  
  $_REQUEST = array_map('stripslashes_deep', $_REQUEST);  
} 

// ***** Begin Connection Info ***** //

$connection = mysqli_connect('localhost', 'ijdbuser', 'ijdbpw');

if (!$connection)
{
    $error = 'Unable to connect to the database server.';
    include 'error.html.php';
    exit();
}

if (!mysqli_set_charset($connection, 'utf8'))
{
    $output = 'Unable to set database connection encoding.';
    include 'output.html.php';
    exit();
}

if (!mysqli_select_db($connection, 'ijdb'))
{
    $error = 'Unable to locate the joke database.';
    include 'error.html.php';
    exit();
}

// ***** Display DB ***** //

$result = mysqli_query($connection, 'SELECT id, joketext FROM joke');

if (!$result)
{
    $error = 'Error fetching jokes: ' . mysqli_error($connection);
    include 'error.html.php';
    exit();
}

while ($row = mysqli_fetch_array($result))
{
    $jokes[] = array('id' => $row['id'], 'text' => $row['joketext']);
}

if (isset($_GET['addjoke'])) {}

else
{
    include 'jokes.html.php';
}

// 


// ***** Begin Add/Remove DB Options ***** //

if (isset($_GET['addjoke']))
{
    include 'form.html.php';
    exit();
}

if (isset($_GET['deletejoke']))
{
    $id = mysqli_real_escape_string($connection, $_POST['id']);

    $sql = "DELETE FROM joke WHERE id='$id'";
    if (!mysqli_query($connection, $sql))
    {
        $error = 'Error deleting joke: ' . mysqli_error($connection);
        include 'error.html.php';
        exit();
    }
    //header('Location: .');
    exit();

}

if (isset($_POST['joketext']))
{
    $joketext = mysqli_real_escape_string($connection, $_POST['joketext']);

    $sql = 'INSERT INTO joke SET
        joketext="' . $_POST['joketext'] . '",
        jokedate=CURDATE()';

    if (!mysqli_query($connection, $sql))
    {
        $error = 'Error adding submitted joke: ' . mysqli_error($connection);
        include 'error.html.php';
        exit();
    }
    header('Location: .');
    exit();
}




?>

为了避免在正边距上出现问题,你必须添加一列和一行mat <- matrix(c(1, 1, -1, 1, -1, 1, -1, 1, -1, 1, 1, -1), ncol=3) 作为边距(正数1在乘法时不会有问题,如果你总结它就必须例如,1。)。

0

现在你创建一个空矩阵来保存输出,然后迭代元素并乘以邻居。

mat2 <- addmargins(mat, FUN=function(x) 1)

导致:

out <- matrix(nrow=nrow(mat), ncol=ncol(mat))
for (i in 1:nrow(mat)) {
  for (j in 1:ncol(mat)) {
    out[i,j] <- prod(mat[i,j], mat2[i-1, j], mat2[i, j-1], mat2[i+1, j], mat2[i, j+1])
  }
}

对于300x300矩阵,这花了不到一秒钟,所以对你来说可能已经足够了。

答案 1 :(得分:2)

这应该可以解决问题:

ind <- which(x==x, arr.ind=TRUE) # index matrix

# find distances (need distances of 1 or 0) 
dist.mat <- as.matrix(dist(ind))
inds2mult <- apply(dist.mat, 1, function(ii) which(ii <= 1))

# get product of each list element in inds2mult
# and reform into appropriate matrix
matrix(
    sapply(inds2mult, function(ii) prod(unlist(x)[ii])),
    ncol=ncol(x))

#     [,1] [,2] [,3]
#[1,]   -1    1    1
#[2,]   -1    1   -1
#[3,]    1    1    1
#[4,]   -1    1   -1

要解决dist调用中大型矩阵的内存问题,您可以尝试fields.rdist.near包中的fields函数(delta值为1):

x <- matrix(rep(-1, 300*300), ncol=300)

ind <- which(x==x, arr.ind=TRUE) # index matrix

library(fields)
ind.list <- fields.rdist.near(ind, delta=1) # took my computer ~ 15 - 20 seconds

inds2mult <- tapply(ind.list$ind[,2], ind.list$ind[,1], list)

matrix(
    sapply(inds2mult, function(ii) prod(unlist(x)[ii])),
    ncol=ncol(x))

fields.rdist.near帮助页面中的delta参数:

  

门槛距离。所有被分开的点数更多   而忽略距离的delta。