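Finding the index of the closest value

Time: 2016-03-31 10:22:08

Tags: r approximate

I have two sets of coordinates and am trying to find the closest match for each coordinate. One dataset has 1 million records and the other nearly 500,000, so I am looking for a better way to do this and would welcome suggestions.

Sample input for the first dataset: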

structure(list(longitude = c(-2.5168477762, -2.5972432832, -2.5936692407, 
-2.5943475677, -2.5923214528, -2.5919014869, -2.5913454553, -2.5835739992, 
-2.5673150195, -2.5683356381), latitude = c(51.4844052488, 51.45278562, 
51.4978889752, 51.4979844501, 51.4983813479, 51.4982126232, 51.4964350456, 
51.4123728037, 51.4266239227, 51.4265740193)), .Names = c("longitude", 
"latitude"), row.names = c(NA, 10L), class = "data.frame")

Sample input for the second dataset:

structure(list(longitude = c(-3.4385392589, -3.4690321528, -3.2723981534, 
-3.3684012246, -3.329625956, -3.3093349806, 0.8718409198, 0.8718563602, 
0.8643998472, 0.8644153057), latitude = c(51.1931124311, 51.206897181, 
51.1271423704, 51.1618047221, 51.1805971356, 51.1663567178, 52.896084336, 
52.896092955, 52.9496082626, 52.9496168824)), .Names = c("longitude", 
"latitude"), row.names = 426608:426617, class = "data.frame")

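I have looked at the approx and findInterval functions in R, but I do not fully understand how they work. What I want to do is take each coordinate from dataset 1 and compare it against all coordinates in dataset 2 to find the closest match. At the moment I am using two for loops, but with data this size it takes forever.

The code I have tried is below: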

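library(geosphere)  # provides distm() and distVincentyEllipsoid()

cns <- function(x, y)
{
  a = NULL
  b = NULL

  for (i in 1:nrow(x))
  {
    for (j in 1:nrow(y))
    {
      # distance from point i of x to point j of y
      a[j] = distm(c(x$longitude[i], x$latitude[i]),
                   c(y$longitude[j], y$latitude[j]),
                   fun = distVincentyEllipsoid)
    }
    # index of the closest point in y (returns every tied index)
    b[i] = which(a == min(a))
  }
  return(y[b, ])
}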

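The function above takes a point from dataset 1, computes its distance to every point in dataset 2, finds the minimum distance, and returns the coordinates of that point.

I am looking for something, possibly parallel processing, that completes this task in a reasonable time. Any suggestions are welcome.

Regards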

1 Answer:

Answer 0 (score: 2)

1. Try to vectorize your code

In R, vectorized code is usually more efficient than explicit for loops.
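The definition of cns2 is not shown here, so the following is only a hypothetical reconstruction. It assumes cns2 keeps the outer loop over x and replaces the inner loop with a single distm() call per point, which is consistent with the later remark that its for loop can be swapped for foreach. The column names longitude and latitude come from the question's data.

```r
library(geosphere)  # distm(), distVincentyEllipsoid()

# Hypothetical reconstruction of cns2 (an assumption, not the answer's code):
# one loop over x, with all distances to y computed in one distm() call.
cns2 <- function(x, y) {
  xm <- as.matrix(x[, c("longitude", "latitude")])
  ym <- as.matrix(y[, c("longitude", "latitude")])
  b <- integer(nrow(xm))
  for (i in seq_len(nrow(xm))) {
    # 1 x nrow(y) matrix of distances from point i to every point of y
    a <- distm(xm[i, , drop = FALSE], ym, fun = distVincentyEllipsoid)
    b[i] <- which.min(a)  # index of the first minimum
  }
  y[b, ]
}
```

Calling cns2(x, y) on the two data frames from the question should return one row of y per row of x.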

Let's evaluate the difference:

Unit: milliseconds
       expr      min       lq     mean   median       uq      max neval
  cns(x, y) 42.46518 45.16829 46.61517 46.45560 47.09023 80.25171   100
 cns2(x, y) 26.09484 27.33122 28.21505 28.07837 29.10225 30.74004   100

You have already cut the time by about 40%, without any parallel computing. Can we do better?

cns3 <- function(x, y) {
  # distm() accepts both point sets at once and returns the full
  # nrow(x) by nrow(y) distance matrix
  a <- distm(x = x, y = y, fun = distVincentyEllipsoid)

  # for each row (a point of x), the column index of the nearest point of y
  b <- apply(X = a, MARGIN = 1, which.min)
  return(y[b, ])
}

Results:

> cns(x,y)
         longitude latitude
426613   -3.309335 51.16636
426613.1 -3.309335 51.16636
426613.2 -3.309335 51.16636
426613.3 -3.309335 51.16636
426613.4 -3.309335 51.16636
426613.5 -3.309335 51.16636
426613.6 -3.309335 51.16636
426613.7 -3.309335 51.16636
426613.8 -3.309335 51.16636
426613.9 -3.309335 51.16636
> cns2(x,y)
         longitude latitude
426613   -3.309335 51.16636
426613.1 -3.309335 51.16636
426613.2 -3.309335 51.16636
426613.3 -3.309335 51.16636
426613.4 -3.309335 51.16636
426613.5 -3.309335 51.16636
426613.6 -3.309335 51.16636
426613.7 -3.309335 51.16636
426613.8 -3.309335 51.16636
426613.9 -3.309335 51.16636
> cns3(x,y)
         longitude latitude
426613   -3.309335 51.16636
426613.1 -3.309335 51.16636
426613.2 -3.309335 51.16636
426613.3 -3.309335 51.16636
426613.4 -3.309335 51.16636
426613.5 -3.309335 51.16636
426613.6 -3.309335 51.16636
426613.7 -3.309335 51.16636
426613.8 -3.309335 51.16636
426613.9 -3.309335 51.16636

The benchmark returns:

Unit: milliseconds
       expr      min       lq     mean   median       uq       max neval
  cns(x, y) 43.38928 45.69135 48.72223 46.70839 48.56951 135.80555   100
 cns2(x, y) 25.96674 27.15066 28.86999 28.43569 29.99138  35.86383   100
 cns3(x, y) 23.90187 24.84592 26.68738 25.87950 27.99075  34.71469   100


So cns3 seems to be a bit faster, but cns2 can easily be parallelized by replacing for with foreach.

Is this correct? The three methods give the same output.
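The foreach substitution mentioned above might look like the following sketch. It assumes the foreach and doParallel packages are installed and that cns2 loops over the rows of x. The function name cns2_par and the cores argument are made up for illustration.

```r
library(geosphere)   # distm(), distVincentyEllipsoid()
library(foreach)     # foreach(), %dopar%
library(doParallel)  # registerDoParallel()

# Hypothetical parallel variant: each worker finds the nearest point of y
# for one point of x; .combine = c gathers the indices into one vector.
cns2_par <- function(x, y, cores = 2) {
  xm <- as.matrix(x[, c("longitude", "latitude")])
  ym <- as.matrix(y[, c("longitude", "latitude")])
  cl <- parallel::makeCluster(cores)
  registerDoParallel(cl)
  on.exit(parallel::stopCluster(cl))  # always release the workers
  b <- foreach(i = seq_len(nrow(xm)), .combine = c,
               .packages = "geosphere") %dopar% {
    which.min(distm(xm[i, , drop = FALSE], ym,
                    fun = distVincentyEllipsoid))
  }
  y[b, ]
}
```

Working one row at a time also keeps memory bounded. The full matrix that cns3 builds would have 10^6 x 5 x 10^5 entries for the datasets in the question, which is far too large to hold in memory. For data of that size, sending each worker a block of rows rather than a single row would probably reduce scheduling overhead.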


2. More generally, when you ask for a minimum, what do you want to do when there are ties?

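The way you wrote it, which(a == min(a)) keeps every tie, which can cause trouble: the assignment to b[i] may then receive more than one index for some point.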
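A small base R illustration of the difference, using a made-up distance vector a with a tied minimum:

```r
# Made-up distances with the minimum (1) occurring twice
a <- c(5, 1, 3, 1)

all_ties  <- which(a == min(a))  # every index that attains the minimum
first_min <- which.min(a)        # only the first such index

print(all_ties)
print(first_min)
```

If any one of the tied points is acceptable, which.min always yields exactly one index, so the assignment to b[i] stays well defined.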