I need to implement a distance search. My input is a CSV like this:
Property_ID, lat, lon
123, 33.84, -118.39
234, 35.89, -119.48
345, 35.34, -119.39
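I have a haversine function that takes two coordinates, (lat1, lon1) and (lat2, lon2),
and returns the distance between them. Say:
val distance: Double = haversine(lat1, lon1, lat2, lon2)
I need to find the distance between every pair of properties, so the output would look like this: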
Property_ID1, Property_ID2, distance
123,123,0
123,234,0.1
123,345,0.6
234,234,0
234,123,0.1
234,345,0.7
345,345,0
345,123,0.6
345,234,0.7
How can I implement this in Scala?
import math._
object Haversine {
  val R = 6372.8 // radius in km
  def haversine(lat1: Double, lon1: Double, lat2: Double, lon2: Double) = {
    val dLat = (lat2 - lat1).toRadians
    val dLon = (lon2 - lon1).toRadians
    val a = pow(sin(dLat / 2), 2) + pow(sin(dLon / 2), 2) * cos(lat1.toRadians) * cos(lat2.toRadians)
    val c = 2 * asin(sqrt(a))
    R * c
  }
  def main(args: Array[String]): Unit = {
    println(haversine(36.12, -86.67, 33.94, -118.40))
  }
}
class SimpleCSVHeader(header: Array[String]) extends Serializable {
  val index = header.zipWithIndex.toMap
  def apply(array: Array[String], key: String): String = array(index(key))
}
val lat1=33.84
val lon1=-118.39
val csv = sc.textFile("file.csv")
val data = csv.map(line => line.split(",").map(elem => elem.trim))
val header = new SimpleCSVHeader(data.take(1)(0)) // we build our header with the first line
val rows = data.filter(line => header(line,"lat") != "lat") // filter the header out
// I will do the looping for all properties here but I am trying to get the map function right for one property at least
val distances = rows.map(x => haversine(x.take(1)(0).toDouble, x.take(1)(1).toDouble, lat1, lon1))
Now this should give me the distance of every property from (lat1, lon1).
I know it's not right, but I can't work out how to proceed from here.
Answer 0 (score: 0)
I'd try breaking it down into steps. Given data like this:
val rows = List(Array("123", "33.84", "-118.39"),
Array("234", "35.89", "-119.48"),
Array("345", "35.34", "-119.39"))
First, convert the types:
val typed = rows.map{ case Array(id, lat, lon) => (id, lat.toDouble, lon.toDouble)}
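The pattern match above throws a MatchError on any row that is not exactly three fields, and toDouble throws on the header line. A minimal sketch of a more forgiving conversion, assuming you want to silently skip such rows (parseRow and rawRows are hypothetical names, not from the question):

```scala
import scala.util.Try

// Hypothetical helper: returns None for rows that are not exactly
// (id, lat, lon) or whose coordinates are not numeric.
def parseRow(fields: Array[String]): Option[(String, Double, Double)] =
  fields.map(_.trim) match {
    case Array(id, lat, lon) =>
      for {
        la <- Try(lat.toDouble).toOption
        lo <- Try(lon.toDouble).toOption
      } yield (id, la, lo)
    case _ => None
  }

val rawRows = List(
  Array("Property_ID", "lat", "lon"), // header: skipped because "lat" is not a number
  Array("123", "33.84", "-118.39"),
  Array("234", "35.89", "-119.48"),
  Array("bad", "row")                 // wrong number of fields: skipped
)

// flatMap keeps only the rows that parsed successfully.
val parsed = rawRows.flatMap(r => parseRow(r))
parsed.foreach(println)
```

Whether skipping bad rows is acceptable depends on your data; logging or failing fast may be a better fit if malformed input should not happen.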
Then generate the combinations:
val combos = for {
  a <- typed
  b <- typed
} yield (a,b)
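The for-comprehension yields every ordered pair, including each property paired with itself, which matches the output requested in the question. Since the distance is symmetric, you could instead compute each unordered pair once. A rough sketch using combinations, assuming every row is distinct (props is a hypothetical stand-in with the same shape as typed):

```scala
// Sample data in the same (id, lat, lon) shape as `typed`.
val props = List(
  ("123", 33.84, -118.39),
  ("234", 35.89, -119.48),
  ("345", 35.34, -119.39)
)

// All ordered pairs, including (a, a): n * n = 9 entries.
val orderedPairs = for { a <- props; b <- props } yield (a, b)

// Each unordered pair once, without self-pairs: n * (n - 1) / 2 = 3 entries.
val uniquePairs = props.combinations(2).collect { case List(a, b) => (a, b) }.toList

uniquePairs.foreach { case ((id1, _, _), (id2, _, _)) => println(s"$id1,$id2") }
```

This roughly halves the number of haversine calls; the mirrored and zero-distance rows can be written out afterwards if the full table is still needed.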
Then generate an output line for each combination:
combos.map { case ((id1, lat1, lon1), (id2, lat2, lon2)) =>
  id1 + "," + id2 + "," + haversine(lat1, lon1, lat2, lon2)
} foreach println
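Putting the steps together, here is a self-contained sketch in plain Scala collections. It reuses the haversine function from the question; csvLines is a hypothetical stand-in for the contents of file.csv, and properties and outputLines are names I made up for illustration:

```scala
import scala.math._

val R = 6372.8 // Earth radius in km, the value used in the question

def haversine(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double = {
  val dLat = (lat2 - lat1).toRadians
  val dLon = (lon2 - lon1).toRadians
  val a = pow(sin(dLat / 2), 2) +
    pow(sin(dLon / 2), 2) * cos(lat1.toRadians) * cos(lat2.toRadians)
  2 * R * asin(sqrt(a))
}

// Stand-in for the lines of file.csv.
val csvLines = List(
  "Property_ID, lat, lon",
  "123, 33.84, -118.39",
  "234, 35.89, -119.48",
  "345, 35.34, -119.39"
)

// Drop the header, split and trim each line, then convert the types.
val properties = csvLines.tail
  .map(_.split(",").map(_.trim))
  .collect { case Array(id, lat, lon) => (id, lat.toDouble, lon.toDouble) }

// One output line per ordered pair of properties.
val outputLines = for {
  (id1, lat1, lon1) <- properties
  (id2, lat2, lon2) <- properties
} yield s"$id1,$id2,${haversine(lat1, lon1, lat2, lon2)}"

println("Property_ID1,Property_ID2,distance")
outputLines.foreach(println)
```

The distances come out in kilometres because of the value of R. On a Spark RDD the same pairing can probably be expressed as rows.cartesian(rows) followed by a map, though I have not run that against your data, and the number of pairs grows quadratically with the number of properties.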