我正在处理两组相同大小的三个大型列表,其中包含UTM格式的经度,纬度和高度坐标(参见下面的列表)。阵列包含重叠坐标(即经度和纬度值相等)。如果Lon中的值等于Lon2并且Lat中的值等于Lat2,那么我想计算那些索引处的平均高度。但是,如果它们不相等,那么经度,纬度和海拔高度值将保持不变。我只想将重叠数据替换为一组经度和纬度坐标,并计算这些坐标处的平均值。
这是我到目前为止的尝试
import numpy as np
Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [450000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]
MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append
for i, a in enumerate(Lon and Lat and Alt):
for j, b in enumerate(Lon2 and Lat2 and Alt2):
if Lon[i]==Lon2[j] and Lat[i]==Lat2[j]:
MeanAltData = (Alt[i]+Alt2[j])/2
appendAlt(MeanAltData)
LonOverlapData = Lon[i]
appendLat(LonOverlapData)
LatOverlapData = Lat[i]
appendLon(LatOverlapData)
print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)
我在一个jupyter笔记本电脑上工作,我的笔记本电脑很慢,所以我需要使这个代码更有效率。我将不胜感激任何帮助。谢谢:))
答案 0 :(得分:1)
我相信您的代码可以通过两种方式得到改进:
tuples
代替lists
,而不是迭代元组是generally faster而不是迭代列表。 for
循环可以简化为只有一个循环,该循环遍历您要读取的元组的索引。当然,当且仅当所有元组包含相同数量的项目(即:len(Lat) == len(Lon) == len(Alt) == len(Lat2) == len(Lon2) == len(Alt2)
)时,此假设才成立。这是改进的代码(我冒昧地删除import numpy
语句,因为它没有在你提供的代码中使用):
# use of tuples
Lon = (450000.50, 459000.50, 460000, 470000)
Lat = (5800000.50, 459000.50, 500000, 470000)
Alt = (-1, -9, -2, 1)
Lon2 = (40000.50, 459000.50, 460000, 470000)
Lat2 = (5800000.50, 459000.50, 800000, 470000)
Alt2 = (-3, -1, -20, 2)
MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append
# only one loop
for i in range(len(Lon)):
if (Lon[i] == Lon2[i]) and (Lat[i] == Lat2[i]):
MeanAltData = (Alt[i] + Alt2[i]) / 2
appendAlt(MeanAltData)
LonOverlapData = Lon[i]
appendLat(LonOverlapData)
LatOverlapData = Lat[i]
appendLon(LatOverlapData)
print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)
我在笔记本电脑上执行了这个程序100万次。按照我的代码,所有执行所需的时间为: 1.41秒。另一方面,根据您的方法,所花费的时间是: 4.01秒。
答案 1 :(得分:1)
这不是100%在功能上等同,但我猜它更接近你真正想要的东西:
Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [40000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]
MeanAlt = []
appendAlt = MeanAlt.append
LonOverlap = []
appendLon = LonOverlap.append
LatOverlap = []
appendLat = LatOverlap.append
ll = dict((str(la)+'/'+str(lo), al) for (la, lo, al) in zip(Lat, Lon, Alt))
for la, lo, al in zip(Lon2, Lat2, Alt2):
al2 = ll.get(str(la)+'/'+str(lo))
if al2:
MeanAltData = (al+al2)/2
appendAlt(MeanAltData)
LonOverlapData = lo
appendLat(LonOverlapData)
LatOverlapData = la
appendLon(LatOverlapData)
print(MeanAlt) # correct ans should be MeanAlt = [-2.0,-5,1.5]
print(LonOverlap)
print(LatOverlap)
或更简单:
Lon = [450000.50,459000.50,460000,470000]
Lat = [5800000.50,459000.50,500000,470000]
Alt = [-1,-9,-2,1]
Lon2 = [40000.50,459000.50,460000,470000]
Lat2 = [5800000.50,459000.50,800000,470000]
Alt2= [-3,-1,-20,2]
ll = dict((str(la)+'/'+str(lo), al) for (la, lo, al) in zip(Lat, Lon, Alt))
result = []
for la, lo, al in zip(Lon2, Lat2, Alt2):
al2 = ll.get(str(la)+'/'+str(lo))
if al2:
result.append((la, lo, (al+al2)/2))
print(result)
在实践中,我会尝试从更好的结构化输入数据开始,转换为dict,或者至少不需要“zip()”。
答案 2 :(得分:0)
使用numpy
来矢量化计算。对于1,000,000个长阵列,如果输入已经是numpy.ndarray
,则执行时间应为15-25毫微秒,如果输入是Python列表,则执行时间应为~140毫秒。
import numpy as np
def mean_alt(lon, lon2, lat, lat2, alt, alt2):
lon = np.asarray(lon)
lon2 = np.asarray(lon2)
lat = np.asarray(lat)
lat2 = np.asarray(lat2)
alt = np.asarray(alt)
alt2 = np.asarray(alt2)
ind = np.where((lon == lon2) & (lat == lat2))
mean_alt = (0.5 * (alt[ind] + alt2[ind])).tolist()
return (lon[ind].tolist(), lat[ind].tolist(), mean_alt)