Question

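I need to write code that will eventually read coordinates from several large data tables stored in Excel sheets, but first I want to learn how to write a nested for loop to analyze the tuples in the code below.

None of the nested for loop examples I could find cover anything like this, so I thought it would be worth posting here.

What I need this code to do specifically is take the first coordinate in file1 and compare it with every coordinate in file2, then compare the second coordinate in file1 with every coordinate in file2, and so on, looping through every coordinate in file1 against every coordinate in file2 and reporting whenever two of them are within a specified proximity of each other.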

import math
file1 = ('1.36, 7.11', '1.38, 7.12', '1.5, -7.14', '8.7, 3.33', '8.5, 3.34', '8.63, -3.36')
file2 = ('1.46, 7.31', '1.47, 7.32', '1.49, -7.34', '8.56, 3.13', '8.57, 3.14', '8.59, -3.16')
dist = file1.apply(lambda row: math.hypot(row['x_diff'], row['y_diff']), axis=1)
for dist in file1:
    for dist in file2:
        if dist.values >= .5:
            print 'no match'
        elif dist.values <= .5:
            print True, dist

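My hunch is that the error comes from not using the proper commands to read the tuples as coordinates. I am also quite confused about what I should write in the statement for dist in file1: what I should be calling, and how to name it properly.

I realize this is probably a mess, but it is my first coding project, so if anyone can point me in the right direction or give me feedback on what I need to understand better, I would greatly appreciate it.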

Answer 1

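For loops in general:
You choose a loop variable (it can have any name, but it should not be one already in use elsewhere), and it takes each value of an iterable (for example a list) in turn. In the example below, i and j are the loop variables and range(10) is the iterable they run over. Inside the loop you write whatever you want repeated. In the example below, I append every possible i / j combination to a list.

A nested for loop requires two different variables.

Example: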

whatever = []
for j in range(10):
    for i in range(10):
        whatever.append([j, i])

After running the code, whatever looks like this:

[[0, 0], [0, 1], [0, 2], ..., [1, 0], [1, 1], ..., [9, 9]]
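
The two loops do not have to run over the same range. As a small sketch with made-up sample data (the lists letters and numbers are assumptions for illustration, not part of the question), the inner loop runs in full for every pass of the outer loop:

```python
# Hypothetical sample data, chosen only to illustrate the loop order.
letters = ['a', 'b', 'c']
numbers = [1, 2]

pairs = []
for letter in letters:        # outer loop: 3 passes
    for number in numbers:    # inner loop: 2 passes per outer pass
        pairs.append((letter, number))

print(pairs)
print(len(pairs))  # one entry per (letter, number) combination: 3 * 2
```

This is the same pattern the question needs: replace letters with the coordinates of file1 and numbers with the coordinates of file2.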

Answer 2

Assuming you get your data as tuples:

import math

some_threshold = 0.5  # proximity cutoff; 0.5 is the value used in the question

# convert file1 and file2 to lists of 2d points
xs = [[float(v) for v in pq.split(',')] for pq in file1]
ys = [[float(v) for v in pq.split(',')] for pq in file2]
# generate a cartesian product of the two lists
cp = [(x, y) for x in xs for y in ys]
# generate distances, in the same order as cp
dists = [math.hypot(x[0] - y[0], x[1] - y[1]) for x, y in cp]
# loop through and find distances below some_threshold
for i in range(len(xs)):
    for j in range(len(ys)):
        # the pair (i, j) sits at position i * len(ys) + j of the flat list
        dist = dists[i * len(ys) + j]
        if dist < some_threshold:
            print(i, j, dist)
        else:
            print('no match')

However, if you are going to read a dataset of any reasonable size, I would recommend pandas or numpy.
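
For small inputs like the ones in the question, the standard library may be enough. The following is only a sketch: it uses itertools.product together with enumerate so that no index into a flat list of distances is needed. The names threshold, points1, points2 and matches are assumptions introduced here, and 0.5 is the cutoff taken from the question.

```python
import itertools
import math

file1 = ('1.36, 7.11', '1.38, 7.12', '1.5, -7.14', '8.7, 3.33', '8.5, 3.34', '8.63, -3.36')
file2 = ('1.46, 7.31', '1.47, 7.32', '1.49, -7.34', '8.56, 3.13', '8.57, 3.14', '8.59, -3.16')
threshold = 0.5  # assumed cutoff, taken from the question

# Turn each 'x, y' string into a tuple of two floats.
points1 = [tuple(float(v) for v in s.split(',')) for s in file1]
points2 = [tuple(float(v) for v in s.split(',')) for s in file2]

matches = []
# product() yields every (item of points1, item of points2) combination;
# enumerate() attaches the index of each point so it can be reported.
for (i, p), (j, q) in itertools.product(enumerate(points1), enumerate(points2)):
    d = math.hypot(p[0] - q[0], p[1] - q[1])
    if d < threshold:
        matches.append((i, j, d))

for i, j, d in matches:
    print(i, j, d)
```

Each entry in matches holds the index into file1, the index into file2, and the distance between the two points.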

Answer 3

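You are representing your tuples as strings, which is very inconvenient to work with. "Real" tuples are usually a better starting point.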

file1 = [(1.36, 7.11), (1.38, 7.12), (1.5, -7.14), (8.7, 3.33)]
file2 = [(1.46, 7.31), (1.47, 7.32), (1.49, -7.34), (8.56, 3.13)]
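
If the data already arrives as strings such as '1.36, 7.11', one possible way to get real tuples is ast.literal_eval from the standard library, which only evaluates Python literals. This is a sketch, and the names raw and coords are assumptions for illustration:

```python
import ast

# Two of the string coordinates from the question.
raw = ('1.36, 7.11', '1.5, -7.14')

# '1.36, 7.11' is valid literal syntax for a tuple of two floats.
coords = [ast.literal_eval(text) for text in raw]

print(coords)
```

Unlike eval, literal_eval refuses anything that is not a plain literal, so it is a safer choice for text read from a file.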

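The next question is how to get the distance between two xy points. For this we can use scipy.spatial.distance.euclidean, a function that takes two tuples and returns the Euclidean norm of the vector between them. For example: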

> import scipy.spatial.distance as distance
> distance.euclidean(file1[0], file2[0])
0.22360679774997827
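
If SciPy is not installed, the same distance can be computed with the standard library. The helper name euclidean below is an assumption made for this sketch; on Python 3.8 and later, math.dist(p, q) does the same job directly.

```python
import math

file1 = [(1.36, 7.11), (1.38, 7.12), (1.5, -7.14), (8.7, 3.33)]
file2 = [(1.46, 7.31), (1.47, 7.32), (1.49, -7.34), (8.56, 3.13)]

def euclidean(p, q):
    """Return the straight-line distance between two (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

print(euclidean(file1[0], file2[0]))  # roughly 0.2236
```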

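Now to the core of your question: the nested loop. The logic is as follows. For each element in file1, call it coord1, we take every element in file2, call it coord2, and compute the distance between coord1 and coord2.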

for coord1 in file1:
    for coord2 in file2:
        dist = distance.euclidean(coord1, coord2)  # needs the import of distance shown above
        if dist < 0.5:
            print(True, dist)
        else:
            print('no match')

I would name variables after what they represent. file1 is a coordinate_list (first_coordinate_list, coordinate_list_1), and its elements are coordinates, for example coordinate, coordinate_1, left_coordinate.
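
As a sketch of how that naming advice might look in practice, the nested loop can be wrapped in a function that returns the close pairs instead of printing them. The function name find_close_pairs and the parameter max_distance are assumptions introduced here, and math.hypot stands in for the SciPy call so the snippet has no third-party dependency.

```python
import math

def find_close_pairs(first_coordinate_list, second_coordinate_list, max_distance):
    """Return (first, second, distance) for every pair closer than max_distance."""
    close_pairs = []
    for first_coordinate in first_coordinate_list:
        for second_coordinate in second_coordinate_list:
            distance_between = math.hypot(
                first_coordinate[0] - second_coordinate[0],
                first_coordinate[1] - second_coordinate[1],
            )
            if distance_between < max_distance:
                close_pairs.append((first_coordinate, second_coordinate, distance_between))
    return close_pairs

first_coordinate_list = [(1.36, 7.11), (1.38, 7.12), (1.5, -7.14), (8.7, 3.33)]
second_coordinate_list = [(1.46, 7.31), (1.47, 7.32), (1.49, -7.34), (8.56, 3.13)]

close_pairs = find_close_pairs(first_coordinate_list, second_coordinate_list, 0.5)
for first_coordinate, second_coordinate, distance_between in close_pairs:
    print(first_coordinate, second_coordinate, distance_between)
```

Returning a list keeps the comparison logic separate from the output, which should make it easier to swap in coordinates read from a spreadsheet later.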

Nested for loop on tupled coordinates
