Question

For example, I have two datetime columns:

col1 = [2019-01-01 03:00:00,
        2019-01-01 03:01:00,
        2019-01-01 03:02:00]

col2 = [2019-01-01 02:59:00, 
        2019-01-01 03:00:00, 
        2019-01-01 03:01:00, 
        2019-01-01 03:02:00, 
        2019-01-01 03:03:00]

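Their indices are [0, 1, 2] and [0, 1, 2, 3, 4], respectively.

So what I want to get is [1, 2, 3]: the indices in col2 of the elements that also appear in col1.

Here is my code, which does not work: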

ind = []
for x in range(len(col1)):
    rw = np.where(col2 == col1[x])
    ind.append(int(rw[0]))

Is there a simple way to solve this?

Answer 1

A one-liner using enumerate:

[i for i, t in enumerate(col2) if t in col1]
# [1,2,3]

You can also use pandas.Series.isin:

import pandas as pd

col1 = pd.Series(["2019-01-01 03:00:00",
        "2019-01-01 03:01:00",
        "2019-01-01 03:02:00"])

col2 = pd.Series(["2019-01-01 02:59:00", 
        "2019-01-01 03:00:00", 
        "2019-01-01 03:01:00", 
        "2019-01-01 03:02:00", 
        "2019-01-01 03:03:00"])
col2.index[col2.isin(col1)].tolist()
# [1,2,3]
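
Since the attempt in the question used np.where, the same mask-based idea can also be sketched with NumPy alone. This is a minimal sketch, assuming the columns are (or can be converted to) datetime64 arrays; the name `ind` is borrowed from the question, and the ISO strings are only there to build the sample data.

```python
import numpy as np

# Sample data from the question, stored as datetime64 values (assumed dtype).
col1 = np.array(["2019-01-01T03:00:00",
                 "2019-01-01T03:01:00",
                 "2019-01-01T03:02:00"], dtype="datetime64[s]")

col2 = np.array(["2019-01-01T02:59:00",
                 "2019-01-01T03:00:00",
                 "2019-01-01T03:01:00",
                 "2019-01-01T03:02:00",
                 "2019-01-01T03:03:00"], dtype="datetime64[s]")

# np.isin builds a boolean mask over col2 (True where the value occurs in col1);
# np.flatnonzero converts that mask into integer positions.
ind = np.flatnonzero(np.isin(col2, col1)).tolist()
print(ind)
```

Unlike the loop in the question, this does not assume that every value of col1 occurs exactly once in col2; values with no match simply contribute nothing to the result.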

Answer 2

If you do not need NumPy to solve this, you can iterate over one list and check whether each element is in the other list.
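
One way this could look in plain Python is sketched below. It is an illustration rather than the answerer's exact code: the `lookup` set and the `fmt` format string are assumptions added here, the set only serving to keep each membership check cheap.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format, matching the question's data

col1 = [datetime.strptime(s, fmt) for s in ("2019-01-01 03:00:00",
                                            "2019-01-01 03:01:00",
                                            "2019-01-01 03:02:00")]

col2 = [datetime.strptime(s, fmt) for s in ("2019-01-01 02:59:00",
                                            "2019-01-01 03:00:00",
                                            "2019-01-01 03:01:00",
                                            "2019-01-01 03:02:00",
                                            "2019-01-01 03:03:00")]

lookup = set(col1)  # hypothetical helper: hash-based membership test

ind = []
for i, t in enumerate(col2):
    # Keep the position in col2 whenever its value also appears in col1.
    if t in lookup:
        ind.append(i)

print(ind)
```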


How can I find the matching indices between columns with different numbers of items in Python?
