How do I iterate over two DataFrames and assign new values based on a comparison column?

Time: 2020-08-23 05:55:16

Tags: python pandas dataframe loops iteration

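I have two different DataFrames, A and B. Their "event" columns hold similar data, and I will use those columns to compare the two DataFrames. I want to give DataFrame A a new column, dfA.newContext#.

To do that, I need to use the "event" column. I want to iterate through DataFrame A to find matching events and assign dfB.context# to dfA.newContext#.

I think a loop is the best approach, because I need to check some conditions.

This may be a lot to ask, but I'm really stuck. I want to do something like the following, because I need to account for the offset to determine which "Special" event, and which corresponding name, a row refers to: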

offset = 0
Iterate through dfA:
    get dfA.event
    get dfA.context#
    Iterate through dfB:
        if dfB.event == dfA.event:
            dfA.newContext# = dfB.context#
            offset = dfA.context# - dfA.newContext#
            if dfB.event == "Special":
                dfA.newContext# = dfA.context# - offset

DataFrame A

+-------------+---------+------+
|dfA.context# |dfA.event| Name |
+-------------+---------+------+
| 1465        | Bobcat  | Allie|
| 1466        | Turkey  |Duncan|
| 1467        | Cobra   |Taylor|
| 1468        | Horse   |Heath |
| 1469        | Whale   | Hank |
| 1470        | Special | Emma |
| 1471        | Special | Mark |
| 1472        | Special | John |
| 1473        | Special | Terr |
| 1474        | Special | Bryl |
| 1475        | Special | Joe  |
| 1476        | Special | Jamie|
| 1477        | Special | Alex |
| 1478        | Special | Sandy|
| 1479        | Special | Bruce|
| 1480        | Special | Jared|
| 1481        | Special |Hayley|
| 1482        | Special |George|
| 1483        | Special | Anna |
| 1484        | Special | Greg |
| 1485        | Special | Hans |
| 1486        | Special | Ellie|
| 1487        | Special | Tere |
| 1488        | Special | Sara |
| 1490        | Special | Mike |
| 1492        | Special | Joel |
| 1494        | Special | Tony |
| 1496        | Special | Bryce|
| 1498        | Special | Cynth|
| 1500        | Special | Kyle |
| 1501        | Special |Taylor|
| 1502        | Special | Belle|
| 1503        | Toucan  | Dale |
| 1504        | Dolphin | Tim  |
| 1505        | Zebra   | Ham  |
| 1506        | Hyena   | Griff|
| 1507        | Shark   |Charli|
| 1508        | Rat     | Blake|
+-------------+---------+------+

DataFrame B

+-------------+---------+
|dfB.context# |dfB.event|
+-------------+---------+
| 1459        | Dog     |
| 1463        | Bobcat  |
| 1469        | Special |
| 1479        | Special |
| 1486        | Special |
| 1489        | Special |
| 1491        | Special |
| 1493        | Special |
| 1495        | Toucan  |
| 1497        | Zebra   |
| 1499        | Shark   |
+-------------+---------+
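For anyone who wants to reproduce the problem, the two tables above can be rebuilt roughly as follows. This is only a sketch: the column names `context`, `event`, and `Name` are assumptions (the tables show them as `dfA.context#` and so on, and the real column names are not given), and the contiguous context ranges are generated with `range` to keep the listing short.

```python
import pandas as pd

# Names in the same row order as DataFrame A.
names = [
    "Allie", "Duncan", "Taylor", "Heath", "Hank", "Emma", "Mark", "John",
    "Terr", "Bryl", "Joe", "Jamie", "Alex", "Sandy", "Bruce", "Jared",
    "Hayley", "George", "Anna", "Greg", "Hans", "Ellie", "Tere", "Sara",
    "Mike", "Joel", "Tony", "Bryce", "Cynth", "Kyle", "Taylor", "Belle",
    "Dale", "Tim", "Ham", "Griff", "Charli", "Blake",
]

# DataFrame A: contexts 1465-1488, then 1490-1500 in steps of 2, then 1501-1508.
dfA = pd.DataFrame({
    "context": list(range(1465, 1489))
               + [1490, 1492, 1494, 1496, 1498, 1500]
               + list(range(1501, 1509)),
    "event": ["Bobcat", "Turkey", "Cobra", "Horse", "Whale"]
             + ["Special"] * 27
             + ["Toucan", "Dolphin", "Zebra", "Hyena", "Shark", "Rat"],
    "Name": names,
})

# DataFrame B: the contexts that should end up in dfA's new column.
dfB = pd.DataFrame({
    "context": [1459, 1463, 1469, 1479, 1486, 1489, 1491, 1493, 1495, 1497, 1499],
    "event": ["Dog", "Bobcat"] + ["Special"] * 6 + ["Toucan", "Zebra", "Shark"],
})
```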

Desired DataFrame

+-------------+---------+------+---------------+
|dfA.context# |dfA.event| Name |dfA.newContext#|
+-------------+---------+------+---------------+
| 1465        | Bobcat  | Allie| 1463          | (offset of 2)
| 1466        | Turkey  |Duncan|               |
| 1467        | Cobra   |Taylor|               |
| 1468        | Horse   |Heath |               |
| 1469        | Whale   | Hank |               |
| 1470        | Special | Emma |               |
| 1471        | Special | Mark | 1469          | (offset of 2)
| 1472        | Special | John |               |
| 1473        | Special | Terr |               |
| 1474        | Special | Bryl |               |
| 1475        | Special | Joe  |               |
| 1476        | Special | Jamie|               |
| 1477        | Special | Alex |               |
| 1478        | Special | Sandy|               |
| 1479        | Special | Bruce|               |
| 1480        | Special | Jared|               |
| 1481        | Special |Hayley| 1479          | (offset of 2)
| 1482        | Special |George|               |
| 1483        | Special | Anna |               |
| 1484        | Special | Greg |               |
| 1485        | Special | Hans |               |
| 1486        | Special | Ellie|               |
| 1487        | Special | Tere |               |
| 1488        | Special | Sara | 1486          | (offset of 2)
| 1490        | Special | Mike | 1489          | (offset of 1)
| 1492        | Special | Joel | 1491          | (offset of 1)
| 1494        | Special | Tony | 1493          | (offset of 1)
| 1496        | Special | Bryce|               |
| 1498        | Special | Cynth|               |
| 1500        | Special | Kyle |               |
| 1501        | Special |Taylor|               |
| 1502        | Special | Belle|               |
| 1503        | Toucan  | Dale | 1495          | (offset of 8)
| 1504        | Dolphin | Tim  |               |
| 1505        | Zebra   | Ham  | 1497          | (offset of 8)
| 1506        | Hyena   | Griff|               |
| 1507        | Shark   |Charli| 1499          | (offset of 8)
| 1508        | Rat     | Blake|               |
+-------------+---------+------+---------------+

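The only approach I can think of is to use an offset, but apart from running two loops I'm not sure how to go about it. I need to refer to the previous offset to pick the correct "Special" event. You can see that for dfB.context# = 1489 the previous offset of 2 points at 1491, and dfA has no context# 1491, so the match falls back to the next available event at dfA.context# = 1490. The offset is then recorded as 1 and used for the next match. This continues until dfA.context# = 1503, where dfB's event matches dfA's Toucan event and the offset is recorded as 8. Basically, we match on event, but when the same event occurs more than once, we pick the row using the offset recorded from the previous match.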
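One way to read the rule described above is as a single pass over dfB that carries the last offset forward, rather than two nested loops. The sketch below is an interpretation, not a confirmed solution. It assumes the columns are named `context` and `event`, that both frames are sorted by context, and that "next available" means the first not-yet-passed dfA row with the same event. The function name `assign_new_context` and the output column `newContext` are made up for illustration, and the `Name` column is left out for brevity.

```python
import pandas as pd

# Compact rebuild of the two tables (column names are assumed; Name omitted).
dfA = pd.DataFrame({
    "context": list(range(1465, 1489))
               + [1490, 1492, 1494, 1496, 1498, 1500]
               + list(range(1501, 1509)),
    "event": ["Bobcat", "Turkey", "Cobra", "Horse", "Whale"]
             + ["Special"] * 27
             + ["Toucan", "Dolphin", "Zebra", "Hyena", "Shark", "Rat"],
})
dfB = pd.DataFrame({
    "context": [1459, 1463, 1469, 1479, 1486, 1489, 1491, 1493, 1495, 1497, 1499],
    "event": ["Dog", "Bobcat"] + ["Special"] * 6 + ["Toucan", "Zebra", "Shark"],
})


def assign_new_context(df_a, df_b):
    """Walk df_b once, carrying the offset of the previous match forward."""
    a_rows = list(zip(df_a["context"], df_a["event"]))
    new_context = [pd.NA] * len(a_rows)
    offset = 0  # df_a context minus df_b context at the most recent match
    start = 0   # first df_a position that has not been passed yet
    for b_context, b_event in zip(df_b["context"], df_b["event"]):
        # Positions of remaining df_a rows that carry the same event.
        candidates = [i for i in range(start, len(a_rows)) if a_rows[i][1] == b_event]
        if not candidates:
            continue  # e.g. "Dog" never occurs in df_a
        # Prefer the row the previous offset points at; otherwise fall back
        # to the next available row with this event.
        wanted = b_context + offset
        pos = next((i for i in candidates if a_rows[i][0] == wanted), candidates[0])
        new_context[pos] = int(b_context)
        offset = int(a_rows[pos][0]) - int(b_context)
        start = pos + 1
    out = df_a.copy()
    # Nullable integer dtype keeps unmatched rows as <NA> instead of NaN floats.
    out["newContext"] = pd.array(new_context, dtype="Int64")
    return out


result = assign_new_context(dfA, dfB)
print(result.dropna(subset=["newContext"]))
```

On the sample data this fills the same ten rows as the desired table (1465, 1471, 1481, 1488, 1490, 1492, 1494, 1503, 1505, 1507) and leaves the rest empty. Whether the fallback rule matches the intended behaviour on other data is an open question, since the description only covers this one case.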

0 answers:

No answers yet