Compare values across unordered DataFrames and create a new column from the result

Time: 2021-02-26 09:53:10

Tags: python pandas

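I have 2 DataFrames (PreServices, PostServices). Each one contains Windows services and their running state at a given point in time.

What does the data look like?

  1. The service names are listed in no particular order
  2. PostServices may or may not contain the names that are in PreServices
  3. PostServices may contain names that are not in PreServices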
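
One way to see which names appear in only one of the two frames is an outer merge with `indicator=True`. The sketch below uses small hypothetical subsets of the data (the variable names `pre`, `post`, `overlap` and `membership` are illustrative, not from the question) and assumes each name occurs at most once per frame:

```python
import pandas as pd

# Small hypothetical subsets of the two frames, just to illustrate the overlap check.
pre = pd.DataFrame({'Name': ['VMTools', 'LSM', 'MSDTC']})
post = pd.DataFrame({'Name': ['LSM', 'VMTools', 'xlp']})

# The _merge column reports where each name was found:
# 'both', 'left_only' (only in pre) or 'right_only' (only in post).
overlap = pre.merge(post, on='Name', how='outer', indicator=True)
membership = dict(zip(overlap['Name'], overlap['_merge'].astype(str)))
print(membership)
```

Row order does not matter here, because the merge matches on the `Name` values rather than on position.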

I want to create a new column in PreServices called 'final status', whose value should be:

  1. For each Name in PreServices, if PostServices has the same Name with the same State, 'final status' = True
  2. For each Name in PreServices, if PostServices has the same Name but a different State, 'final status' = PostServices['State']
  3. For each Name in PreServices that is not in PostServices, 'final status' = False
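
The three rules can be written down as a small pure-Python function, independent of pandas. This is only a sketch of the intended logic: the function name `final_status` and the dictionary `post_states` (service name mapped to its state in PostServices) are hypothetical helpers, not part of the original data.

```python
def final_status(name, pre_state, post_states):
    """Apply the three rules to one service from PreServices.

    post_states is a hypothetical dict: service name -> state in PostServices.
    """
    if name not in post_states:
        return False                  # rule 3: name is missing from PostServices
    post_state = post_states[name]
    if post_state == pre_state:
        return True                   # rule 1: same name, same state
    return post_state                 # rule 2: same name, different state


post_states = {'VMTools': 'Stopped', 'LSM': 'Running', 'xlp': 'Running'}
print(final_status('LSM', 'Running', post_states))      # True
print(final_status('VMTools', 'Running', post_states))  # Stopped
print(final_status('MSDTC', 'Running', post_states))    # False
```

Note that the resulting column mixes booleans and strings, so in pandas it ends up with object dtype.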

PreServices

                           Name    State
0                         VMTools  Running
1                             LSM  Running
2                        macmnsvc  Running
3    VMwareCAFManagementAgentHost  Running
4                          sppsvc  Stopped
5               LanmanWorkstation  Running
6                          MpsSvc  Running
7                           MSDTC  Running
8                            MSMQ  Running

PostServices

                             Name    State
0                        macmnsvc  Running
1                             LSM  Running
2                         VMTools  Stopped
3    VMwareCAFManagementAgentHost  Running
4                          sppsvc  Stopped
5               LanmanWorkstation  Running
6                          MpsSvc  Running
7                             xlp  Running

Output

                           Name    State     final status
0                         VMTools  Running   Stopped
1                             LSM  Running   True
2                        macmnsvc  Running   True
3    VMwareCAFManagementAgentHost  Running   True
4                          sppsvc  Stopped   True
5               LanmanWorkstation  Running   True
6                          MpsSvc  Running   True
7                           MSDTC  Running   False
8                            MSMQ  Running   False

1 Answer:

Answer 0 (score: 1)

The following snippet should give you the output you want:

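def create_final_status(row):
    # States of the PostServices rows with the same name (assumes names are unique there)
    match = PostServices.loc[PostServices['Name'] == row['Name'], 'State']
    if match.empty:
        return False            # service is missing from PostServices
    post_state = match.item()   # extract the scalar state, not a one-element Series
    # True when the state is unchanged, otherwise the state from PostServices
    return True if post_state == row['State'] else post_state

PreServices['final status'] = PreServices.apply(create_final_status, axis=1)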

The PreServices DataFrame now looks like this:

+----+------------------------------+---------+----------------+
|    | Name                         | State   | final status   |
|----+------------------------------+---------+----------------|
|  0 | VMTools                      | Running | Stopped        |
|  1 | LSM                          | Running | True           |
|  2 | macmnsvc                     | Running | True           |
|  3 | VMwareCAFManagementAgentHost | Running | True           |
|  4 | sppsvc                       | Stopped | True           |
|  5 | LanmanWorkstation            | Running | True           |
|  6 | MpsSvc                       | Running | True           |
|  7 | MSDTC                        | Running | False          |
|  8 | MSMQ                         | Running | False          |
+----+------------------------------+---------+----------------+
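
For larger frames, a row-wise `apply` can be slow. As a possible alternative (not part of the original answer), the same column can be built with a lookup via `Series.map` and two boolean masks. The sketch below rebuilds the sample data from the question and assumes each name appears at most once in PostServices; `post_state` and `final_status` are illustrative variable names.

```python
import pandas as pd

PreServices = pd.DataFrame({
    'Name': ['VMTools', 'LSM', 'macmnsvc', 'VMwareCAFManagementAgentHost', 'sppsvc',
             'LanmanWorkstation', 'MpsSvc', 'MSDTC', 'MSMQ'],
    'State': ['Running', 'Running', 'Running', 'Running', 'Stopped',
              'Running', 'Running', 'Running', 'Running'],
})
PostServices = pd.DataFrame({
    'Name': ['macmnsvc', 'LSM', 'VMTools', 'VMwareCAFManagementAgentHost', 'sppsvc',
             'LanmanWorkstation', 'MpsSvc', 'xlp'],
    'State': ['Running', 'Running', 'Stopped', 'Running', 'Stopped',
              'Running', 'Running', 'Running'],
})

# Look up each PreServices name in PostServices; names that are missing become NaN.
# Assumption: names are unique in PostServices, otherwise the lookup raises an error.
post_state = PreServices['Name'].map(PostServices.set_index('Name')['State'])

# Start from the PostServices state, as object dtype so booleans can sit next to strings.
final_status = post_state.astype(object)
final_status[post_state == PreServices['State']] = True   # unchanged state
final_status[post_state.isna()] = False                   # not found in PostServices
PreServices['final status'] = final_status

print(PreServices)
```

The lookup matches on the `Name` values, so the differing row order of the two frames has no effect on the result.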