Question

I have a dataframe like this:

import pandas as pd

df_test = pd.DataFrame(
    {
        "ID": [912665, 455378, 938724, 557830],
        "Company Name": ["112 ", "112 ", "SSS", "SSS"],
        "Date": [
            "2018-09-02 00:00:00",
            "2019-02-27 00:00:00",
            "2019-05-05 00:00:00",
            "2018-03-21 00:00:00",
        ],
        "Type": ["Type1", "Type2", "Type1", "Type2"],
        "ngroup": [0, 0, 1, 1],
    }
)

df_test

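I need to compare by date within each 'ngroup' (0, 1, ...), and, if needed, by any other column as well.

In this example, the groups are the values 0 and 1 of ngroup, and each group has exactly two rows. The company type is stored in the Type column, e.g. Type1 and Type2. I need to check whether the date of Type1 is greater than the date of Type2. If it is, I want to say, for example, "Type 1 joined first"; if not, "Type 2 joined first".

After that, I also want to add this to my initial dataframe as a new column, status.

UPD: so my expected result looks like this: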

df_test_expected_result = pd.DataFrame(
    {
        "ID": [912665, 455378, 938724, 557830],
        "Company Name": ["112 ", "112 ", "SSS", "SSS"],
        "Date": [
            "2018-09-02 00:00:00",
            "2019-02-27 00:00:00",
            "2019-05-05 00:00:00",
            "2018-03-21 00:00:00",
        ],
        "Type": ["Type1", "Type2", "Type1", "Type2"],
        "ngroup": [0, 0, 1, 1],
        "expected_result": [
            "Type 1 joined first",
            "Type 1 joined first",
            "Type 2 joined first",
            "Type 2 joined first",
        ],
    }
)
df_test_expected_result

What is the best way to achieve this result?

Answer 1

IIUC, we need a comparison value for each group to test the rows against.

Edit: I just saw your expected output. We can apply your first condition and then forward-fill and back-fill by group.

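import numpy as np

# Parse the dates so they compare chronologically and print without the time part.
df_test["Date"] = pd.to_datetime(df_test["Date"])

# Earliest date in each ngroup, broadcast back to every row of that group.
bool_comp = df_test.groupby("ngroup")["Date"].transform("min")

# The row holding its group's earliest date joined first; any other row joined later.
df_test["res"] = np.where(
    df_test["Date"] <= bool_comp,
    df_test["Type"] + " Joined First",
    df_test["Type"] + " Joined Later",
)

print(df_test)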

       ID Company Name       Date   Type  ngroup                 res
0  912665         112  2018-09-02  Type1       0  Type1 Joined First
1  455378         112  2019-02-27  Type2       0  Type2 Joined Later
2  938724          SSS 2019-05-05  Type1       1  Type1 Joined Later
3  557830          SSS 2018-03-21  Type2       1  Type2 Joined First
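
The forward-fill and back-fill idea mentioned in the edit could look roughly like the sketch below. This is only one possible reading of it, not necessarily the code originally intended: the column name status is taken from the question, the intermediate names earliest and first are made up for illustration, and the labels come out as "Type1 joined first" (no space) because they are built from the Type column as-is.

```python
import pandas as pd

df_test = pd.DataFrame(
    {
        "ID": [912665, 455378, 938724, 557830],
        "Company Name": ["112 ", "112 ", "SSS", "SSS"],
        "Date": pd.to_datetime(
            ["2018-09-02", "2019-02-27", "2019-05-05", "2018-03-21"]
        ),
        "Type": ["Type1", "Type2", "Type1", "Type2"],
        "ngroup": [0, 0, 1, 1],
    }
)

# Earliest date in each group, broadcast back to every row (assumed helper name).
earliest = df_test.groupby("ngroup")["Date"].transform("min")

# Label only the row that holds the earliest date; the other rows become missing.
first = (df_test["Type"] + " joined first").where(df_test["Date"] == earliest)

# Forward-fill, then back-fill inside each group so every row shares the label.
df_test["status"] = first.groupby(df_test["ngroup"]).transform(
    lambda s: s.ffill().bfill()
)

print(df_test[["ngroup", "Type", "Date", "status"]])
```

Filling in both directions is needed because the earliest row can be either the first or the last row of its group. If two rows in a group tie on the earliest date, each keeps its own label, so that case would need an explicit tie-break.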

Compare within a group in Python pandas
