I have a data frame like this:
DataFrame_A
Employee ID   A_ Status   C_Code   TestCol   Result_A   Result_B
20000         Yes         USA      asdasdq   True       False
20001         No          BRA      asdasdw   True       True
200002                    USA      asdasda   True       True
200003        asda        MEX      asdasar   False      False
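In this data frame, Result_A and Result_B are boolean columns.
I want to build a summary data frame through a function so that I can reuse it.
The summary data frame needs the columns below. The output for Result_A is shown here; Result_B, the other boolean column, would become the next row of the summary data frame.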
Name of the Column   No. of Records   No. of Employees   True_Records   False_Records   A_Status_Yes   A_Status_No   Mex_True   Mex_False   USA_True   USA_False
Result_A             4                4                  3              1               1              1             0          1           2          2
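Also note that the employee ID column is sometimes named EMPLOYEE ID, Employee_ID, EMPLOYEE_ID, or EMPL_ID. So the list of possible names needs to live inside the Python code, and only one of them will be present in any given data frame.
In practice I have 25 data frames, so I am looking for a function that I can reuse and whose output I can append.
Please help me.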
Answer 0 (score: 1)
I think I got what you want:
1- Recreate your df:
import numpy as np
import pandas as pd

df = pd.DataFrame({"Employee ID": [20000, 20001, 200002, 200003],
                   "A_ Status": ["Yes", "No", np.nan, "asda"],  # third row has no status
                   "C_Code": ["USA", "BRA", "USA", "MEX"],
                   "TestCol": ["asdasdq", "asdasdw", "asdasda", "asdasar"],
                   "Result_A": [True, True, True, False],
                   "Result_B": [False, True, True, False]},
                  columns=["Employee ID", "A_ Status", "C_Code", "TestCol", "Result_A", "Result_B"])
2- Create a second data frame, df2:
df2 = pd.DataFrame(columns=["Name of the Column","No. of Records","No. of Employees","True_Records","False_Records","A_Status_Yes","A_Status_No","Mex_True","Mex_False","USA_True","USA_False"])
3- Compute the results:
for column in df.columns[4:]:  # For each column named like `Result_xx`
    print(column)
    a = [column,
         len(df),                      # No. of Records: every row counts
         df["Employee ID"].nunique(),  # No. of Employees: distinct IDs (my reading of the question)
         len(df[df[column] == True]),
         len(df[df[column] == False]),
         len(df[df["A_ Status"] == "Yes"]),
         len(df[df["A_ Status"] == "No"]),
         len(df[(df["C_Code"] == "MEX") & (df[column] == True)]),
         len(df[(df["C_Code"] == "MEX") & (df[column] == False)]),
         len(df[(df["C_Code"] == "USA") & (df[column] == True)]),
         len(df[(df["C_Code"] == "USA") & (df[column] == False)])
         ]  # Create the row as a list
    df2.loc[len(df2), :] = a  # Append the row to df2
4- Result:
+----+----------------------+------------------+--------------------+----------------+-----------------+----------------+---------------+------------+-------------+------------+-------------+
| | Name of the Column | No. of Records | No. of Employees | True_Records | False_Records | A_Status_Yes | A_Status_No | Mex_True | Mex_False | USA_True | USA_False |
|----+----------------------+------------------+--------------------+----------------+-----------------+----------------+---------------+------------+-------------+------------+-------------|
| 0 | Result_A | 4 | 4 | 3 | 1 | 1 | 1 | 0 | 1 | 2 | 0 |
| 1 | Result_B | 4 | 4 | 2 | 2 | 1 | 1 | 0 | 1 | 1 | 1 |
+----+----------------------+------------------+--------------------+----------------+-----------------+----------------+---------------+------------+-------------+------------+-------------+
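Since you mention 25 data frames and an ID column whose header varies, here is a minimal sketch of how the steps above could be wrapped into a reusable function. The names `ID_CANDIDATES`, `find_id_column` and `summarize_results` are hypothetical, and I am assuming the result columns hold plain booleans with no missing values and that "No. of Employees" means distinct IDs.

```python
import numpy as np
import pandas as pd

# Header variants listed in the question; one of them is expected per data frame.
ID_CANDIDATES = ["Employee ID", "EMPLOYEE ID", "Employee_ID", "EMPLOYEE_ID", "EMPL_ID"]


def find_id_column(df, candidates=ID_CANDIDATES):
    """Return the one candidate header present in df, or raise if 0 or several match."""
    found = [name for name in candidates if name in df.columns]
    if len(found) != 1:
        raise ValueError(f"expected exactly one ID column, found {found}")
    return found[0]


def summarize_results(df, result_cols, status_col="A_ Status", code_col="C_Code"):
    """Build one summary row per boolean result column (assumed free of NaN)."""
    id_col = find_id_column(df)
    rows = []
    for col in result_cols:
        flag = df[col].astype(bool)
        rows.append({
            "Name of the Column": col,
            "No. of Records": len(df),
            "No. of Employees": df[id_col].nunique(),  # assumption: distinct IDs
            "True_Records": int(flag.sum()),
            "False_Records": int((~flag).sum()),
            "A_Status_Yes": int((df[status_col] == "Yes").sum()),
            "A_Status_No": int((df[status_col] == "No").sum()),
            "Mex_True": int(((df[code_col] == "MEX") & flag).sum()),
            "Mex_False": int(((df[code_col] == "MEX") & ~flag).sum()),
            "USA_True": int(((df[code_col] == "USA") & flag).sum()),
            "USA_False": int(((df[code_col] == "USA") & ~flag).sum()),
        })
    return pd.DataFrame(rows)


# Sample data from the question, plus a copy that uses a different ID header.
df_a = pd.DataFrame({"Employee ID": [20000, 20001, 200002, 200003],
                     "A_ Status": ["Yes", "No", np.nan, "asda"],
                     "C_Code": ["USA", "BRA", "USA", "MEX"],
                     "TestCol": ["asdasdq", "asdasdw", "asdasda", "asdasar"],
                     "Result_A": [True, True, True, False],
                     "Result_B": [False, True, True, False]})
df_b = df_a.rename(columns={"Employee ID": "EMPL_ID"})

# One summary per data frame, stacked into a single table.
summary = pd.concat(
    [summarize_results(frame, ["Result_A", "Result_B"]) for frame in (df_a, df_b)],
    ignore_index=True,
)
print(summary)
```

Collecting the rows in a list and building the data frame once avoids growing `df2` row by row, and `pd.concat` lets you stack the summaries of all your data frames. If your frames use other status or country column names, pass them through `status_col` and `code_col`.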