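I have a list of IDs along with gender information. I need to flag the IDs for which at least one Female appears. Below is the data for reference.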
ID Gender
1 Female
1 Female
2 Male
2 Male
3 Female
3 Male
4 Male
4 Male
4 Male
4 Male
4 Female
5 Female
5 Male
5 Female
6 Male
6 Male
6 Male
6 Male
7 Female
8 Male
8 Male
9 Male
10 Male
10 Male
11 Male
11 Female
13 Male
14 Male
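I tried creating two columns: one that checks whether a row has the same ID as the previous row, and another that checks whether the row contains Female. The output would then be built from the two results. But I think there must be a better way.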
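import os, re, subprocess
import numpy as np
import pandas as pd
data = pd.read_excel(r"C:\Analytics\TA Dashboard\test\test.xlsx")
# True where a row has the same ID as the row directly above it
data['match1'] = data['Reference ID'].eq(data['Reference ID'].shift())
# True where any cell in the row equals 'Female'
data['match2'] = data.eq('Female').any(axis=1)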
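Based on the combination of ID and Gender, the output must be "Yes" or "NO". Within the same ID, if "Female" appears on any row, every row for that ID should be "Yes"; otherwise they should all be "NO".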
ID Gender OUTPUT
1 Female Yes
1 Female Yes
2 Male NO
2 Male NO
3 Female Yes
3 Male Yes
4 Male Yes
4 Male Yes
4 Male Yes
4 Male Yes
4 Female Yes
5 Female Yes
5 Male Yes
5 Female Yes
6 Male NO
6 Male NO
6 Male NO
6 Male NO
7 Female Yes
8 Male NO
8 Male NO
9 Male NO
10 Male NO
10 Male NO
11 Male Yes
11 Female Yes
13 Male NO
14 Male NO
Answer 0: (score: 1)
Use groupby on ID together with any to check, within each group, whether Gender equals Female.
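The answer's original code did not survive, so here is a minimal sketch of the groupby/any idea it describes. It assumes the columns are named ID and Gender as in the question's table (the asker's own code uses 'Reference ID'), and the names df and has_female are only illustrative. Only the first four IDs from the question are included to keep it short.

```python
import pandas as pd

# Sample rows copied from the question's data (IDs 1-4 only).
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4],
    "Gender": ["Female", "Female", "Male", "Male", "Female", "Male",
               "Male", "Male", "Male", "Male", "Female"],
})

# Per-row flag for Female, then broadcast "any True within this ID"
# back onto every row of the same ID.
has_female = df["Gender"].eq("Female").groupby(df["ID"]).transform("any")

# Map the boolean flag to the labels used in the expected output.
df["OUTPUT"] = has_female.map({True: "Yes", False: "NO"})
print(df)
```

transform("any") returns a result aligned with the original rows, so no merge or shift-based comparison is needed and the rows do not have to be sorted by ID.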
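Answer 1: (score: 0)
I've run into another problem here... What if I have to apply a filter on an additional column, "Status", and then apply the logic above, without removing the filtered-out rows from the dataset?
Below is the data. Here I need to keep only the rows where Status is not equal to xyz or xy, and only then apply the logic above. Keep in mind that I also don't want to remove the filtered-out rows from the main data source.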
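ID Gender Status
1 Female xyz
1 Female xyz
2 Male xyz
2 Male xy
3 Female x
3 Male y
4 Male xyz
4 Male xy
4 Male xy
4 Male xy
4 Female xab
5 Female xac
5 Male xy
5 Female xyz
6 Male xyz
6 Male xy
6 Male xy
6 Male xy
7 Female xyc
8 Male xy
8 Male xyz
9 Male xy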
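One possible way to do this, as a sketch rather than a confirmed answer: build a boolean mask for the Status filter instead of dropping rows, and count a Female row only when it also passes the mask. This assumes the columns are named ID, Gender and Status, and it assumes that filtered-out rows should still receive their ID's label; the post does not say what those rows should show. Only a few IDs from the data above are included.

```python
import pandas as pd

# A few rows copied from the data above (IDs 1, 3, 4 and 6).
df = pd.DataFrame({
    "ID": [1, 1, 3, 3, 4, 4, 4, 4, 4, 6, 6],
    "Gender": ["Female", "Female", "Female", "Male", "Male", "Male",
               "Male", "Male", "Female", "Male", "Male"],
    "Status": ["xyz", "xyz", "x", "y", "xyz", "xy",
               "xy", "xy", "xab", "xyz", "xy"],
})

# Rows that pass the filter: Status is neither "xyz" nor "xy".
keep = ~df["Status"].isin(["xyz", "xy"])

# A row counts as a Female hit only if it also passes the filter.
female_kept = df["Gender"].eq("Female") & keep

# Broadcast the per-ID result to every row, filtered-out rows included
# (assumption: they should carry the label of their ID).
has_female = female_kept.groupby(df["ID"]).transform("any")
df["OUTPUT"] = has_female.map({True: "Yes", False: "NO"})
print(df)
```

Because the filter is only a mask, the frame keeps all of its rows. ID 1 comes out as "NO" here, since both of its Female rows have Status xyz and are therefore ignored.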