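I want to group all qualifications (as a delimiter-separated list) by job title.
In the dataset below, jobs of the same type (.Net Developer) require different sets of qualifications, while another job requires none.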
JobID     Job Title                    Qualification ID  Qualification Name
34455226  .Net Developer               ICT50715          Diploma of Software Development
34455226  .Net Developer               ICT40515          Certificate IV in Programming
34466933  .Net Developer               ICT50715          Diploma of Software Development
34466111  .Net Developer               ICT50655          Diploma of Software Testing
34479964  Snr Finance Systems Analyst
I would like a consolidated view of all the unique qualifications that a given type of job may require, like this:
Job Title                    Qualifications
.Net Developer               Diploma of Software Development,Certificate IV in Programming,Diploma of Software Testing
Snr Finance Systems Analyst  N/A
This is what I have done so far:
def f(x):
    return pd.Series(dict(Qualifications = ",".join(map(str, x["Qualification Name"]))))

df_jobs_qualifications\
    .groupby("Job Title")[['Qualification Name']]\
    .apply(f)
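But it gives me duplicate qualification names (see below, where Diploma of Software Development is repeated), whereas I want unique qualification names: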
Job Title                    Qualifications
.Net Developer               Diploma of Software Development,Certificate IV in Programming,Diploma of Software Development,Diploma of Software Testing
Snr Finance Systems Analyst  N/A
Update
My question differs from this question because, even after following the steps described there, I still do not get unique values.
Answer 0 (score: 5)
If you need unique strings, add set or unique; if None or NaN values are possible, also add dropna:
df1 = (df.groupby('Job Title')['Qualification Name']
         .apply(lambda x: ','.join(set(x.dropna())))
         .reset_index())
print (df1)
                     Job Title  \
0               .Net Developer
1  Snr Finance Systems Analyst

                                  Qualification Name
0  Diploma of Software Development,Diploma of Sof...
1
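Because a set has no defined order, the joined string can come out in a different order from one run to the next. If a stable result is needed, one option is to sort the set before joining. The following is a minimal sketch, assuming a DataFrame rebuilt by hand from the sample rows in the question; the variable name sorted_unique is illustrative only.

```python
import pandas as pd

# Sample rows rebuilt by hand from the question (assumed column names).
df = pd.DataFrame({
    "Job Title": [".Net Developer"] * 4 + ["Snr Finance Systems Analyst"],
    "Qualification Name": [
        "Diploma of Software Development",
        "Certificate IV in Programming",
        "Diploma of Software Development",
        "Diploma of Software Testing",
        None,  # this job has no qualification
    ],
})

# sorted() turns the unordered set into an alphabetically ordered list,
# so the joined string is the same on every run.
sorted_unique = (df.groupby("Job Title")["Qualification Name"]
                   .apply(lambda x: ",".join(sorted(set(x.dropna())))))
print(sorted_unique)
```

Note that the order here is alphabetical, not the order in which the qualifications appear in the data.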
If order matters, use unique instead of set:
df1 = (df.groupby('Job Title')['Qualification Name']
         .apply(lambda x: ','.join(x.dropna().unique()))
         .reset_index())
print (df1)
                     Job Title  \
0               .Net Developer
1  Snr Finance Systems Analyst

                                  Qualification Name
0  Diploma of Software Development,Certificate IV...
1
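To tie this back to the desired output in the question, here is a self-contained sketch that keeps the order of first appearance and shows "N/A" for a job with no qualifications. The DataFrame is rebuilt by hand from the sample rows, and the helper name join_unique, the "N/A" placeholder, and the "Qualifications" column name are assumptions taken from the question's desired output, not part of the original answer.

```python
import pandas as pd

# Sample rows rebuilt by hand from the question (assumed column names).
df = pd.DataFrame({
    "Job Title": [".Net Developer"] * 4 + ["Snr Finance Systems Analyst"],
    "Qualification Name": [
        "Diploma of Software Development",
        "Certificate IV in Programming",
        "Diploma of Software Development",
        "Diploma of Software Testing",
        None,  # this job has no qualification
    ],
})

def join_unique(names):
    # dropna() removes missing values; unique() keeps first-appearance order.
    unique_names = names.dropna().unique()
    # Hypothetical placeholder for groups with nothing left to join.
    return ",".join(unique_names) if len(unique_names) else "N/A"

ordered = (df.groupby("Job Title")["Qualification Name"]
             .apply(join_unique)
             .reset_index(name="Qualifications"))
print(ordered)
```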
If you want NaN for groups with no values:
import numpy as np

def f(x):
    # unique, non-missing qualification names for one job title
    val = set(x.dropna())
    if len(val) > 0:
        val = ','.join(val)
    else:
        # no qualifications at all: return NaN instead of an empty string
        val = np.nan
    return val

df2 = df.groupby('Job Title')['Qualification Name'].apply(f).reset_index()
print (df2)
                     Job Title  \
0               .Net Developer
1  Snr Finance Systems Analyst

                                  Qualification Name
0  Diploma of Software Development,Diploma of Sof...
1                                                NaN
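If unique lists are needed rather than joined strings, the same pattern can return a list per group. This is a minimal sketch under the same assumptions as above (a hand-built DataFrame with the question's column names); the helper name unique_list is hypothetical.

```python
import pandas as pd

# Sample rows rebuilt by hand from the question (assumed column names).
df = pd.DataFrame({
    "Job Title": [".Net Developer"] * 4 + ["Snr Finance Systems Analyst"],
    "Qualification Name": [
        "Diploma of Software Development",
        "Certificate IV in Programming",
        "Diploma of Software Development",
        "Diploma of Software Testing",
        None,  # this job has no qualification
    ],
})

def unique_list(names):
    # Drop missing values, keep first-appearance order, return a plain list.
    return list(names.dropna().unique())

lists = df.groupby("Job Title")["Qualification Name"].apply(unique_list)
print(lists)
```

A job without qualifications ends up with an empty list here, which can be tested with len() downstream.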