Suppose I have the following DataFrame (the one I actually work with has more than 100 rows):
>> df
a b c d e
title0 1 0 0 string
title1 0 1 1 string
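For each row that has more than one 1 in columns b, c, and d, I want to split it into separate rows so that each row keeps only a single 1.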
The output should be:
>> df
a b c d e
title0 1 0 0 string
title1 0 1 0 string
title1 0 0 1 string
Answer 0: (score: 1)
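You can insert extra rows wherever a row contains more than one 1 along axis 1, and then replace each block of repeated 1s with an identity matrix, np.identity(len(df)), sized to the number of rows in that block.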
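To make the identity-matrix step concrete, here is a minimal sketch on a single group. It assumes the row for title2 has already been repeated three times (once per 1 it contains); the name `group` is only illustrative and does not appear in the answer.

```python
import numpy as np
import pandas as pd

# One original row with three 1s, already repeated three times (assumed input)
group = pd.DataFrame({
    "a": ["title2"] * 3,
    "b": [1, 1, 1],
    "c": [1, 1, 1],
    "d": [1, 1, 1],
    "e": ["string3"] * 3,
})

# Overwrite the 3x3 block of 1s with a 3x3 identity matrix,
# so that each repeated row keeps exactly one of the original 1s
group[["b", "c", "d"]] = np.identity(len(group), dtype=int)
print(group)
```

Because the number of repeated rows equals the number of 1s in the original row, the block being overwritten is square, which is why an identity matrix fits.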
df
a b c d e
0 title0 1 0 0 string1
1 title1 0 1 1 string2
2 title2 1 1 1 string3
3 title3 1 1 0 string4
import numpy as np
import pandas as pd

def fun(x):
    # Overwrite the block of duplicated 1s with an identity matrix
    ones = x[x.eq(1)].dropna(axis=1)
    x.loc[ones.index, ones.columns] = np.identity(len(x))
    return x

# Append (j - 1) extra copies of every row that contains j ones
for i, j in zip(range(len(df)), df[['b', 'c', 'd']].sum(axis=1).values):
    if j > 1:
        # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
        df = pd.concat([df] + [df.loc[[i]]] * int(j - 1)).reset_index(drop=True)

df.groupby(['a']).apply(fun)
Out:
a b c d e
0 title0 1.0 0.0 0.0 string1
1 title1 0.0 1.0 0.0 string2
2 title2 1.0 0.0 0.0 string3
3 title3 1.0 0.0 0.0 string4
4 title1 0.0 0.0 1.0 string2
5 title2 0.0 1.0 0.0 string3
6 title2 0.0 0.0 1.0 string3
7 title3 0.0 1.0 0.0 string4
Answer 1: (score: 1)
The idea is to use get_dummies:
print (df)
a b c d e
0 title0 1 0 0 string1
1 title1 0 1 1 string2
2 title2 1 1 1 string3
3 title3 1 1 0 string4
# select all columns except a and e
cols = df.columns.difference(['a','e'])
# or list the column names explicitly
#cols = ['b', 'c', 'd']
print (cols)
Index(['b', 'c', 'd'], dtype='object')
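# select the columns, reshape to a Series, and keep only the values equal to 1
s = df[cols].stack()
# dtype=int keeps 0/1 output; since pandas 2.0 get_dummies returns bool by default
df1 = pd.get_dummies(s[s == 1].reset_index(level=1).drop(0, axis=1), prefix='', prefix_sep='', dtype=int)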
print (df1)
b c d
0 1 0 0
1 0 1 0
1 0 0 1
2 1 0 0
2 0 1 0
2 0 0 1
3 1 0 0
3 0 1 0
# finally drop the original columns, join the new DataFrame, and reindex to restore the column order
df = df.drop(cols, axis=1).join(df1).reindex(columns=df.columns).reset_index(drop=True)
print (df)
a b c d e
0 title0 1 0 0 string1
1 title1 0 1 0 string2
2 title1 0 0 1 string2
3 title2 1 0 0 string3
4 title2 0 1 0 string3
5 title2 0 0 1 string3
6 title3 1 0 0 string4
7 title3 0 1 0 string4
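The steps above can be gathered into one self-contained sketch. The helper name `split_ones` and the intermediate column names `col` and `val` are hypothetical, chosen only for illustration. The sketch assumes every row has at least one 1 in the selected columns and that the DataFrame has a unique index.

```python
import pandas as pd

def split_ones(df, cols):
    """Return a new DataFrame with one row per 1 found in cols (illustrative helper)."""
    # Reshape the flag columns to a Series indexed by (row label, column name)
    s = df[cols].stack()
    # Keep only the entries equal to 1 and move the column name out of the index
    hits = s[s == 1].reset_index(level=1)
    hits.columns = ["col", "val"]
    # One indicator column per original flag column, one row per 1
    dummies = pd.get_dummies(hits["col"], dtype=int)
    dummies = dummies.reindex(columns=cols, fill_value=0)
    # Swap the original flag columns for the indicator rows, restore column order
    return (df.drop(columns=cols)
              .join(dummies)
              .reindex(columns=df.columns)
              .reset_index(drop=True))

df = pd.DataFrame({
    "a": ["title0", "title1", "title2", "title3"],
    "b": [1, 0, 1, 1],
    "c": [0, 1, 1, 1],
    "d": [0, 1, 1, 0],
    "e": ["string1", "string2", "string3", "string4"],
})
result = split_ones(df, ["b", "c", "d"])
print(result)
```

Passing the column list explicitly, rather than deriving it with `columns.difference`, keeps the helper usable on frames whose other columns differ from this example.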