I have a DataFrame like the one below:
    Name1   Name2
0    John    Jack
1    John  Albert
2    Jack     Eva
3  Albert    Sara
4     Eva    Sara
I would like to assign a unique ID to each name.
Answer 0 (score: 3)
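First flatten the values with numpy.ravel and pass them to pd.factorize, then reshape the resulting codes to the shape of the original df, wrap them in the DataFrame constructor, create the column names, and finally join to the original: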
df1 = pd.DataFrame(pd.factorize(df.values.ravel())[0].reshape(df.shape))
df1.columns = ['ID{}'.format(x+1) for x in range(len(df1.columns))]
print(df1)
   ID1  ID2
0    0    1
1    0    2
2    1    3
3    2    4
4    3    4
df = df.join(df1)
print(df)
    Name1   Name2  ID1  ID2
0    John    Jack    0    1
1    John  Albert    0    2
2    Jack     Eva    1    3
3  Albert    Sara    2    4
4     Eva    Sara    3    4
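To see why the IDs come out in this order, the sketch below rebuilds the question's data and inspects what pd.factorize returns for the flattened values. The variable names (flat, codes, uniques, name_to_id) are illustrative and not part of the original answer; the point is only that codes are handed out in order of first appearance when the frame is read row by row.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name1": ["John", "John", "Jack", "Albert", "Eva"],
    "Name2": ["Jack", "Albert", "Eva", "Sara", "Sara"],
})

# ravel() flattens row by row: John, Jack, John, Albert, Jack, Eva, ...
flat = df.to_numpy().ravel()

# factorize returns (codes, uniques); a name's code is its position in
# `uniques`, which lists names in order of first appearance.
codes, uniques = pd.factorize(flat)

# Illustrative lookup table from name to ID (an assumption, for inspection only).
name_to_id = {name: i for i, name in enumerate(uniques)}
print(name_to_id)

# Reshaping the codes restores the two-column layout of the original frame.
print(codes.reshape(df.shape))
```

Because John is seen first he gets 0, Jack gets 1, and so on, which matches the ID1 and ID2 columns above.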
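Alternatively, create a MultiIndex Series with stack, create the IDs with factorize, turn the result back into a DataFrame with unstack, rename the columns, and add them to the original with join: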
s = df.stack()
df = df.join(pd.Series(pd.factorize(s)[0], index=s.index)
               .unstack()
               .rename(columns=lambda x: x.replace('Name', 'ID')))
print(df)
    Name1   Name2  ID1  ID2
0    John    Jack    0    1
1    John  Albert    0    2
2    Jack     Eva    1    3
3  Albert    Sara    2    4
4     Eva    Sara    3    4
A similar alternative:
s = df.stack()
s[:] = pd.factorize(s)[0]
df = df.join(s.unstack().rename(columns=lambda x: x.replace('Name','ID')))
print(df)
    Name1   Name2  ID1  ID2
0    John    Jack    0    1
1    John  Albert    0    2
2    Jack     Eva    1    3
3  Albert    Sara    2    4
4     Eva    Sara    3    4
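Each snippet in this answer assumes it starts from the original two-column df. One way to make that explicit is to wrap the stack/factorize/unstack steps in a function that returns a new frame instead of reassigning df. This is a minimal sketch: add_name_ids is a hypothetical helper name, and it assumes every column name contains the text 'Name' and that there are no missing values.

```python
import pandas as pd

def add_name_ids(frame: pd.DataFrame) -> pd.DataFrame:
    """Return `frame` joined with one ID column per Name column (hypothetical helper)."""
    stacked = frame.stack()                    # MultiIndex Series: (row, column) -> name
    codes = pd.factorize(stacked)[0]           # one integer per distinct name
    ids = pd.Series(codes, index=stacked.index).unstack()
    ids = ids.rename(columns=lambda c: c.replace("Name", "ID"))
    return frame.join(ids)                     # `frame` itself is left unchanged

# Sample data copied from the question.
df = pd.DataFrame({
    "Name1": ["John", "John", "Jack", "Albert", "Eva"],
    "Name2": ["Jack", "Albert", "Eva", "Sara", "Sara"],
})

result = add_name_ids(df)
print(result)
```

Since df is not overwritten, the function can be called again on the same input without the ID columns from an earlier call getting stacked along with the names.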
Answer 1 (score: 1)
If it does not matter which name gets which number, you could also consider
df.join(df.stack().astype('category').cat.codes.unstack()
          .rename(columns=lambda c: c.replace('Name', 'ID')))
which yields
    Name1   Name2  ID1  ID2
0    John    Jack    3    2
1    John  Albert    3    0
2    Jack     Eva    2    1
3  Albert    Sara    0    4
4     Eva    Sara    1    4
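The numbers differ from the factorize-based answer because category codes follow the sorted order of the categories rather than the order of first appearance. The sketch below illustrates this on the same ten names, flattened row by row; the names variable is only for illustration, and the comparison with pd.factorize(..., sort=True) is an observation about this data, not something the original answer states.

```python
import pandas as pd

# The question's names, read row by row.
names = pd.Series(["John", "Jack", "John", "Albert", "Jack",
                   "Eva", "Albert", "Sara", "Eva", "Sara"])

# astype('category') sorts the categories, so codes are alphabetical ranks.
cat_codes = names.astype("category").cat.codes

# factorize with sort=True also numbers the sorted unique values.
sorted_codes, uniques = pd.factorize(names, sort=True)

print(list(uniques))       # the names in sorted order
print(cat_codes.tolist())  # codes in the same positions as `names`
```

If a stable, order-of-appearance numbering matters, the plain pd.factorize call from the other answer is the one to use.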