I have a CSV with a column like this, which I want to split into one column per distinct value:
LABEL
a
b
a
a
c
n o
ye s
How can I do this with pandas?
Answer 0: (score: 3)
Use get_dummies:
s.str.get_dummies().add_prefix('label_')
Out[19]:
   label_a  label_b  label_c  label_n o  label_ye s
0        1        0        0          0           0
1        0        1        0          0           0
2        1        0        0          0           0
3        1        0        0          0           0
4        0        0        1          0           0
5        0        0        0          1           0
6        0        0        0          0           1
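The answer above uses `s` without defining it. A minimal, self-contained sketch follows, assuming `s` is a Series holding the same values as the LABEL column in the question (the variable name `dummies` is only illustrative):

```python
import pandas as pd

# Assumed setup: the same values as the LABEL column in the question.
s = pd.Series(["a", "b", "a", "a", "c", "n o", "ye s"])

# Build one indicator column per distinct string, then prefix the column names.
# Note: Series.str.get_dummies splits each string on "|" by default (sep="|"),
# so a label that itself contains "|" would be spread over several columns.
dummies = s.str.get_dummies().add_prefix("label_")
print(dummies)
```

Because each row holds exactly one label here, every row of `dummies` contains a single 1.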
Answer 1: (score: 3)
Let's use pd.get_dummies with the prefix parameter:
# Using @Lambda's setup
import pandas as pd
label = ["a", "b", "a", "a", "c", "n o", "ye s"]
s = pd.Series(label)
pd.get_dummies(s, prefix='label')
Output:
   label_a  label_b  label_c  label_n o  label_ye s
0        1        0        0          0           0
1        0        1        0          0           0
2        1        0        0          0           0
3        1        0        0          0           0
4        0        0        1          0           0
5        0        0        0          1           0
6        0        0        0          0           1
Timings:
> %%timeit
> for key in keys:
>     df[("label_%s" % key).replace(" ", "_")] = (s == key).astype(int)
100 loops, best of 3: 6.7 ms per loop
> %timeit s.str.get_dummies().add_prefix('label_')
100 loops, best of 3: 6.03 ms per loop
With the prefix parameter:
> %timeit pd.get_dummies(s, prefix='label')
1000 loops, best of 3: 1.77 ms per loop
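Since the question starts from a CSV rather than a ready-made Series, here is a rough sketch of the same pd.get_dummies call applied to a column read from a file. The in-memory `csv_text` stands in for the real file, and the names `df`, `dummies`, and `result` are assumptions for illustration. Passing `dtype=int` is also an addition: recent pandas versions return boolean columns by default, so it is needed to get the 0/1 output shown above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file from the question.
csv_text = "LABEL\na\nb\na\na\nc\nn o\nye s\n"
df = pd.read_csv(io.StringIO(csv_text))

# One 0/1 column per distinct value of LABEL, named label_<value>.
dummies = pd.get_dummies(df["LABEL"], prefix="label", dtype=int)

# Keep the original column next to the indicator columns.
result = df.join(dummies)
print(result)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path.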
Answer 2: (score: 1)
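import pandas as pd
label = ["a", "b", "a", "a", "c", "n o", "ye s"]
s = pd.Series(label)
# Distinct labels, in order of first appearance
keys = s.unique()
df = pd.DataFrame()
for key in keys:
    # One 0/1 column per label; spaces in the column name become underscores
    df[("label_%s" % key).replace(" ", "_")] = (s == key).astype(int)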
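Unlike the get_dummies answers, this loop replaces spaces in the column names with underscores (label_n_o instead of "label_n o"). As a sketch, the same names can be obtained from pd.get_dummies by renaming the columns afterwards. The variable names `looped` and `renamed` are illustrative, and `dtype=int` is an assumption added to get integer columns on recent pandas versions.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "a", "c", "n o", "ye s"])

# The loop from the answer above.
looped = pd.DataFrame()
for key in s.unique():
    looped[("label_%s" % key).replace(" ", "_")] = (s == key).astype(int)

# get_dummies followed by a rename that swaps spaces for underscores.
renamed = pd.get_dummies(s, prefix="label", dtype=int).rename(
    columns=lambda name: name.replace(" ", "_")
)

# Compare the two results column by column and value by value.
same_columns = list(looped.columns) == list(renamed.columns)
same_values = (looped.to_numpy() == renamed.to_numpy()).all()
print(same_columns, same_values)
```

One caveat: the loop orders columns by first appearance, while get_dummies sorts them. The two orders happen to coincide for this data.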