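I have a dictionary whose keys are years and whose values are lists of the corresponding models. Below is a sample of the data printed from the dictionary.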
1975: ['MODEL9808533471'],
1985: ['MODEL0912768548'],
1980: ['MODEL1006230072', 'MODEL7898438988'],
1987: ['MODEL0848444339'],
1977: ['MODEL7889395724'],
1962: ['MODEL8686121468'],
1965: ['MODEL0911532520'],
2018: ['MODEL1712050002', 'MODEL1712050003', 'MODEL1712050004']
What I want is the following:
                 1962  1965  1975  1977  1980  1985  1987  2018
MODEL9808533471                 1
MODEL0912768548                                   1
MODEL1006230072                             1
MODEL7898438988                             1
MODEL0848444339                                         1
MODEL7889395724                       1
MODEL8686121468     1
MODEL0911532520           1
MODEL1712050002                                               1
MODEL1712050003                                               1
MODEL1712050004                                               1
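At first, I thought I would need to loop over every value in the dictionary and build the matrix by hand, then have pandas write it out to a CSV file.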
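For reference, the loop-based idea could be sketched roughly as follows, using only the standard library. This is a minimal illustration on a subset of the data, not a recommended solution: the names `years`, `models`, `rows` and `csv_text` are made up for the example, and an in-memory buffer stands in for a real CSV file.

```python
import csv
import io

# Subset of the dictionary from the question (year -> list of models).
d = {
    1975: ['MODEL9808533471'],
    1980: ['MODEL1006230072', 'MODEL7898438988'],
    1962: ['MODEL8686121468'],
}

years = sorted(d)

# Collect models in first-seen order, without duplicates.
models = list(dict.fromkeys(m for in_year in d.values() for m in in_year))

# One row per model, one column per year; 1 marks that the model belongs to that year.
rows = [[model] + [1 if model in d[year] else 0 for year in years]
        for model in models]

# Write the matrix as CSV (to a string buffer here; a file object works the same way).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['model'] + years)
writer.writerows(rows)
csv_text = buf.getvalue()
```

The lists may have different lengths because each cell is computed by a membership test rather than by position.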
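I could not find anything similar in the numpy package, even though it works well for matrices. I found this link on our forum, but there the lists all have the same length.
Do you know of any tool or facility (for example a pandas function, a numpy function, or something similar) that could help me?
Thanks!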
Answer 0: (score: 3)
This is a perfect use case for MultiLabelBinarizer from sklearn:
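import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

s = pd.Series(d)  # d is the dictionary from the question: index = years, values = lists of models
mlb = MultiLabelBinarizer()
# One indicator column per model, then transpose so models become rows and years become columns
yourdf = pd.DataFrame(mlb.fit_transform(s), columns=mlb.classes_, index=s.index).T
yourdf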
Out[121]:
1975 1985 1980 1987 1977 1962 1965 2018
MODEL0848444339 0 0 0 1 0 0 0 0
MODEL0911532520 0 0 0 0 0 0 1 0
MODEL0912768548 0 1 0 0 0 0 0 0
MODEL1006230072 0 0 1 0 0 0 0 0
MODEL1712050002 0 0 0 0 0 0 0 1
MODEL1712050003 0 0 0 0 0 0 0 1
MODEL1712050004 0 0 0 0 0 0 0 1
MODEL7889395724 0 0 0 0 1 0 0 0
MODEL7898438988 0 0 1 0 0 0 0 0
MODEL8686121468 0 0 0 0 0 1 0 0
MODEL9808533471 1 0 0 0 0 0 0 0
Or with get_dummies:
s.apply(','.join).str.get_dummies(',').T
Out[127]:
1975 1985 1980 1987 1977 1962 1965 2018
MODEL0848444339 0 0 0 1 0 0 0 0
MODEL0911532520 0 0 0 0 0 0 1 0
MODEL0912768548 0 1 0 0 0 0 0 0
MODEL1006230072 0 0 1 0 0 0 0 0
MODEL1712050002 0 0 0 0 0 0 0 1
MODEL1712050003 0 0 0 0 0 0 0 1
MODEL1712050004 0 0 0 0 0 0 0 1
MODEL7889395724 0 0 0 0 1 0 0 0
MODEL7898438988 0 0 1 0 0 0 0 0
MODEL8686121468 0 0 0 0 0 1 0 0
MODEL9808533471 1 0 0 0 0 0 0 0
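The outputs above keep the years in dictionary order, whereas the question asks for sorted years and a CSV file. A minimal sketch of those two extra steps, assuming the get_dummies variant and a subset of the data; the names `table` and `csv_text` are illustrative, and an in-memory buffer is used where a file path could be passed to `to_csv` instead:

```python
import io

import pandas as pd

# Subset of the dictionary from the question (year -> list of models).
d = {
    1975: ['MODEL9808533471'],
    1980: ['MODEL1006230072', 'MODEL7898438988'],
    1962: ['MODEL8686121468'],
}

s = pd.Series(d)

# Join each list into one string, expand it into indicator columns,
# then transpose so that models are rows and years are columns.
table = s.apply(','.join).str.get_dummies(',').T

# Put the year columns in ascending order.
table = table.sort_index(axis=1)

# Write the result as CSV (to a string buffer here).
buf = io.StringIO()
table.to_csv(buf)
csv_text = buf.getvalue()
```

Note that this variant assumes no model name contains the separator character `,`.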
Answer 1: (score: 1)
Assuming d is your dictionary:
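# One row per year, padded with None where a year has fewer models; stack() flattens it to long form
df = pd.DataFrame(list(d.values()), index=list(d.keys())).stack().reset_index(level=0)
df.columns = ['year', 'col']
# Count how often each (model, year) pair occurs
pd.crosstab(df['col'], df['year'])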
year 1962 1965 1975 1977 1980 1985 1987 2018
col
MODEL0848444339 0 0 0 0 0 0 1 0
MODEL0911532520 0 1 0 0 0 0 0 0
MODEL0912768548 0 0 0 0 0 1 0 0
MODEL1006230072 0 0 0 0 1 0 0 0
MODEL1712050002 0 0 0 0 0 0 0 1
MODEL1712050003 0 0 0 0 0 0 0 1
MODEL1712050004 0 0 0 0 0 0 0 1
MODEL7889395724 0 0 0 1 0 0 0 0
MODEL7898438988 0 0 0 0 1 0 0 0
MODEL8686121468 1 0 0 0 0 0 0 0
MODEL9808533471 0 0 1 0 0 0 0 0
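A possible variation on the same crosstab idea, assuming pandas 0.25 or later, is to use `Series.explode` to get one row per (year, model) pair directly instead of building a padded DataFrame first. This is only a sketch on a subset of the data; the names `long` and `table` and the column labels `year` and `model` are chosen here for illustration:

```python
import pandas as pd

# Subset of the dictionary from the question (year -> list of models).
d = {
    1975: ['MODEL9808533471'],
    1980: ['MODEL1006230072', 'MODEL7898438988'],
    1962: ['MODEL8686121468'],
}

# Series of lists indexed by year; explode() yields one row per list element.
s = pd.Series(d, name='model').rename_axis('year')
long = s.explode().reset_index()

# Count occurrences of each (model, year) pair; crosstab sorts both axes.
table = pd.crosstab(long['model'], long['year'])
```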