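I want to turn a regular DataFrame into a nested DataFrame, and then eventually convert that nested DataFrame into a dictionary.
After cleaning the dataset in Pandas, it looks like this as a DataFrame:
Input: df.head(5)
Output: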
   reviewerName  title                             reviewerRatings
0  Charles       Harry Potter Book Seven News:...  3.0
1  Katherine     Harry Potter Boxed Set, Books...  5.0
2  Lora          Harry Potter and the Sorcerer...  5.0
3  Cait          Harry Potter and the Half-Blo...  5.0
4  Diane         Harry Potter and the Order of...  5.0
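Next, I checked the number of unique reviewer names in the dataset:
Input: len(df['reviewerName'].unique())
Output: 66130
Now I am trying to find a way to take all 66130 unique reviewerName values and assign each one as a key in a new nested DataFrame, with "title" and "reviewerRatings" as another layer of key:value pairs inside it.
When I checked how many rows the first unique value appears in, I got:
Input: df[df['reviewerName'] == 'Charles G']
Output: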
       reviewerName  title                             reviewerRatings
0      Charles       Harry Potter Book Seven News:...  3.0
19156  Charles       Harry Potter and the Half-Blo...  3.5
19156  Charles       Harry Potter and the Order of...  4.0
I want to manipulate the DataFrame so that the output looks like this:
           title                             reviewerRatings
Charles    Harry Potter Book Seven News:...  3.0
           Harry Potter and the Half-Blo...  3.5
           Harry Potter and the Order of...  4.0
Katherine  Harry Potter Boxed Set, Books...  5.0
           Harry Potter and the Half-Blo...  2.5
           Harry Potter and the Order of...  5.0
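For reference, a layout close to the one above can be approximated with a two-level index. This is only a minimal sketch on toy data: the rows and the short titles ("Book A" and so on) are made up for illustration, and using `title` as the second index level (rather than keeping it as a column) is an assumption about what "nested" should mean here.

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are illustrative only).
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Katherine"],
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "reviewerRatings": [3.0, 5.0, 3.5, 2.5],
})

# Move reviewerName and title into a two-level index, then sort so that
# each reviewer's rows are grouped together.
nested = df.set_index(["reviewerName", "title"]).sort_index()

# Selecting on the outer level returns that reviewer's rows, indexed by title.
charles = nested.loc["Charles"]
```

When printed, pandas shows each outer-level label once per group, which gives the grouped look sketched in the desired output.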
I tried splitting the three columns (reviewerName, title, reviewerRatings) apart and then concatenating them back together, but had no luck, as shown below:
Input:
p1 = df[['reviewerName']]
p2 = df[['title']]
p3 = df[['reviewerRatings']]
concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
concatenated
Output:
AttributeError Traceback (most recent call last)
<ipython-input-106-5a6be8c1a3ba> in <module>()
----> 1 concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
2 concatenated
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
4370 if self._info_axis._can_hold_identifiers_and_holds_name(name):
4371 return self[name]
-> 4372 return object.__getattribute__(self, name)
4373
4374 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'unqiue'
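A note on what the traceback seems to point at: the attribute is spelled `unqiue`, and even with the correct spelling, `unique` is a method of a Series, not of a DataFrame, while `df[['reviewerName']]` (double brackets) returns a DataFrame. In addition, `list[...]` subscripts the `list` type instead of calling it. The sketch below shows the difference on toy data; the variable names are illustrative only.

```python
import pandas as pd

# Toy data standing in for the real dataset.
df = pd.DataFrame({"reviewerName": ["Charles", "Katherine", "Charles"]})

p1_frame = df[["reviewerName"]]   # double brackets -> DataFrame
p1_series = df["reviewerName"]    # single brackets -> Series

# A DataFrame has no `unique` attribute, so this is False.
frame_has_unique = hasattr(p1_frame, "unique")

# On a Series, unique() returns the distinct values in order of appearance;
# list(...) with parentheses turns them into a plain Python list.
names = list(p1_series.unique())
```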
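I have also searched the Pandas documentation without any luck, so I am hoping someone here can look into this problem.
Once I have the desired output, I want to convert the nested DataFrame into a nested dictionary.
Thanks!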
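To make the target concrete, here is a minimal sketch of one possible nested dictionary, built directly from the flat DataFrame with `groupby`. The shape `{reviewerName: {title: rating}}` is an assumption about the desired result, the data is made up, and the sketch assumes each reviewer rates a given title at most once (a repeated title would overwrite the earlier rating).

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are illustrative only).
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Katherine"],
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "reviewerRatings": [3.0, 5.0, 3.5, 2.5],
})

# One outer key per reviewer; the inner dict maps title -> rating.
nested_dict = {
    name: dict(zip(group["title"], group["reviewerRatings"]))
    for name, group in df.groupby("reviewerName")
}
```

With 66130 reviewers this produces 66130 outer keys, each holding only that reviewer's titles.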