Filtering a regular DataFrame into a nested DataFrame

Time: 2019-01-16 02:20:46

Tags: python pandas dictionary dataframe nested

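I would like to turn a regular DataFrame into a nested DataFrame, and then eventually convert the nested DataFrame back into a dictionary.

After cleaning the dataset in Pandas, it looks like this as a DataFrame:

Input: df.head(5)

Output: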

    reviewerName    title                               reviewerRatings
0   Charles         Harry Potter Book Seven News:...    3.0
1   Katherine       Harry Potter Boxed Set, Books...    5.0
2   Lora            Harry Potter and the Sorcerer...    5.0
3   Cait            Harry Potter and the Half-Blo...    5.0
4   Diane           Harry Potter and the Order of...    5.0

Next, I checked the number of unique reviewer names in the dataset:

Input: len(df['reviewerName'].unique())

Output: 66130
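
For reference, the same count can also be obtained with `nunique()`. The sketch below uses a few invented rows as a stand-in for the real dataset (the titles and ratings are placeholders, not the actual data):

```python
import pandas as pd

# Toy stand-in for the cleaned dataset; these rows are invented for illustration.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Lora"],
    "title": ["Book A", "Book B", "Book C", "Book A"],
    "reviewerRatings": [3.0, 5.0, 3.5, 5.0],
})

# unique() returns an array of the distinct values; len() counts them.
n_via_unique = len(df["reviewerName"].unique())

# nunique() returns the count directly (missing values are excluded by default).
n_via_nunique = df["reviewerName"].nunique()

print(n_via_unique, n_via_nunique)
```

The two numbers can differ by one if the column contains missing values, because `unique()` keeps NaN as a value while `nunique()` drops it by default.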

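Now I am trying to find a way to take all 66130 unique reviewerName values and assign each one as a key in a new nested DataFrame, with "title" and "reviewerRatings" forming a second key:value layer inside it.

When I checked how many rows the first unique value appears in, I got:

Input: df[df['reviewerName'] == 'Charles G']

Output: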

      reviewerName                               title   reviewerRatings
0          Charles    Harry Potter Book Seven News:...               3.0
19156      Charles    Harry Potter and the Half-Blo...               3.5
19156      Charles    Harry Potter and the Order of...               4.0

I would like to manipulate the DataFrame so that the output looks like this:

           title                                reviewerRatings
Charles    Harry Potter Book Seven News:...     3.0
           Harry Potter and the Half-Blo...     3.5
           Harry Potter and the Order of...     4.0
Katherine  Harry Potter Boxed Set, Books...     5.0
           Harry Potter and the Half-Blo...     2.5
           Harry Potter and the Order of...     5.0
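
One possible way to get a layout like this, sketched on invented toy rows rather than the real dataset, is to move reviewerName (and optionally title) into a MultiIndex and sort it. pandas then prints each reviewer name once, with that reviewer's rows listed underneath. Note that this sketch puts title into the index instead of keeping it as a column, which is an assumption about what "nested" should mean here:

```python
import pandas as pd

# Toy stand-in for the cleaned dataset; these rows are invented for illustration.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Lora"],
    "title": ["Book A", "Book B", "Book C", "Book A"],
    "reviewerRatings": [3.0, 5.0, 3.5, 5.0],
})

# Use (reviewerName, title) as a two-level index and sort it so that
# all rows belonging to the same reviewer sit next to each other.
nested = df.set_index(["reviewerName", "title"]).sort_index()

print(nested)

# All rows for one reviewer can then be selected by the outer key.
charles_rows = nested.loc["Charles"]
print(charles_rows)
```

If title should stay a regular column, `df.set_index("reviewerName").sort_index()` gives a similar grouping, although the repeated reviewer names are then printed on every row.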

I tried splitting out the three columns (reviewerName, title, reviewerRatings) and then concatenating them back together, but had no luck, as shown below:

Input:

p1 = df[['reviewerName']]
p2 = df[['title']]
p3 = df[['reviewerRatings']]
concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
concatenated

Output:

AttributeError                            Traceback (most recent call last)
<ipython-input-106-5a6be8c1a3ba> in <module>()
----> 1 concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
      2 concatenated

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'unqiue'
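
Reading the traceback, there seem to be several separate problems in that line. `unqiue` is a misspelling of `unique`. Beyond that, `df[['reviewerName']]` with double brackets returns a DataFrame, and `unique()` is a Series method, so even the correctly spelled name would fail on `p1`. Finally, `list[...]` subscripts `list` instead of calling it. A minimal sketch of the difference, on invented toy rows:

```python
import pandas as pd

# Toy stand-in for the cleaned dataset; these rows are invented for illustration.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Lora"],
    "title": ["Book A", "Book B", "Book C", "Book A"],
    "reviewerRatings": [3.0, 5.0, 3.5, 5.0],
})

p1_frame = df[["reviewerName"]]   # double brackets: a one-column DataFrame
p1_series = df["reviewerName"]    # single brackets: a Series

# A DataFrame has no unique() method, so this is False.
has_unique_on_frame = hasattr(p1_frame, "unique")

# On the Series, unique() works; list(...) is called with parentheses.
names = list(p1_series.unique())

print(type(p1_frame).__name__, type(p1_series).__name__)
print(has_unique_on_frame, names)
```

Even with those fixes, the `keys` argument of `pd.concat` labels each object being concatenated (here the three column frames), not each row, so passing the unique names as keys would probably still not produce the per-reviewer layout.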

I also had no luck searching the Pandas documentation, and was wondering whether anyone here could take a look at this problem.

Once the desired output is working, I would like to convert the nested DataFrame into a nested dictionary.
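
For the dictionary step, one approach that might work is to group by reviewerName and build an inner title-to-rating dictionary per group. This is only a sketch on invented toy rows, and it assumes each reviewer rates a given title at most once (a repeated title would overwrite the earlier rating):

```python
import pandas as pd

# Toy stand-in for the cleaned dataset; these rows are invented for illustration.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Lora"],
    "title": ["Book A", "Book B", "Book C", "Book A"],
    "reviewerRatings": [3.0, 5.0, 3.5, 5.0],
})

# Outer key: reviewer name. Inner dictionary: title -> rating.
nested_dict = {
    name: dict(zip(group["title"], group["reviewerRatings"]))
    for name, group in df.groupby("reviewerName")
}

print(nested_dict)
```

With 66130 reviewers the comprehension loops once per reviewer, which may be slow on the full dataset but keeps the structure easy to inspect.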

Thanks!

0 answers:

No answers yet