Why must DataFrame columns be unique for JSON conversion?

Time: 2017-10-15 14:14:13

Tags: python pandas

After merging two DataFrames, I have this DataFrame:

ip                 accountname      name      gsm
192.168.1.1        aaaa             john doe  850
192.168.1.2        bbbb             jane doe  860

I want to convert the DataFrame to JSON:

json_df = df3.set_index('ip').T.to_json()

I get:

ValueError: DataFrame columns must be unique for orient='columns'.

The IPs are unique in the DataFrame, and the index column is unique. How can I avoid this error? Any suggestions would be appreciated.

2 answers:

Answer 0: (score: 0)

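If df3.set_index('ip').index.is_unique is False, you have duplicate IPs, probably as a result of the earlier merge. Because of the .T, those IPs become the column labels, which is why the error message mentions columns and not the index.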
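A minimal sketch of that check. The frame below is a made-up stand-in for df3; the repeated 192.168.1.1 row is an assumption about what the merge might have produced, not data from the question.

```python
import pandas as pd

# Hypothetical stand-in for df3: the merge left 192.168.1.1 in twice.
df3 = pd.DataFrame({
    "ip": ["192.168.1.1", "192.168.1.1", "192.168.1.2"],
    "accountname": ["aaaa", "aaaa", "bbbb"],
    "name": ["john doe", "john doe", "jane doe"],
    "gsm": [850, 851, 860],
})

indexed = df3.set_index("ip")
index_is_unique = indexed.index.is_unique  # False for this frame

# After .T the ip values are the column labels, so the duplicate
# shows up as a non-unique column.
transposed_columns_unique = indexed.T.columns.is_unique

# Rows whose ip occurs more than once, to see what the merge duplicated.
duplicate_rows = df3[df3["ip"].duplicated(keep=False)]
print(duplicate_rows)

# The same call as in the question; it should raise on this frame.
error_message = None
try:
    indexed.T.to_json()
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If duplicate_rows is empty for your real frame, the IPs are not the cause and it is worth looking at the column names of the frame before the transpose.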

If you don't care about the duplicates, you can use to_json(orient='records').
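A sketch of that option, assuming the call is made on the frame itself, without set_index('ip').T, so that repeated IPs stay in the rows instead of becoming column labels. The frame is again made up for illustration.

```python
import json

import pandas as pd

# Hypothetical stand-in for df3 with a repeated ip.
df3 = pd.DataFrame({
    "ip": ["192.168.1.1", "192.168.1.1", "192.168.1.2"],
    "accountname": ["aaaa", "aaaa", "bbbb"],
})

# One JSON object per row. Repeated ip values are ordinary cell
# values here, so they do not affect label uniqueness.
json_df = df3.to_json(orient="records")
records = json.loads(json_df)
print(records)
```

The output is a list of objects rather than a mapping keyed by ip, so code that consumes the JSON has to look records up by their "ip" field.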

Answer 1: (score: 0)

I was confused by the same thing. When I checked

df.index.is_unique

the result was True.

However, after checking my column names:

for col in df.columns: 
    print(col)

I got

Area
Month
2019
2020
2019
2020

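So I had two columns named "2019" and two columns named "2020".

Put differently: the column names were not unique (which is what the error message says, to the letter...).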
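Instead of printing every column and reading the list by eye, pandas can report the repeated labels directly. A small sketch, with a made-up one-row frame that only reuses the column labels listed above:

```python
import pandas as pd

# Hypothetical frame; only the column labels come from the listing above.
df = pd.DataFrame(
    [["North", 1, 10.0, 12.0, 3600, 4000]],
    columns=["Area", "Month", "2019", "2020", "2019", "2020"],
)

columns_unique = df.columns.is_unique  # False for this frame

# duplicated() flags the second and later occurrence of each label.
duplicated_labels = df.columns[df.columns.duplicated()].unique().tolist()
print(duplicated_labels)
```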

In my case, this was caused by the following query:

select Area, 
       parking_month as Month, 
       SUM(CASE WHEN parking_year = 2019 THEN amount ELSE 0 END) as "2019",
       SUM(CASE WHEN parking_year = 2020 THEN amount ELSE 0 END) as "2020",
       SUM(CASE WHEN parking_year = 2019 THEN seconds ELSE 0 END) as "2019",
       SUM(CASE WHEN parking_year = 2020 THEN seconds ELSE 0 END) as "2020"
from some_parking_data
group by Area, parking_month

Changing the query to:

select Area, 
       parking_month as Month, 
       SUM(CASE WHEN parking_year = 2019 THEN amount ELSE 0 END) as "A2019",
       SUM(CASE WHEN parking_year = 2020 THEN amount ELSE 0 END) as "A2020",
       SUM(CASE WHEN parking_year = 2019 THEN seconds ELSE 0 END) as "T2019",
       SUM(CASE WHEN parking_year = 2020 THEN seconds ELSE 0 END) as "T2020"
from some_parking_data
group by Area, parking_month

solved the problem.
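If the query cannot be changed, the labels can probably be fixed on the pandas side instead. This is a sketch under the assumption that the column order is known and stable; the frame and its values are made up, and the new names simply mirror the aliases used in the query.

```python
import json

import pandas as pd

# Hypothetical frame as it might arrive with repeated labels.
df = pd.DataFrame(
    [["North", 1, 10.0, 12.0, 3600, 4000]],
    columns=["Area", "Month", "2019", "2020", "2019", "2020"],
)

# Assign all labels by position. A rename() mapping keyed on "2019"
# would hit both columns with that label, so it cannot tell them apart.
df.columns = ["Area", "Month", "A2019", "A2020", "T2019", "T2020"]

json_text = df.to_json(orient="records")
first_record = json.loads(json_text)[0]
print(first_record)
```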