pandas DataFrame: print each index value only once

Date: 2018-04-13 09:28:50

Tags: python pandas dataframe

import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df.set_index("employee_id",inplace=True)
print(df)

This gives:

            project_handled
employee_id                
1                       pas
1                      asap
2                     trimm
2                       fat

What I want is for repeated index values not to be printed again:

            project_handled
employee_id                
1                       pas
                       asap
2                     trimm
                        fat

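I want to serialize this and share it as an Excel file using the DataFrame.to_excel API. The requirement is that index values must not be repeated in the employee_id column.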

2 answers:

Answer 0 (score: 1)

You need to set a MultiIndex:

import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df['Something'] = 1
df.set_index(["employee_id", "project_handled"],inplace=True)
print(df)

I added the Something column because otherwise you would get:

Empty DataFrame
Columns: []
Index: [(1, pas), (1, asap), (2, trimm), (2, fat)]
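The reason the MultiIndex helps is that pandas leaves repeated outer-level labels blank when it renders a hierarchical index, while the stored index still holds every value. A minimal sketch of that behaviour, assuming the same data as above (the names records, sparse_text, dense_text and stored_ids are illustrative, not from the original answer):

```python
import pandas as pd

# Same rows as in the question; "records" is an illustrative name.
records = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(records)
df["Something"] = 1  # placeholder column, as in the answer
df = df.set_index(["employee_id", "project_handled"])

# Default rendering: repeated outer-level labels are left blank.
sparse_text = df.to_string()
# sparsify=False prints the outer-level label on every row.
dense_text = df.to_string(sparsify=False)

# The stored index is identical in both cases; only the rendering differs.
stored_ids = df.index.get_level_values("employee_id").tolist()

print(sparse_text)
print(dense_text)
print(stored_ids)
```

In other words, the blank cells are a display choice and no index value is actually removed from the frame.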

Edit

To build it without putting project_handled in the index, you need an empty column and a MultiIndex:

df["another"] = ""
df.set_index(["employee_id", "another"],inplace=True)
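The two lines above assume employee_id is still a regular column, so they need to run on a freshly built frame. A self-contained sketch of that variant, under the assumption that the helper column is called "another" as in the answer (records and rendered are illustrative names):

```python
import pandas as pd

# Same rows as in the question; "records" is an illustrative name.
records = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(records)

# Empty helper column that only serves as a second index level.
df["another"] = ""
df = df.set_index(["employee_id", "another"])

# project_handled stays a normal data column; employee_id is the outer level.
rendered = df.to_string()
print(rendered)
```

If the frame is later written with DataFrame.to_excel, the merge_cells parameter (True by default) controls whether MultiIndex rows are written as merged cells; that step is not exercised in this sketch.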

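Answer 1 (score: 0)

If your only goal is to write the DataFrame to CSV in the desired layout, and you do not need each employee_id value to occupy a single cell, you can do the following: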

import pandas as pd

li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)

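df['employee_id'] = df['employee_id'].astype(str)

# Blank employee_id on every row after the first one in each group.
# A boolean mask with .loc avoids the chained assignment
# x['employee_id'].iloc[i] = '', which may not modify the data.
repeated = df.groupby('employee_id').cumcount() > 0
df.loc[repeated, 'employee_id'] = ''
df = df.set_index('employee_id')
print(df)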

Output:

            project_handled
employee_id
1                       pas
                       asap
2                     trimm
                        fat

The result of df.to_csv('test.csv') was shown in a screenshot that is not reproduced here.
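The CSV text can also be inspected directly without opening the file. A minimal sketch, assuming the same data and using duplicated() as a stand-in for the per-group blanking (records and csv_text are illustrative names, not part of the original answer):

```python
import pandas as pd

# Same rows as in the question; "records" is an illustrative name.
records = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(records)
df["employee_id"] = df["employee_id"].astype(str)

# Blank every repeat of an employee_id, keeping the first occurrence.
df.loc[df["employee_id"].duplicated(), "employee_id"] = ""
df = df.set_index("employee_id")

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv()
print(csv_text)
```

Note that this approach replaces the repeated values with empty strings, so unlike the MultiIndex approach the original employee_id of those rows is no longer stored in the frame.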