Pandas DataFrame: I want to merge two cells with the same value into one

Time: 2020-04-14 08:56:46

Tags: python pandas dataframe

My Excel sheet looks like this: [image]

I need the output in the following format:

[image]

This is the code I use to generate the Excel sheet:

[
    {
        "id": "5e9ec90d9e2ba22996168d78",
        "description": "Notificaciones a Arcgis",
        "expires": "2040-01-01T14:00:00.00Z",
        "status": "active",
        "subject": {
            "entities": [
                {
                    "id": "id1",
                    "type": "aqss"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 521026,
            "lastNotification": "2020-05-21T10:52:53.00Z",
            "attrs": [],
            "attrsFormat": "normalized",
            "http": {
                "url": "http://test.eu/notify/"
            },
            "metadata": [
                "dateCreated",
                "dateModified"
            ],
            "lastFailure": "2020-05-20T16:46:58.00Z",
            "lastFailureReason": "Timeout was reached",
            "lastSuccess": "2020-05-21T10:52:53.00Z",
            "lastSuccessCode": 500
        }
    },
    {
        "id": "5e9ec96f9e2ba22996168d79",
        "description": "Notificaciones a Arcgis",
        "expires": "2040-01-01T14:00:00.00Z",
        "status": "active",
        "subject": {
            "entities": [
                {
                    "id": "id2",
                    "type": "aqss"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 526757,
            "lastNotification": "2020-05-21T10:52:50.00Z",
            "attrs": [],
            "attrsFormat": "normalized",
            "http": {
                "url": "http://test.eu/notify/"
            },
            "metadata": [
                "dateCreated",
                "dateModified"
            ],
            "lastFailure": "2020-05-21T09:32:34.00Z",
            "lastFailureReason": "Timeout was reached",
            "lastSuccess": "2020-05-21T10:52:50.00Z",
            "lastSuccessCode": 500
        }
    },
    {
        "id": "5e9ec9899e2ba22996168d7a",
        "description": "Notificaciones a Arcgis",
        "expires": "2040-01-01T14:00:00.00Z",
        "status": "active",
        "subject": {
            "entities": [
                {
                    "id": "id3",
                    "type": "aqss"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 541814,
            "lastNotification": "2020-05-21T10:52:47.00Z",
            "attrs": [],
            "attrsFormat": "normalized",
            "http": {
                "url": "http://test.eu/notify/"
            },
            "metadata": [
                "dateCreated",
                "dateModified"
            ],
            "lastFailure": "2020-05-21T01:49:52.00Z",
            "lastFailureReason": "Timeout was reached",
            "lastSuccess": "2020-05-21T10:52:48.00Z",
            "lastSuccessCode": 500
        }
    },
    {
        "id": "5e9ec9a69e2ba22996168d7b",
        "description": "Notificaciones a Arcgis",
        "expires": "2040-01-01T14:00:00.00Z",
        "status": "active",
        "subject": {
            "entities": [
                {
                    "id": "id3",
                    "type": "aqss"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 470859,
            "lastNotification": "2020-05-21T10:52:47.00Z",
            "attrs": [],
            "attrsFormat": "normalized",
            "http": {
                "url": "http://test.eu/notify/"
            },
            "metadata": [
                "dateCreated",
                "dateModified"
            ],
            "lastFailure": "2020-05-20T21:01:32.00Z",
            "lastFailureReason": "Timeout was reached",
            "lastSuccess": "2020-05-21T10:52:47.00Z",
            "lastSuccessCode": 500
        }
    },
    {
        "id": "5e9ec9c09e2ba22996168d7c",
        "description": "Notificaciones a Arcgis",
        "expires": "2040-01-01T14:00:00.00Z",
        "status": "active",
        "subject": {
            "entities": [
                {
                    "id": "id4",
                    "type": "aqss"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 532901,
            "lastNotification": "2020-05-21T10:52:44.00Z",
            "attrs": [],
            "attrsFormat": "normalized",
            "http": {
                "url": "http://test.eu/notify/"
            },
            "metadata": [
                "dateCreated",
                "dateModified"
            ],
            "lastFailure": "2020-05-21T08:33:10.00Z",
            "lastFailureReason": "Timeout was reached",
            "lastSuccess": "2020-05-21T10:52:45.00Z",
            "lastSuccessCode": 500
        }
    },
    {
        "id": "5e9ec9ff9e2ba22996168d7d",
        "description": "Notificaciones a Arcgis",
        "expires": "2040-01-01T14:00:00.00Z",
        "status": "active",
        "subject": {
            "entities": [
                {
                    "id": "id5",
                    "type": "aqss"
                }
            ],
            "condition": {
                "attrs": []
            }
        },
        "notification": {
            "timesSent": 520974,
            "lastNotification": "2020-05-21T10:52:53.00Z",
            "attrs": [],
            "attrsFormat": "normalized",
            "http": {
                "url": "http://test.eu/notify/"
            },
            "metadata": [
                "dateCreated",
                "dateModified"
            ],
            "lastFailure": "2020-05-21T05:30:58.00Z",
            "lastFailureReason": "Timeout was reached",
            "lastSuccess": "2020-05-21T10:52:53.00Z",
            "lastSuccessCode": 500
        }
    }
]

Please help me.

Thanks.

1 answer:

Answer 0 (score: 2)

Here is a working example that puts an empty string in the duplicated cells of the Data column:

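import pandas as pd

df = pd.DataFrame({'Data': [10, 10, 20, 30, 20, 15, 30, 45], 'Value': [1, 2, 3, 4, 5, 6, 7, 8]})
# Flag every occurrence of a Data value except its last one
duplicated_data = df['Data'].duplicated(keep='last')
# Replace the flagged cells with an empty string; all other cells keep their value
df['Data'] = df['Data'].where(~duplicated_data, '')
df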
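
Note that `duplicated(keep='last')` flags repeats anywhere in the column, not only in neighbouring rows. If the intent is to blank a cell only when it repeats the cell directly above it (an assumption about the desired layout, not something stated in the question), a minimal sketch could compare each value with the previous row via `shift()`. The names `global_flags`, `adjacent_flags`, and the two result variables are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "Data": [10, 10, 20, 30, 20, 15, 30, 45],
    "Value": [1, 2, 3, 4, 5, 6, 7, 8],
})

# Approach from the answer: flag every occurrence except the last one,
# regardless of where in the column the repeats appear.
global_flags = df["Data"].duplicated(keep="last")
blank_global = df["Data"].where(~global_flags, "")
print(blank_global.tolist())    # ['', 10, '', '', 20, 15, 30, 45]

# Assumed variant: flag a cell only if it equals the cell directly above it,
# so the first cell of each run of equal values keeps its value.
adjacent_flags = df["Data"].eq(df["Data"].shift())
blank_adjacent = df["Data"].where(~adjacent_flags, "")
print(blank_adjacent.tolist())  # [10, '', 20, 30, 20, 15, 30, 45]
```

With the sample data, only the second `10` is blanked by the adjacent-row variant, because it is the only value that immediately follows an equal one.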

Does this work for your case?