Pandas: How do I add a level to a MultiIndex by splitting one of its levels?

Time: 2019-08-11 10:48:46

Tags: python pandas

How can I create a new level by splitting the second level on ' | '?

Initial index:

MultiIndex(levels=[['A', 'B', 'C', 'D'], ['a|a_unit', 'b|b_unit', 'c|c_unit']],
           codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])

Desired output: a three-level index in which each level-1 value such as 'a|a_unit' becomes the two levels 'a' and 'a_unit', e.g. (A, a, a_unit).

What I tried:

# plan was to create a new column and use set_index
df.columns.to_frame().iloc[:,1].str.split('|')

Edit: the reason my approach did not work:

Originally, to keep this example simple, I removed the * characters that surround the ' | ' separator in the level-1 values of the index (the real values look like 'a*|*a_unit'). Without the * everything works fine, but with the * in place the split raises an error.

Coming up with a proper test case can be really tricky sometimes.

3 Answers:

Answer 0: (score: 1)

You can try:

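# split each level-1 column label on '|' into [name, unit]
s = df.columns.to_frame().iloc[:, 1].str.split('|')
# rebuild the frame on level 0 only, then append name and unit as extra column levels
final = (pd.DataFrame(data=df.values, columns=df.columns.get_level_values(0))
           .T.set_index([s.str[0], s.str[1]], append=True).T)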

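As a self-contained variation on the same idea, here is a minimal sketch that rebuilds only the column index and leaves the data and row index untouched. The sample data is made up, and using Index.str.split with expand=True is a choice made for this sketch rather than something taken from the answer above.

```python
import numpy as np
import pandas as pd

# Sample frame with the question's two-level column index (values are made up)
columns = pd.MultiIndex.from_product(
    [["A", "B", "C", "D"], ["a|a_unit", "b|b_unit", "c|c_unit"]]
)
df = pd.DataFrame(np.arange(24).reshape(2, 12), columns=columns)

level_0 = df.columns.get_level_values(0)
# On an Index, expand=True returns a two-level MultiIndex: (name, unit)
name_unit = df.columns.get_level_values(1).str.split("|", expand=True)

# Work on a copy so the original frame keeps its two-level columns
result = df.copy()
result.columns = pd.MultiIndex.from_arrays(
    [level_0, name_unit.get_level_values(0), name_unit.get_level_values(1)]
)
print(result.columns.tolist()[:3])
```

Because only result.columns is reassigned, any existing row index is kept as it is.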

Answer 1: (score: 1)

anky_91's answer is very compact. Here is another solution, which also works with this index:

MultiIndex(levels=[['A', 'B', 'C', 'D'], ['a*|*a_unit', 'b*|*b_unit', 'c*|*c_unit']],
       codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])

    # split each level-1 label on the literal separator '*|*' (plain str.split, no regex)
    _split = [item.split('*|*') for item in df.columns.get_level_values(1)]
    _level_0 = df.columns.get_level_values(0).tolist()

    # build (level 0, name, unit) tuples and assign them as the new three-level column index
    idx_list = [(lvl_0, name, unit) for lvl_0, (name, unit) in zip(_level_0, _split)]
    df.columns = pd.MultiIndex.from_tuples(idx_list)

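For simplicity I had removed the * from the question, but in doing so I also removed the very reason my original approach (see anky_91's answer), df.columns.to_frame().iloc[:,1].str.split('|'), did not work.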
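One plausible explanation for the failure (an assumption, since the error text is not shown here) is that Series.str.split treats a multi-character pattern as a regular expression by default, and '*|*' is not a valid one because each * has nothing to repeat. A minimal sketch of a workaround under that assumption is to escape the separator with re.escape so it is matched literally:

```python
import re

import pandas as pd

# Level-1 labels in the form used by the real data, with '*|*' as separator
labels = pd.Series(["a*|*a_unit", "b*|*b_unit", "c*|*c_unit"])

# re.escape turns '*|*' into a pattern that matches those three characters literally
parts = labels.str.split(re.escape("*|*"), expand=True)
print(parts)
```

Recent pandas versions also accept regex=False in str.split, which should have the same effect, but the escaped pattern does not depend on that parameter being available.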

Answer 2: (score: 1)

Another way is to use index.get_level_values to access the levels and split them into three indexes:

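# split each level-1 label once, then pick out the name and unit parts
parts = [idx.split('|') for idx in df.index.get_level_values(1)]
idx1 = [p[0] for p in parts]
idx2 = [p[1] for p in parts]
# a list of equal-length arrays is converted into a three-level MultiIndex
df.index = [df.index.get_level_values(0), idx1, idx2]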

Output

Empty DataFrame
Columns: []
Index: [(A, a, a_unit), (A, b, b_unit), (A, c, c_unit), (B, a, a_unit), (B, b, b_unit), (B, c, c_unit), (C, a, a_unit), (C, b, b_unit), (C, c, c_unit), (D, a, a_unit), (D, b, b_unit), (D, c, c_unit)]
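To reproduce this from scratch, the following is a minimal sketch that builds the question's index as a row index on an empty frame and constructs the new MultiIndex explicitly. The level names "group", "name" and "unit" are hypothetical and only added for illustration.

```python
import pandas as pd

# Empty frame whose row index matches the question's two-level index
index = pd.MultiIndex.from_product(
    [["A", "B", "C", "D"], ["a|a_unit", "b|b_unit", "c|c_unit"]]
)
df = pd.DataFrame(index=index)

# Split each level-1 label into [name, unit]
parts = [label.split("|") for label in df.index.get_level_values(1)]

# Build the three-level index explicitly instead of assigning a list of arrays
df.index = pd.MultiIndex.from_arrays(
    [df.index.get_level_values(0), [p[0] for p in parts], [p[1] for p in parts]],
    names=["group", "name", "unit"],  # hypothetical level names
)
print(df.index.tolist()[:3])
```

Naming the levels makes later selections such as df.xs("a_unit", level="unit") easier to read than positional level numbers.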