Question

Is there a way to repeat the header (column titles) of a pandas DataFrame every nth row in a JupyterLab notebook?

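I usually set pd.set_option('display.max_columns', None) for pandas in JupyterLab notebooks because I want to see all columns and data. However, when more than 10 rows are displayed, the scrollbar at the bottom is out of view while I am looking at the header, and the header is out of view while I am looking at the scrollbar. This makes it very hard to know what I am looking at or where I am scrolling to.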

Is there a way to repeat the header every nth row, or to have a vertical scrollbar with the header always visible?

Answer 1

A simple print() can do what you need. You can try this code.

import numpy as np
import pandas as pd

# generate a dataframe with many rows, say, 40.
randn = np.random.randn
df1 = pd.DataFrame(randn(40, 4).round(2), columns=['A', 'B', 'C', 'D'])

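chunk = 10  # rows per block; the header is reprinted for each block
for start in range(0, len(df1), chunk):
    # clamp the end so the last (possibly shorter) block is printed too
    stop = min(start + chunk, len(df1))
    print("From row:%s" % start, "to row:%s" % (stop - 1))
    # print this slice of the dataframe; pandas repeats the column header
    print(df1.iloc[start:stop])
    print()

Output (values are random; only the first three blocks are shown):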

From row:0 to row:9
      A     B     C     D
0  0.57  0.90 -0.74 -0.82
1  0.46  1.44 -1.42  0.90
2  1.08  0.18  1.73 -0.64
3 -2.32 -0.89  0.62  0.35
4  0.19  0.51 -0.79 -0.37
5 -0.41  0.78  0.12 -1.88
6  0.53 -0.60 -0.29 -1.45
7  1.54  0.01  0.12  0.72
8 -1.65  0.36 -2.61  1.81
9  0.23 -1.23  0.46  1.17

From row:10 to row:19
       A     B     C     D
10  1.02 -1.14 -2.11  0.69
11  1.30  0.27  1.80  0.39
12  0.43  0.70  0.23 -0.84
13 -0.14 -1.29  0.31  0.34
14  1.94  0.16 -0.86  1.19
15 -0.43 -2.05  1.69 -0.98
16 -0.54 -0.59 -0.70 -0.29
17  1.34 -0.04 -1.02 -0.19
18  1.47 -0.53  1.09  1.15
19 -0.04  1.13 -1.27 -1.09

From row:20 to row:29
       A     B     C     D
20 -0.16  1.39  0.35 -0.16
21  0.79  0.12 -1.22 -0.55
22 -1.16 -0.29  0.14  0.33
23  1.59 -0.26 -0.01  1.07
24 -0.76 -2.46  0.08  0.35
25  0.29  2.07 -0.96  0.63
26  0.85 -1.08  1.19  1.71
27 -0.36  0.00  0.87 -0.50
28  0.07  0.84  0.80  0.00
29 -0.16 -0.43  1.51 -1.24
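
Printing each slice separately lets pandas choose column widths per block, which is why the blocks above do not line up with each other. A minimal sketch of an alternative, assuming the table has a single header line (no MultiIndex columns and no named index): render the whole table as text once and re-insert the first line every n data rows. The helper name `repeat_header` and its `every` parameter are hypothetical, chosen only for this illustration.

```python
def repeat_header(table_text, every=10):
    """Re-insert the first line of table_text before every `every` data rows.

    Assumes the first line is the only header line and every >= 1.
    """
    lines = table_text.splitlines()
    header, body = lines[:1], lines[1:]
    if not body:
        # nothing to split into groups; return the text unchanged
        return table_text
    out = []
    for i, line in enumerate(body):
        if i % every == 0:
            # start of a new group of rows: repeat the header line
            out.extend(header)
        out.append(line)
    return "\n".join(out)
```

With the dataframe from the code above, this could be called as print(repeat_header(df1.to_string(), every=10)). Because to_string() lays out all rows with the same column widths, the repeated headers stay aligned with the data.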

Edit 1

Another approach:

import numpy as np
import pandas as pd
from IPython.display import HTML
from random import choice
from string import ascii_uppercase

# Generate data for demo
rows = 36
cols = 6

def rand_colhdr(num=3):
    return ''.join(choice(ascii_uppercase) for i in range(num))

col_hdrs = [rand_colhdr(num=3) for ix in range(cols)]

randn = np.random.randn
df1 = pd.DataFrame(randn(rows, cols).round(2), columns=col_hdrs)

# Build an HTML table from the data in the pandas dataframe
rows = len(df1)          # take all rows
cols = len(df1.columns)  # take all columns

# Every group of `rowgrp` rows gets its own header row
rowgrp = 8
col_hdrs = ["row#"] + list(df1.columns)

div_tplt = "<div><table>{0}</table></div>"
str2rep = "<tr>"+''.join(["<th>"+str(ea)+"</th>" for ea in col_hdrs])+"</tr>"

for arow in range(rows):
    td_code = ""
    th_code = ""
    if arow % rowgrp == 0 and arow > 0:
        # repeat the header row at the start of each new group
        for acol in range(1 + len(df1.columns)):
            # str() keeps this working for non-string column labels
            th_code += '<td><b>' + str(col_hdrs[acol]) + '</b></td>' + "\n"
        str2rep += "<tr>{0}</tr>".format(th_code)

    for acol in range(1 + len(df1.columns)):
        # regular data row
        if acol == 0:
            # first cell: 1-based row number
            td_code += '<td>' + str(arow + 1) + '</td>' + "\n"
        else:
            # content of cell (arow, acol-1) from the dataframe; str() rather
            # than repr(), because repr() of a NumPy scalar prints as
            # np.float64(...) under NumPy 2.x
            td_code += '<td>' + str(df1.iloc[arow, acol - 1]) + '</td>' + "\n"

    str2rep += "<tr>{0}</tr>".format(td_code)

HTML(div_tplt.format(str2rep))
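
The question also asks about a vertical scrollbar with the header always visible. A rough sketch of one way to try that, assuming the browser and notebook theme honor CSS `position: sticky` on table header cells: wrap the table HTML in a height-limited, scrollable div. The helper name `sticky_header_html`, the `sticky-table` class, and the default height are hypothetical choices for this illustration, not part of pandas or IPython.

```python
def sticky_header_html(table_html, max_height="300px", css_class="sticky-table"):
    """Wrap table HTML in a scrollable div whose <thead> cells stay pinned.

    Assumes table_html contains a <thead> with <th> cells, as produced by
    DataFrame.to_html(). The white background is an assumption that suits a
    light theme; without a background, rows show through the pinned header.
    """
    style = (
        "<style>"
        ".{cls} {{ max-height: {h}; overflow-y: auto; }}"
        ".{cls} thead th {{ position: sticky; top: 0; background: white; }}"
        "</style>"
    ).format(cls=css_class, h=max_height)
    # table_html is passed as an argument, so braces inside it are left alone
    return '{0}<div class="{1}">{2}</div>'.format(style, css_class, table_html)
```

In a notebook cell this could be displayed with HTML(sticky_header_html(df1.to_html())), using the HTML class already imported above. Note that the hand-built table in this answer has no <thead>, so the sketch fits the output of to_html() rather than the string built in the loop.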

Pandas - Repeat header every nth row in a Jupyter-Lab notebook
