Is there a way to repeat a pandas DataFrame's header (the column names) every nth row in a JupyterLab notebook?
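I usually set pd.set_option('display.max_columns', None) for pandas in JupyterLab notebooks, because I want to see all columns and data. However, when more than 10 rows are displayed, the scrollbar at the bottom is hidden while I am looking at the header, and the header is hidden while I am looking at the scrollbar. This makes it very hard to know what you are looking at or where to scroll to.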
Is there a way to repeat the header every nth row, or to have a vertical scrollbar with the header always visible?
Answer 0 (score: 1)
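A simple print() can do what you need, because pandas prints the header each time a slice of the dataframe is printed. You can try this code.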
import numpy as np
import pandas as pd

# Generate a dataframe with many rows, say, 40.
randn = np.random.randn
df1 = pd.DataFrame(randn(40, 4).round(2), columns=['A', 'B', 'C', 'D'])

chunk = 10  # number of rows to print under each header
for start in range(0, len(df1), chunk):
    # Clip the end so the last group is printed even if it is shorter
    end = min(start + chunk, len(df1))
    print("From row:%s" % start, "to row:%s" % (end - 1))
    # Print this slice of the dataframe; pandas prints the header each time
    print(df1.iloc[start:end])
    print()

Output (the values are random, so yours will differ; the first three groups are shown):
From row:0 to row:9
      A     B     C     D
0  0.57  0.90 -0.74 -0.82
1  0.46  1.44 -1.42  0.90
2  1.08  0.18  1.73 -0.64
3 -2.32 -0.89  0.62  0.35
4  0.19  0.51 -0.79 -0.37
5 -0.41  0.78  0.12 -1.88
6  0.53 -0.60 -0.29 -1.45
7  1.54  0.01  0.12  0.72
8 -1.65  0.36 -2.61  1.81
9  0.23 -1.23  0.46  1.17

From row:10 to row:19
       A     B     C     D
10  1.02 -1.14 -2.11  0.69
11  1.30  0.27  1.80  0.39
12  0.43  0.70  0.23 -0.84
13 -0.14 -1.29  0.31  0.34
14  1.94  0.16 -0.86  1.19
15 -0.43 -2.05  1.69 -0.98
16 -0.54 -0.59 -0.70 -0.29
17  1.34 -0.04 -1.02 -0.19
18  1.47 -0.53  1.09  1.15
19 -0.04  1.13 -1.27 -1.09

From row:20 to row:29
       A     B     C     D
20 -0.16  1.39  0.35 -0.16
21  0.79  0.12 -1.22 -0.55
22 -1.16 -0.29  0.14  0.33
23  1.59 -0.26 -0.01  1.07
24 -0.76 -2.46  0.08  0.35
25  0.29  2.07 -0.96  0.63
26  0.85 -1.08  1.19  1.71
27 -0.36  0.00  0.87 -0.50
28  0.07  0.84  0.80  0.00
29 -0.16 -0.43  1.51 -1.24
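If you want to reuse this on other dataframes, the chunked printing can be wrapped in a small helper. This is only a sketch: the function name `to_string_repeating_header` and its `every` parameter are made up for illustration and are not part of pandas. It relies on `DataFrame.to_string()`, which renders the header for whatever slice it is given.

```python
import pandas as pd


def to_string_repeating_header(df: pd.DataFrame, every: int = 10) -> str:
    """Render `df` as text, repeating the column header every `every` rows.

    Hypothetical helper for illustration; not a pandas API.
    """
    if every <= 0:
        raise ValueError("every must be a positive integer")
    blocks = []
    for start in range(0, len(df), every):
        # iloc slicing clips at the end, so the last block may be shorter
        blocks.append(df.iloc[start:start + every].to_string())
    # Separate the blocks with a blank line
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = pd.DataFrame({"A": range(25), "B": range(25, 50)})
    print(to_string_repeating_header(demo, every=10))
```

Each block is formatted independently, so column widths may differ slightly from one block to the next.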
Edit 1
Another approach, which builds an HTML table and inserts a header row before every group of rows:
import numpy as np
import pandas as pd
from IPython.display import HTML
from random import choice
from string import ascii_uppercase

# Generate data for the demo
rows = 36
cols = 6

def rand_colhdr(num=3):
    """Return a random column name made of `num` uppercase letters."""
    return ''.join(choice(ascii_uppercase) for i in range(num))

col_hdrs = [rand_colhdr(num=3) for ix in range(cols)]
randn = np.random.randn
df1 = pd.DataFrame(randn(rows, cols).round(2), columns=col_hdrs)

# Build an HTML table, taking the data from the pandas dataframe
rows = len(df1)          # take all rows
cols = len(df1.columns)  # take all columns

# Every group of `rowgrp` rows gets its own header
rowgrp = 8
col_hdrs = ["row#"] + [str(acol) for acol in df1.columns]

div_tplt = "<div><table>{0}</table></div>"
# Top header row
str2rep = "<tr>" + ''.join("<th>" + ea + "</th>" for ea in col_hdrs) + "</tr>"

for arow in range(rows):
    td_code = ""
    th_code = ""
    if arow % rowgrp == 0 and arow > 0:
        # Repeated header row, using the column names from the dataframe
        for acol in range(1 + cols):
            th_code += '<td><b>' + col_hdrs[acol] + '</b></td>\n'
        str2rep += "<tr>{0}</tr>".format(th_code)
    for acol in range(1 + cols):
        if acol == 0:
            # First cell: 1-based row number
            td_code += '<td>' + str(arow + 1) + '</td>\n'
        else:
            # Content of cell (arow, acol-1) from the dataframe;
            # str() avoids repr() output such as np.float64(0.57)
            td_code += '<td>' + str(df1.iloc[arow, acol - 1]) + '</td>\n'
    str2rep += "<tr>{0}</tr>".format(td_code)

# This must be the last expression in the notebook cell to be rendered
HTML(div_tplt.format(str2rep))
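For the other option raised in the question (a vertical scrollbar with the header always visible), one thing you could try is wrapping the table in a scrollable box and pinning the header cells with CSS. The following is a minimal sketch, not a tested JupyterLab recipe: the helper name `sticky_header_html`, the `sticky-df` class name, the 300px default height, and the white header background are all assumptions made for illustration. It uses only `DataFrame.to_html()`, which emits the header inside a `<thead>` element.

```python
import pandas as pd


def sticky_header_html(df: pd.DataFrame, max_height: str = "300px",
                       css_class: str = "sticky-df") -> str:
    """Return HTML showing `df` in a scrollable box with a pinned header.

    Hypothetical helper for illustration; not a pandas API.
    """
    style = (
        "<style>"
        # The wrapper scrolls vertically once the table exceeds max_height
        f".{css_class} {{ max-height: {max_height}; overflow-y: auto; }}"
        # Header cells stay at the top of the wrapper while scrolling;
        # the background (assumed white) hides the rows passing underneath
        f".{css_class} thead th "
        "{ position: sticky; top: 0; background: white; }"
        "</style>"
    )
    return f'{style}<div class="{css_class}">{df.to_html()}</div>'
```

In a notebook cell you would display it with `HTML(sticky_header_html(df1))`, using `HTML` from `IPython.display` as in the code above. Whether the header stays pinned can depend on the front end and its theme. pandas 1.3 and later also provide `Styler.set_sticky`, which may be worth trying as well.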