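I'm calling print() on a relatively small pandas.DataFrame (26 rows x 6 columns), and it takes more than 30 seconds to print.
Creating the DataFrame took less than a second. As far as I can tell, this hasn't been asked elsewhere. Does anyone know what's going on?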
from PIL import Image
import numpy as np
import os
import pandas as pd
import time
records = {}  # row index -> list of column values for that image
counter = 0
start_time = time.time()
# Expected layout: ./<camera dir>/<id dir>/<image>.jpg
cam_dir_list = [item for item in os.listdir('.') if os.path.isdir(os.path.join('.', item))]
for cam_dir in cam_dir_list:
    os.chdir('./' + cam_dir)
    id_dir_list = [item for item in os.listdir('.') if os.path.isdir(os.path.join('.', item))]
    for id_dir in id_dir_list:
        os.chdir('./' + id_dir)
        img_file_list = [item for item in os.listdir('.') if item.endswith('jpg')]
        for img_file in img_file_list:
            img = Image.open(img_file)
            img_gs = img.convert('LA')  # grayscale plus alpha channel
            img_size = img.size
            img_np = np.array(img)
            img_gs_np = np.array(img_gs)
            img_cam = cam_dir
            img_id = id_dir
            # The last two entries are full NumPy arrays stored in single cells
            records[counter] = [img_cam, img_id, img_file, img_size, img_np, img_gs_np]
            counter += 1
        os.chdir('..')  # back up from the id directory
    os.chdir('..')  # back up from the camera directory
stop_time = time.time()
tot = stop_time - start_time
print('dict time:', tot)
start_time = time.time()
df = pd.DataFrame.from_dict(records, orient='index', columns=['cam', 'id', 'file_name', 'pixel_size', 'rgb', 'gs'])
stop_time = time.time()
tot = stop_time - start_time
print('df time:', tot)
start_time = time.time()
print(df)
stop_time = time.time()
tot = stop_time - start_time
print('print time:', tot, '?????')
print('cam: (type): ', type(img_cam))
print('id: (type): ', type(img_id))
print('file_name (type): ', type(img_file))
print('pixel_size (type): ', type(img_size))
print('rgb (type,size): ', type(img_np), np.size(img_np))
print('gs (type,size): ', type(img_gs_np), np.size(img_gs_np))
The output looks like this:
dict time: 0.01134490966796875
df time: 0.0010509490966796875
cam ... gs
0 TestFolder1 ... [[[138, 255], [120, 255], [100, 255], [101, 25...
1 TestFolder1 ... [[[194, 255], [202, 255], [221, 255], [244, 25...
2 TestFolder1 ... [[[107, 255], [105, 255], [101, 255], [96, 255...
3 TestFolder1 ... [[[64, 255], [66, 255], [77, 255], [74, 255], ...
4 TestFolder1 ... [[[47, 255], [47, 255], [57, 255], [65, 255], ...
5 TestFolder1 ... [[[205, 255], [205, 255], [204, 255], [203, 25...
6 TestFolder2 ... [[[67, 255], [70, 255], [67, 255], [53, 255], ...
7 TestFolder2 ... [[[97, 255], [105, 255], [110, 255], [110, 255...
8 TestFolder2 ... [[[107, 255], [101, 255], [99, 255], [103, 255...
9 TestFolder2 ... [[[98, 255], [54, 255], [15, 255], [9, 255], [...
10 TestFolder2 ... [[[7, 255], [11, 255], [15, 255], [21, 255], [...
11 TestFolder2 ... [[[120, 255], [126, 255], [132, 255], [135, 25...
12 TestFolder2 ... [[[80, 255], [72, 255], [47, 255], [24, 255], ...
13 TestFolder2 ... [[[80, 255], [79, 255], [76, 255], [74, 255], ...
14 TestFolder2 ... [[[236, 255], [223, 255], [226, 255], [231, 25...
15 TestFolder2 ... [[[229, 255], [229, 255], [229, 255], [230, 25...
16 TestFolder2 ... [[[231, 255], [230, 255], [230, 255], [231, 25...
17 TestFolder2 ... [[[136, 255], [130, 255], [124, 255], [123, 25...
18 TestFolder2 ... [[[105, 255], [104, 255], [99, 255], [83, 255]...
19 TestFolder2 ... [[[21, 255], [23, 255], [25, 255], [24, 255], ...
20 TestFolder2 ... [[[13, 255], [12, 255], [13, 255], [13, 255], ...
21 TestFolder2 ... [[[226, 255], [228, 255], [228, 255], [229, 25...
22 TestFolder2 ... [[[217, 255], [218, 255], [218, 255], [218, 25...
23 TestFolder2 ... [[[63, 255], [63, 255], [69, 255], [77, 255], ...
24 TestFolder2 ... [[[205, 255], [228, 255], [229, 255], [220, 25...
25 TestFolder2 ... [[[66, 255], [72, 255], [84, 255], [90, 255], ...
[26 rows x 6 columns]
print time: 34.72059607505798 ?????
cam: (type): <class 'str'>
id: (type): <class 'str'>
file_name (type): <class 'str'>
pixel_size (type): <class 'tuple'>
rgb (type,size): <class 'numpy.ndarray'> 24576
gs (type,size): <class 'numpy.ndarray'> 16384
Any insight would be greatly appreciated. Thanks. Also, the tuples have size 2.
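
One way to narrow this down, as a minimal sketch only: it assumes (this is not confirmed anywhere above) that the time is spent building the text representation of the array-valued cells in the rgb and gs columns. The helper summarize_arrays and the stand-in data are hypothetical names made up for illustration; the DataFrame is built with from_dict the same way as in the code above.

```python
import numpy as np
import pandas as pd


def summarize_arrays(frame):
    """Return a copy of frame with each ndarray cell replaced by a short label."""
    def label(value):
        # Only NumPy arrays are replaced; strings, tuples, etc. pass through.
        if isinstance(value, np.ndarray):
            return f"ndarray{value.shape} {value.dtype}"
        return value

    summary = frame.copy()
    for column in summary.columns:
        summary[column] = summary[column].map(label)
    return summary


# Hypothetical stand-in rows (tiny arrays instead of real images).
rows = {
    i: ["TestFolder1", "id0", f"img{i}.jpg", (4, 3),
        np.zeros((3, 4, 3), dtype=np.uint8),
        np.zeros((3, 4, 2), dtype=np.uint8)]
    for i in range(3)
}
demo = pd.DataFrame.from_dict(
    rows, orient="index",
    columns=["cam", "id", "file_name", "pixel_size", "rgb", "gs"],
)

# Printing the summary never has to format the arrays themselves.
demo_summary = summarize_arrays(demo)
print(demo_summary)
```

On the real data, timing print(summarize_arrays(df)) against print(df) would show whether the array cells are what makes printing slow: if the summary prints quickly, they are the likely culprit; if it is just as slow, the cause lies elsewhere.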