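How can I create images from the row values of a pandas DataFrame using PIL?

Time: 2020-10-03 20:05:33

Tags: pandas numpy matplotlib python-imaging-library

Here is my data

It is a CSV file with 36 columns. I want to convert each row into an image and store the images as a dataset that can be fed to a neural network.

I have seen and tried converting a one-dimensional numpy array into an image with PIL, but I don't know how to do this for the whole dataset.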

import pandas as pd
import numpy as np
from PIL import Image

dataframe = pd.read_csv('https://www.dropbox.com/s/sw2p9155zgmkkl5/df22.csv?dl=1',index_col=0)
dataframe

I created a Google Colab notebook to make this easy to try.

Data

  • First 10 rows
HOME,WORK,SHOP,FREETIME,ACCOMPANY,FOOD,OTHER,AM,PM,MIDDAY,NIGHT,firsttrip_time,lasttrip_time,home_traveltime,work_traveltime,shop_traveltime,freetime_traveltime,accompany_traveltime,food_traveltime,home_traveldistance,work_traveldistance,shop_traveldistance,freetime_traveldistance,accompany_traveldistance,food_traveldistance,TRPMILES_mean,TRVL_MIN_mean,home_dweltime,work_dweltime,shop_dweltime,freetime_dweltime,accompany_dweltime,food_dweltime,AVG_VEH_CNT,TRPMILES_sum,TRVL_MIN_sum
2.0,0.0,0.0,1.0,0.0,0.0,2.0,1.0,2.0,2.0,0.0,9.0,20.0,32.5,0.0,0.0,2.0,0.0,0.0,0.72,0.0,0.0,0.01,0.0,0.0,0.58,25.4,115.0,0.0,0.0,118.0,0.0,0.0,1.0,84.22,127.0
2.0,0.0,0.0,3.0,2.0,0.0,1.0,1.0,5.0,2.0,0.0,9.0,20.0,32.5,0.0,0.0,10.0,2.5,0.0,0.72,0.0,0.0,0.26,0.01,0.0,0.37,16.88,115.0,0.0,0.0,51.67,12.5,0.0,1.0,85.22,135.0
2.0,2.0,0.0,0.0,0.0,0.0,1.0,2.0,1.0,1.0,1.0,9.0,20.0,11.5,8.5,0.0,0.0,0.0,0.0,0.19,0.12,0.0,0.0,0.0,0.0,0.14,9.4,46.0,243.0,0.0,0.0,0.0,0.0,1.0,21.0,47.0
1.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0,2.0,2.0,0.0,13.0,16.0,20.0,0.0,17.5,0.0,0.0,10.0,0.17,0.0,0.07,0.0,0.0,0.03,0.09,16.25,0.0,0.0,50.0,0.0,0.0,20.0,1.0,10.0,65.0
1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,13.0,20.0,30.0,0.0,0.0,35.0,0.0,0.0,0.41,0.0,0.0,0.41,0.0,0.0,0.41,32.5,0.0,0.0,0.0,385.0,0.0,0.0,1.0,24.0,65.0
1.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0,4.0,0.0,0.0,11.0,14.0,30.0,0.0,12.5,0.0,0.0,10.0,0.31,0.0,0.15,0.0,0.0,0.02,0.16,16.25,0.0,0.0,25.0,0.0,0.0,80.0,0.0,18.22,65.0
2.0,0.0,2.0,0.0,0.0,0.0,2.0,0.0,2.0,4.0,0.0,10.0,17.0,3.0,0.0,12.5,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,8.17,1.0,0.0,107.5,0.0,0.0,0.0,1.5,1.0,49.0
1.0,8.0,1.0,0.0,0.0,0.0,0.0,4.0,6.0,0.0,0.0,7.0,15.0,30.0,26.0,30.0,0.0,0.0,0.0,0.52,0.52,0.48,0.0,0.0,0.0,0.51,27.14,0.0,6.0,10.0,0.0,0.0,0.0,1.5,104.0,190.0
3.0,0.0,1.0,1.0,0.0,2.0,1.0,1.0,7.0,0.0,0.0,9.0,15.0,7.67,0.0,3.0,10.0,0.0,3.5,0.11,0.0,0.02,0.01,0.0,0.09,0.08,6.0,50.33,0.0,1.0,1.0,0.0,32.0,1.5,18.61,48.0
3.0,0.0,3.0,0.0,0.0,0.0,0.0,2.0,4.0,0.0,0.0,8.0,14.0,8.0,0.0,8.67,0.0,0.0,0.0,0.09,0.0,0.09,0.0,0.0,0.0,0.09,8.33,43.33,0.0,47.0,0.0,0.0,0.0,1.5,15.11,50.0

1 answer:

Answer 0 (score: 1)

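  • Apply the approach from "how to convert a 1-dimensional image array to PIL image in Python" to the dataframe.
  • One image is created for each row.
    • Images are rectangular, and each row has 36 values, so a row can be reshaped into a 6 x 6 square.
    • This produces 214217 very small images, so .resize can be used to scale them to the desired size.
      • Resizing all of the images takes a few minutes, depending on the target size.
  • Use .apply with axis=1 to apply the function to every row of the dataframe.
    • .values extracts the values of row x into a numpy array of shape (36,), which can then be reshaped with .reshape.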
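import pandas as pd
import numpy as np
from PIL import Image

# create the dataframe
df = pd.read_csv('https://www.dropbox.com/s/sw2p9155zgmkkl5/df22.csv?dl=1', index_col=0)

# convert one row (36 values) into a 200 x 200 grayscale image
def row_to_image(x):
    # 8-bit grayscale ('L') needs uint8 pixels; passing the float64 values
    # directly would misread the raw bytes. Clipping to 0-255 is one simple
    # choice (values above 255 saturate, fractions are truncated).
    pixels = np.clip(x.values, 0, 255).astype(np.uint8).reshape(6, 6)
    return Image.fromarray(pixels).resize((200, 200))

# create images
images = df.apply(row_to_image, axis=1)

# show image 0
images[0]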
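Several columns in the sample data (for example the dwell times) exceed 255, and many are fractional, so they do not map directly onto 8-bit pixel values. One possible way to handle this, sketched below, is to min-max scale each column to 0-255 before building the images. The helper name `scale_to_uint8` and the random `demo` frame are illustrative assumptions, not part of the original answer; the real dataframe would take the place of `demo`.

```python
import numpy as np
import pandas as pd
from PIL import Image


def scale_to_uint8(df):
    """Min-max scale every column to the range 0-255 and return a uint8 array."""
    values = df.to_numpy(dtype=np.float64)
    col_min = values.min(axis=0)
    col_range = values.max(axis=0) - col_min
    # Constant columns have zero range; use 1 to avoid division by zero.
    col_range[col_range == 0] = 1.0
    scaled = (values - col_min) / col_range * 255.0
    return np.round(scaled).astype(np.uint8)


# Hypothetical stand-in for the real data: 5 rows x 36 columns.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.uniform(0, 400, size=(5, 36)))

pixels = scale_to_uint8(demo)  # shape (5, 36), dtype uint8

# A 2-D uint8 array is interpreted by PIL as an 8-bit grayscale image.
images = [Image.fromarray(row.reshape(6, 6)).resize((200, 200)) for row in pixels]
```

Scaling per column keeps small-valued features (such as the travel distances) visible next to large-valued ones, at the cost of losing the absolute magnitudes. Whether that trade-off suits the network is a modelling decision.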
  • The image below represents the data in the first row of df
df.iloc[0, :].values.reshape(6, 6)

array([[2.000e+00, 0.000e+00, 0.000e+00, 1.000e+00, 0.000e+00, 0.000e+00],
       [2.000e+00, 1.000e+00, 2.000e+00, 2.000e+00, 0.000e+00, 9.000e+00],
       [2.000e+01, 3.250e+01, 0.000e+00, 0.000e+00, 2.000e+00, 0.000e+00],
       [0.000e+00, 7.200e-01, 0.000e+00, 0.000e+00, 1.000e-02, 0.000e+00],
       [0.000e+00, 5.800e-01, 2.540e+01, 1.150e+02, 0.000e+00, 0.000e+00],
       [1.180e+02, 0.000e+00, 0.000e+00, 1.000e+00, 8.422e+01, 1.270e+02]])

[Image: the first row of df rendered as a 6 x 6 grayscale grid]

  • The white border comes from copying and pasting the image; it is not part of the image itself.
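The question also asks how to store the images so that they can be fed to a neural network. A minimal sketch of two common options follows: writing one PNG per row, or stacking the pixel data into a single NumPy array. The three random images, the temporary output directory, and the `row_XXXXXX.png` file naming are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image

# Hypothetical stand-in for the images built from the dataframe:
# three 6 x 6 grayscale images with random pixel values.
rng = np.random.default_rng(0)
images = [
    Image.fromarray(rng.integers(0, 256, size=(6, 6), dtype=np.uint8))
    for _ in range(3)
]

# Option 1: write one PNG file per row into a directory.
out_dir = Path(tempfile.mkdtemp())
for i, img in enumerate(images):
    img.save(out_dir / f"row_{i:06d}.png")

# Option 2: stack everything into one float array of shape
# (n_images, height, width) with values scaled to [0, 1].
batch = np.stack([np.asarray(img).astype(np.float32) / 255.0 for img in images])
```

With more than 200,000 rows, the stacked array is likely the more practical format, since it avoids creating and reading back a very large number of small files.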