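Here is my data.
It is a CSV file with 36 columns. I want to convert each row into an image and store the images as a dataset that can be fed to a neural network.
I have seen, and tried, using PIL to convert a one-dimensional numpy array into an image, but I don't know how to apply that to the whole dataset.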
import pandas as pd
import numpy as np
from PIL import Image
dataframe = pd.read_csv('https://www.dropbox.com/s/sw2p9155zgmkkl5/df22.csv?dl=1',index_col=0)
dataframe
I created a Google Colab notebook to make this easy to try.
HOME,WORK,SHOP,FREETIME,ACCOMPANY,FOOD,OTHER,AM,PM,MIDDAY,NIGHT,firsttrip_time,lasttrip_time,home_traveltime,work_traveltime,shop_traveltime,freetime_traveltime,accompany_traveltime,food_traveltime,home_traveldistance,work_traveldistance,shop_traveldistance,freetime_traveldistance,accompany_traveldistance,food_traveldistance,TRPMILES_mean,TRVL_MIN_mean,home_dweltime,work_dweltime,shop_dweltime,freetime_dweltime,accompany_dweltime,food_dweltime,AVG_VEH_CNT,TRPMILES_sum,TRVL_MIN_sum
2.0,0.0,0.0,1.0,0.0,0.0,2.0,1.0,2.0,2.0,0.0,9.0,20.0,32.5,0.0,0.0,2.0,0.0,0.0,0.72,0.0,0.0,0.01,0.0,0.0,0.58,25.4,115.0,0.0,0.0,118.0,0.0,0.0,1.0,84.22,127.0
2.0,0.0,0.0,3.0,2.0,0.0,1.0,1.0,5.0,2.0,0.0,9.0,20.0,32.5,0.0,0.0,10.0,2.5,0.0,0.72,0.0,0.0,0.26,0.01,0.0,0.37,16.88,115.0,0.0,0.0,51.67,12.5,0.0,1.0,85.22,135.0
2.0,2.0,0.0,0.0,0.0,0.0,1.0,2.0,1.0,1.0,1.0,9.0,20.0,11.5,8.5,0.0,0.0,0.0,0.0,0.19,0.12,0.0,0.0,0.0,0.0,0.14,9.4,46.0,243.0,0.0,0.0,0.0,0.0,1.0,21.0,47.0
1.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0,2.0,2.0,0.0,13.0,16.0,20.0,0.0,17.5,0.0,0.0,10.0,0.17,0.0,0.07,0.0,0.0,0.03,0.09,16.25,0.0,0.0,50.0,0.0,0.0,20.0,1.0,10.0,65.0
1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,13.0,20.0,30.0,0.0,0.0,35.0,0.0,0.0,0.41,0.0,0.0,0.41,0.0,0.0,0.41,32.5,0.0,0.0,0.0,385.0,0.0,0.0,1.0,24.0,65.0
1.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0,4.0,0.0,0.0,11.0,14.0,30.0,0.0,12.5,0.0,0.0,10.0,0.31,0.0,0.15,0.0,0.0,0.02,0.16,16.25,0.0,0.0,25.0,0.0,0.0,80.0,0.0,18.22,65.0
2.0,0.0,2.0,0.0,0.0,0.0,2.0,0.0,2.0,4.0,0.0,10.0,17.0,3.0,0.0,12.5,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,8.17,1.0,0.0,107.5,0.0,0.0,0.0,1.5,1.0,49.0
1.0,8.0,1.0,0.0,0.0,0.0,0.0,4.0,6.0,0.0,0.0,7.0,15.0,30.0,26.0,30.0,0.0,0.0,0.0,0.52,0.52,0.48,0.0,0.0,0.0,0.51,27.14,0.0,6.0,10.0,0.0,0.0,0.0,1.5,104.0,190.0
3.0,0.0,1.0,1.0,0.0,2.0,1.0,1.0,7.0,0.0,0.0,9.0,15.0,7.67,0.0,3.0,10.0,0.0,3.5,0.11,0.0,0.02,0.01,0.0,0.09,0.08,6.0,50.33,0.0,1.0,1.0,0.0,32.0,1.5,18.61,48.0
3.0,0.0,3.0,0.0,0.0,0.0,0.0,2.0,4.0,0.0,0.0,8.0,14.0,8.0,0.0,8.67,0.0,0.0,0.0,0.09,0.0,0.09,0.0,0.0,0.0,0.09,8.33,43.33,0.0,47.0,0.0,0.0,0.0,1.5,15.11,50.0
Answer 0 (score: 1)
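Each row becomes a 6 x 6 square, because a row has 36 values. Use .resize to scale the image to the size you need.
Use .apply with axis=1 to apply the function to every row of the dataframe.
Use .values to extract the row x as a numpy array of shape (36,), which can then be reshaped with .reshape.
import pandas as pd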
import numpy as np
from PIL import Image
# create the dataframe
df = pd.read_csv('https://www.dropbox.com/s/sw2p9155zgmkkl5/df22.csv?dl=1', index_col=0)
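# create images; mode 'L' is 8-bit grayscale, so clip to 0-255 and cast to uint8 first
images = df.apply(lambda x: Image.fromarray(np.clip(x.values, 0, 255).astype(np.uint8).reshape(6, 6), 'L').resize((200, 200)), axis=1)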
# show image 0
images[0]
df
Data for the first row:
df.iloc[0, :].values.reshape(6, 6)
array([[2.000e+00, 0.000e+00, 0.000e+00, 1.000e+00, 0.000e+00, 0.000e+00],
[2.000e+00, 1.000e+00, 2.000e+00, 2.000e+00, 0.000e+00, 9.000e+00],
[2.000e+01, 3.250e+01, 0.000e+00, 0.000e+00, 2.000e+00, 0.000e+00],
[0.000e+00, 7.200e-01, 0.000e+00, 0.000e+00, 1.000e-02, 0.000e+00],
[0.000e+00, 5.800e-01, 2.540e+01, 1.150e+02, 0.000e+00, 0.000e+00],
[1.180e+02, 0.000e+00, 0.000e+00, 1.000e+00, 8.422e+01, 1.270e+02]])
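An 8-bit grayscale image only holds values from 0 to 255, while the sample rows contain larger values such as 385.0, so one option is to rescale each row before building the image. The following is a minimal sketch under stated assumptions, not part of the original answer: the helper name row_to_image, the per-row min-max scaling, and the random demo_df that stands in for the real CSV are all hypothetical. Scaling per column instead of per row may suit some models better.

```python
import numpy as np
import pandas as pd
from PIL import Image


def row_to_image(row, size=(200, 200)):
    """Scale one 36-value row to 0-255 and return a 6 x 6 grayscale image."""
    values = np.asarray(row, dtype=np.float64)
    span = values.max() - values.min()
    # Guard against a constant row, where min-max scaling would divide by zero.
    scaled = (values - values.min()) / span if span > 0 else np.zeros_like(values)
    pixels = (scaled * 255).round().astype(np.uint8).reshape(6, 6)
    # NEAREST keeps the 36 cells as sharp blocks when the image is enlarged.
    return Image.fromarray(pixels).resize(size, resample=Image.NEAREST)


# Hypothetical stand-in for the real CSV: 5 rows x 36 columns of random values.
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(rng.uniform(0, 400, size=(5, 36)))

# One image per row of the dataframe.
images = [row_to_image(row) for row in demo_df.to_numpy()]

# Stack into one uint8 array of shape (n_rows, height, width) for a neural network.
dataset = np.stack([np.asarray(img) for img in images])
```

With the real data, demo_df would be replaced by the dataframe read from the CSV. Individual images can also be written to disk with the .save method of a PIL image if files are preferred over an in-memory array.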