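I am developing a convolutional neural network. For this I have some image data and a label for each image. Each label contains 5 to 8 characters: uppercase letters A to Z and digits 0 to 9.
The labels look like this: "7C24698", "9B43104", etc.
I read the labels with the following code: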
import csv

import pandas as pd

# Lists for the different columns of the label data
track_id = []
image_path = []
lp = []
train = []

with open(r'path\to\labels') as csvDataFile:
    csvReader = csv.reader(csvDataFile)
    for row in csvReader:
        track_id.append(row[0])
        image_path.append(row[1])
        lp.append(row[2])
        train.append(row[3])

# pandas DataFrame; the first CSV row holds the column names
df = pd.DataFrame(list(zip(track_id, image_path, lp, train)))
df.columns = df.iloc[0]
df_2 = df.drop(df.index[0])

# pandas DataFrame with labels, sorted by image path and track id
new_df = df_2.sort_values(by=['image_path', 'track_id'], ascending=[True, True])

# array with labels
y_train = new_df['lp'].to_numpy()
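How can I one-hot encode each label? I think I end up with 37 possible characters: 26 letters, 10 digits, and one blank (needed because the labels have different lengths), and one array per label. How should I go about this?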
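To make the target format concrete, here is a minimal sketch of the encoding I have in mind. It assumes every label is right-padded with blanks to the maximum length of 8, so each label becomes an 8 x 37 array. The names `ALPHABET`, `MAX_LEN`, `CHAR_TO_INDEX` and `one_hot_labels` are made up for illustration, and the ordering of the alphabet is an arbitrary choice.

```python
import numpy as np

# Assumed alphabet: 26 uppercase letters, 10 digits, and a blank used for padding.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 "
MAX_LEN = 8  # longest label length mentioned in the question
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_labels(labels, max_len=MAX_LEN):
    """Hypothetical helper: encode labels as an array of shape (n_labels, max_len, 37)."""
    encoded = np.zeros((len(labels), max_len, len(ALPHABET)), dtype=np.float32)
    for i, label in enumerate(labels):
        label = label.strip()  # drop stray surrounding whitespace
        if len(label) > max_len:
            raise ValueError(f"label {label!r} is longer than {max_len} characters")
        padded = label.ljust(max_len)  # right-pad short labels with blanks
        for j, ch in enumerate(padded):
            encoded[i, j, CHAR_TO_INDEX[ch]] = 1.0
    return encoded


y_example = one_hot_labels(["7C24698", "9B431"])
print(y_example.shape)
```

With the real data this would be called as `one_hot_labels(y_train)`. Each row of the result has exactly one 1 per character position, and the padded positions point at the blank class.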
Thanks!