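I am trying to train a model (DeepLabV3) on my own custom dataset. The model uses colour-coded label images, so I changed the RGB values of the classes. For some reason, when I use my own dataset the loss curve behaves strangely (it has spikes). Possibly for the same reason, when I plot the ground truth for the validation set I get this
instead of this:
which is what my dataset actually contains.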
The code includes this data-preparation step:
gt = load_image(val_output_names[ind])[:args.crop_height, :args.crop_width]
gt = helpers.reverse_one_hot(helpers.one_hot_it(gt, label_values))
gt = helpers.colour_code_segmentation(gt, label_values)
where:
import csv
import os
import numpy as np  # used by the array helpers further down

def get_label_info(csv_path):
    """
    Retrieve the class names and label values for the selected dataset.
    Must be in CSV format!

    # Arguments
        csv_path: The file path of the class dictionary

    # Returns
        Two lists: one for the class names and the other for the label values
    """
    filename, file_extension = os.path.splitext(csv_path)
    if not file_extension == ".csv":
        # Raise (rather than return) the error so it cannot be mistaken for a result
        raise ValueError("File is not a CSV!")
    class_names = []
    label_values = []
    with open(csv_path, 'r') as csvfile:
        file_reader = csv.reader(csvfile, delimiter=',')
        header = next(file_reader)  # skip the "name,r,g,b" header row
        for row in file_reader:
            class_names.append(row[0])
            label_values.append([int(row[1]), int(row[2]), int(row[3])])
    # print(class_dict)
    return class_names, label_values
def one_hot_it(label, label_values):
    """
    Convert a segmentation image label array to one-hot format
    by replacing each pixel value with a vector of length num_classes

    # Arguments
        label: The 2D array segmentation image label
        label_values: List of RGB colours, one per class

    # Returns
        A 2D array with the same width and height as the input, but
        with a depth size of num_classes
    """
    # st = time.time()
    # w = label.shape[0]
    # h = label.shape[1]
    # num_classes = len(class_dict)
    # x = np.zeros([w,h,num_classes])
    # unique_labels = sortedlist((class_dict.values()))
    # for i in range(0, w):
    #     for j in range(0, h):
    #         index = unique_labels.index(list(label[i][j][:]))
    #         x[i,j,index]=1
    # print("Time 1 = ", time.time() - st)

    # st = time.time()
    # https://stackoverflow.com/questions/46903885/map-rgb-semantic-maps-to-one-hot-encodings-and-vice-versa-in-tensorflow
    # https://stackoverflow.com/questions/14859458/how-to-check-if-all-values-in-the-columns-of-a-numpy-matrix-are-the-same
    semantic_map = []
    for colour in label_values:
        # colour_map = np.full((label.shape[0], label.shape[1], label.shape[2]), colour, dtype=int)
        # True only where all three channels of a pixel equal this class colour
        equality = np.equal(label, colour)
        class_map = np.all(equality, axis=-1)
        semantic_map.append(class_map)
    semantic_map = np.stack(semantic_map, axis=-1)
    # print("Time 2 = ", time.time() - st)
    return semantic_map
def reverse_one_hot(image):
    """
    Transform a 2D array in one-hot format (depth is num_classes),
    to a 2D array with only 1 channel, where each pixel value is
    the classified class key.

    # Arguments
        image: The one-hot format image

    # Returns
        A 2D array with the same width and height as the input, but
        with a depth size of 1, where each pixel value is the classified
        class key.
    """
    # w = image.shape[0]
    # h = image.shape[1]
    # x = np.zeros([w,h,1])
    # for i in range(0, w):
    #     for j in range(0, h):
    #         index, value = max(enumerate(image[i, j, :]), key=operator.itemgetter(1))
    #         x[i, j] = index

    # Index of the largest entry along the class axis (first index on ties)
    x = np.argmax(image, axis=-1)
    return x
def colour_code_segmentation(image, label_values):
    """
    Given a 1-channel array of class keys, colour code the segmentation results.

    # Arguments
        image: single channel array where each value represents the class key.
        label_values: List of RGB colours, one per class

    # Returns
        Colour coded image for segmentation visualization
    """
    # w = image.shape[0]
    # h = image.shape[1]
    # x = np.zeros([w,h,3])
    # colour_codes = label_values
    # for i in range(0, w):
    #     for j in range(0, h):
    #         x[i, j, :] = colour_codes[int(image[i, j])]

    # Look up the RGB colour of every class key in one indexing operation
    colour_codes = np.array(label_values)
    x = colour_codes[image.astype(int)]
    return x
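To make the round trip through these three helpers concrete, here is a minimal sketch that repeats the same one-hot, argmax and colour-lookup steps on a made-up 2x2 label image. The two-colour palette and the stray pixel value are hypothetical and chosen only for illustration. The sketch shows that a pixel whose colour matches no entry in label_values ends up with class index 0, because np.argmax over an all-False vector returns 0.

```python
import numpy as np

# Hypothetical two-class palette (not the GTA palette from the question).
label_values = [[244, 35, 232], [128, 64, 128]]

# 2x2 RGB label: three pixels use palette colours; the bottom-right pixel
# is an in-between colour that is not in the palette (assumed for the demo).
label = np.array([
    [[244, 35, 232], [128, 64, 128]],
    [[128, 64, 128], [186, 50, 180]],
], dtype=np.uint8)

# Same logic as one_hot_it: one boolean map per palette colour.
one_hot = np.stack(
    [np.all(np.equal(label, colour), axis=-1) for colour in label_values],
    axis=-1,
)

# Same logic as reverse_one_hot: index of the largest entry per pixel.
class_ids = np.argmax(one_hot, axis=-1)

# Same logic as colour_code_segmentation: look up each index's colour.
recoloured = np.array(label_values)[class_ids]

print(class_ids)         # class index per pixel
print(recoloured[1, 1])  # colour assigned to the unmatched pixel
```

In this toy case the unmatched pixel is redrawn in the colour of the first palette entry. Whether the same thing happens in the real label images depends on whether they contain colours that are missing from the CSV, which the sketch does not establish.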
I am using the GTA dataset with these label colours:
name,r,g,b
static, 20, 20, 20
dynamic, 111, 74, 0
ground, 81, 0, 81
road,128,64,128
sidewalk,244,35,232
parking, 250, 170, 160
rail_track, 230, 150, 140
building,70, 70, 70
wall,102, 102, 156
fence,190, 153, 153
guard rail, 180, 165, 180
bridge, 150, 100, 100
tunnel, 150, 120, 90
pole,153, 153, 153
polegroup, 153, 153, 153
traffic_light,250, 170, 30
traffic_sign,220, 220, 0
vegetation,107, 142, 35
terrain,152, 251, 152
sky,70, 130, 180
person,220, 20, 60
rider,255, 0, 0
car,0, 0, 142
truck,0, 0, 70
bus,0, 60, 100
caravan, 0, 0, 90
trailer, 0, 0, 110
train,0, 80, 100
motorcycle,0, 0, 230
bicycle,119, 11, 32
license_plate,0, 0, 142
void,0, 0, 0
,255,255,255
The class label colours in the images are the same as these label values.
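Since one_hot_it matches pixels by exact RGB value, two classes that share a colour cannot be told apart, and np.argmax returns the first index that matches. The following is a small sketch for listing shared colours; it uses a handful of rows copied from the list above, and find_shared_colours is a hypothetical helper name, not part of the code in the question.

```python
from collections import defaultdict

# A few (name, r, g, b) rows copied from the class list above.
rows = [
    ("pole", 153, 153, 153),
    ("polegroup", 153, 153, 153),
    ("car", 0, 0, 142),
    ("truck", 0, 0, 70),
    ("license_plate", 0, 0, 142),
]

def find_shared_colours(rows):
    """Map each RGB triple used by more than one class to its class names."""
    names_by_colour = defaultdict(list)
    for name, r, g, b in rows:
        names_by_colour[(r, g, b)].append(name)
    # Keep only the colours that belong to two or more classes.
    return {c: n for c, n in names_by_colour.items() if len(n) > 1}

shared = find_shared_colours(rows)
for colour, names in shared.items():
    print(colour, names)
```

The same function could be run on the full (name, r, g, b) list read from the CSV to see every colour that is assigned to more than one class.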
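When I change the order of the label colours (for example, putting sidewalk first as the first class label), I get the following:
So, as far as I can tell, for some reason the first label colour in the csv file is used to colour the boundaries between classes. By contrast, with the Pascal dataset everything works like a charm.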
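One way to check whether the first class is absorbing pixels that match no colour at all is to count such pixels directly. This is only a sketch: it assumes label is an H x W x 3 RGB array and label_values is the list returned by get_label_info, and count_unmatched_pixels is a hypothetical helper, not something from the repository. The toy arrays below are made up.

```python
import numpy as np

def count_unmatched_pixels(label, label_values):
    """Count pixels whose RGB value equals none of the colours in label_values."""
    matched = np.zeros(label.shape[:2], dtype=bool)
    for colour in label_values:
        # A pixel matches when all three channels equal this class colour.
        matched |= np.all(label == np.array(colour), axis=-1)
    return int((~matched).sum())

# Toy 2x3 label with one colour (186, 50, 180) that is not in the palette.
label = np.array([
    [[128, 64, 128], [128, 64, 128], [244, 35, 232]],
    [[128, 64, 128], [186, 50, 180], [244, 35, 232]],
], dtype=np.uint8)
label_values = [[128, 64, 128], [244, 35, 232]]

n_unmatched = count_unmatched_pixels(label, label_values)
print(n_unmatched)
```

A non-zero count on the real label images, concentrated along class boundaries, would be consistent with the behaviour described above, though that is a guess at the cause rather than a confirmed diagnosis.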