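I am currently working with the SVHN dataset.
One thing confuses me:
Why does the label data contain only a single digit?
Since a house number can have several digits (for example 123 or 4000), shouldn't the label also be multi-digit, or a 10×n one-hot matrix (10 classes, n = the number of digits in the image)?
(I'm sure the answer is simple, but I've been stuck on this for a few days.)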
Answer 0 (score: 1)
The reason is this:
"Each element in digitStruct has the following fields: name which is a string containing the filename of the corresponding image. bbox which is a struct array that contains the position, size and label of each digit bounding box in the image" - link
So each entry in bbox describes a single digit together with its bounding box. An image that shows a multi-digit house number simply has several entries, one per digit, which is why every individual label is only one digit.
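To make this concrete, here is a minimal sketch of how per-digit entries could be regrouped into whole house numbers and, if wanted, into the 10×n one-hot layout mentioned in the question. It assumes the digitStruct data has already been parsed into plain Python dicts; the `records` list, its values, and the helper names `group_digits` and `one_hot` are hypothetical and not part of any SVHN tooling. It also assumes digits should be read left to right by the `left` coordinate of each box, and relies on the SVHN convention that the digit 0 is stored as label 10.

```python
from collections import defaultdict

# Hypothetical, hand-written records: one entry per digit bounding box,
# mimicking the name/bbox fields quoted above (all values are made up).
records = [
    {"name": "1.png", "left": 81, "label": 2},
    {"name": "1.png", "left": 43, "label": 1},
    {"name": "1.png", "left": 120, "label": 3},
    {"name": "2.png", "left": 10, "label": 4},
    {"name": "2.png", "left": 30, "label": 10},  # SVHN stores digit 0 as label 10
    {"name": "2.png", "left": 50, "label": 10},
    {"name": "2.png", "left": 70, "label": 10},
]


def group_digits(records):
    """Collect per-digit rows into one left-to-right digit list per image."""
    by_image = defaultdict(list)
    for rec in records:
        by_image[rec["name"]].append(rec)

    numbers = {}
    for name, boxes in by_image.items():
        # Assumption: reading order is the horizontal position of each box.
        boxes.sort(key=lambda b: b["left"])
        # Map label 10 back to digit 0; labels 1..9 are unchanged.
        numbers[name] = [b["label"] % 10 for b in boxes]
    return numbers


def one_hot(digits):
    """Return a 10 x n matrix (list of rows) with a single 1 per column."""
    return [[1 if d == cls else 0 for d in digits] for cls in range(10)]


numbers = group_digits(records)
house_number = "".join(str(d) for d in numbers["2.png"])
encoded = one_hot(numbers["1.png"])
```

With these made-up records, `numbers["1.png"]` is the digit list for the first image and `encoded` has 10 rows and one column per digit. Whether to train on single digits or on the grouped sequence is a modelling choice; the dataset only provides the per-digit rows.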