When trying to use a one-hot encoder to encode the following categories, I get a "couldn't convert string to float" error.
['0-17', '55+', '26-35', '46-50', '51-55', '36-45', '18-25']
Answer 0 (score: 0)
I made something real quick that should work. You will see that I had a really nasty-looking one-liner for preconditioning your limits (left commented out); however, it will be much easier if you just write the limits directly as integers.
Essentially, this iterates through a sorted list of upper limits and compares each sample against them in turn. At the first limit that the sample is less than or equal to, we set that index to 1 and break.
import random

# str_limits = ['0-17', '55+', '26-35', '46-50', '51-55', '36-45', '18-25']
#
# one-line conditioning for the limit string format
# (convert to int, otherwise the limits sort and compare as strings)
# limits = sorted(int(s.split("-")[-1]) for s in str_limits if not s.endswith("+"))
# limits.append(1000)

# do this instead: upper bound of each age bracket, 1000 stands in for '55+'
limits = sorted([17, 35, 50, 55, 45, 25, 1000])

# sample 100 random ages from 0 to 64 for testing
samples = [random.randrange(65) for _ in range(100)]

onehot = []  # this is where we will store our one-hot encodings
for sample in samples:
    row = [0] * len(limits)  # preallocating a list
    for i, limit in enumerate(limits):
        if sample <= limit:
            row[i] = 1
            break
    # storing that sample's onehot into a onehot list of lists
    onehot.append(row)

for i in range(10):
    print("{}: {}".format(onehot[i], samples[i]))
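If the limits have to be derived from the label strings rather than typed in by hand, a small helper is easier to read than a one-liner. This is only a sketch, assuming every label is either of the form 'lo-hi' or 'N+'; the name `parse_upper_bounds` and its `open_ended_cap` parameter are made up for illustration, with 1000 reused as the stand-in for the open-ended bracket.

```python
def parse_upper_bounds(labels, open_ended_cap=1000):
    """Return the upper bound of each age bracket as a sorted list of ints.

    Hypothetical helper: assumes labels look like '26-35' or '55+'.
    """
    bounds = []
    for label in labels:
        if label.endswith("+"):
            # open-ended bracket such as '55+': use a large sentinel cap
            bounds.append(open_ended_cap)
        else:
            # closed bracket such as '26-35': keep the part after the dash
            bounds.append(int(label.split("-")[1]))
    return sorted(bounds)


str_limits = ['0-17', '55+', '26-35', '46-50', '51-55', '36-45', '18-25']
limits = parse_upper_bounds(str_limits)
print(limits)  # [17, 25, 35, 45, 50, 55, 1000]
```

The explicit `int(...)` call is the important part: without it the bounds stay strings, so they sort lexicographically and cannot be compared with integer samples.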
I am not sure about the specifics of your implementation, but you are probably forgetting to convert from a string to an integer at some point.
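If your column already holds the bracket labels themselves rather than raw ages (an assumption on my part about your data), you do not need any numeric conversion at all: each label can simply be mapped to a column index. A minimal sketch in plain Python; `one_hot_labels` is a hypothetical helper, the three input values are made-up examples, and ordering the columns by each bracket's lower bound is just one reasonable choice.

```python
def one_hot_labels(values, categories):
    """One-hot encode string labels against a fixed, ordered category list."""
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # raises KeyError on an unknown label
        rows.append(row)
    return rows


# order the columns by the lower bound of each bracket ('55+' -> 55)
categories = sorted(
    ['0-17', '55+', '26-35', '46-50', '51-55', '36-45', '18-25'],
    key=lambda label: int(label.rstrip("+").split("-")[0]),
)

# made-up example values, not taken from the question's data
encoded = one_hot_labels(['26-35', '55+', '0-17'], categories)
for label, row in zip(['26-35', '55+', '0-17'], encoded):
    print("{}: {}".format(row, label))
```

Here the labels are only ever used as dictionary keys, so nothing tries to turn a string like '26-35' into a float.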