I have a string:
'0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73-100,100-51,51,51,51-100,100-52,52,52,52,52,52,52,52,52-100,100-71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'
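I basically want to feed a DataFrame with a column of strings like the one above into a 1D CNN for binary classification, so I need to convert them to NumPy arrays before training the model.
How can I convert these strings to NumPy arrays while preserving the information carried by the "-" character between some of the numbers?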
Answer 0 (score: 1)
import numpy as np
inp = "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73-100,100-51,51,51,51-100,100-52,52,52,52,52,52,52,52,52-100,100-71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0"
arr = np.array(inp.split(","))
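This gives an array of strings. If you want to work with the values as numbers, pass dtype=np.uint8, but you must first preprocess the entries that contain - (using replace(), etc.), because an entry such as 73-100 cannot be converted to an integer directly.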
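One possible way to do that preprocessing, sketched here as an assumption rather than as what the answer above had in mind: split each entry on "-" and keep both numbers, so a plain entry a becomes (a, a) and an entry a-b becomes (a, b). The helper name encode_pairs is hypothetical, and the sketch assumes every value fits in 0-255 (required for np.uint8) and that an entry contains at most one "-".

```python
import numpy as np


def encode_pairs(text):
    """Encode 'a-b' entries as (a, b) and plain 'a' entries as (a, a)."""
    rows = []
    for token in text.split(","):
        # partition() returns (before, sep, after); 'after' is empty
        # when the token contains no '-'.
        first, _, second = token.partition("-")
        start = int(first)
        end = int(second) if second else start
        rows.append((start, end))
    # Assumes all values are in 0..255 so that uint8 is safe.
    return np.array(rows, dtype=np.uint8)


sample = "0,0,73-100,100-51,51"
pairs = encode_pairs(sample)
print(pairs.shape)
print(pairs)
```

The result has shape (sequence_length, 2), which can be treated as a sequence with two channels. Whether that is a useful representation for the classifier depends on what the "-" actually means in the data.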
Answer 1 (score: 1)
Would this be acceptable?
I'm using negative codes to make sure they don't collide with any of your location codes. You get the idea:
import numpy as np

locations = '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73-100,100-51,51,51,51-100,100-52,52,52,52,52,52,52,52,52-100,100-71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'

code = 0
mappings = {}          # 'small-large' key -> negative code
mapped_locations = []

for location in locations.split(','):
    if '-' in location:
        # Normalise the pair so that '100-51' and '51-100' share one key.
        parts = [int(part) for part in location.split('-')]
        small, large = min(parts), max(parts)
        key = f'{small}-{large}'
        if key not in mappings:
            code -= 1
            mappings[key] = code
        mapped_locations.append(mappings[key])
    else:
        mapped_locations.append(int(location))

print(np.array(mapped_locations))
print()
print(mappings)
Output:
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 -1 -2 51 51 -2 -3 52 52 52 52 52 52 52 -3 -4 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
{'73-100': -1, '51-100': -2, '52-100': -3, '71-100': -4}
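Since the question mentions a whole column of such strings, here is a rough sketch of how the same idea might be applied to several rows while sharing one mappings dictionary, so that a given pair gets the same code in every row. The function name encode_locations and the two short sample rows are made up for illustration. The sketch assumes every row has the same number of entries and that each "-" entry has exactly two parts.

```python
import numpy as np


def encode_locations(text, mappings):
    """Encode one string; 'mappings' is updated in place and shared across rows."""
    encoded = []
    for token in text.split(","):
        if "-" in token:
            # Assumes exactly two numbers per pair, e.g. '100-51'.
            small, large = sorted(int(part) for part in token.split("-"))
            key = f"{small}-{large}"
            if key not in mappings:
                # Next unused negative code: -1, -2, -3, ...
                mappings[key] = -(len(mappings) + 1)
            encoded.append(mappings[key])
        else:
            encoded.append(int(token))
    return encoded


# Hypothetical sample rows; in practice these would come from the DataFrame column.
rows = ["0,73-100,100-51,51", "51-100,52,52-100,0"]
shared = {}
batch = np.array([encode_locations(row, shared) for row in rows])

# Add a trailing channel axis: (n_samples, sequence_length, 1).
batch_3d = batch[..., np.newaxis]

print(batch)
print(batch_3d.shape)
print(shared)
```

The trailing axis is there because 1D convolution layers commonly expect input shaped (samples, steps, channels). Check what your framework expects before relying on that layout.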