Question

I have a string:

'0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73-100,100-51,51,51,51-100,100-52,52,52,52,52,52,52,52,52-100,100-71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'

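I basically want to feed a DataFrame that has a column of strings like the one above into a 1D CNN for binary classification, so I need to convert them to NumPy arrays before training the model.

How can I convert these strings to NumPy arrays while preserving the meaning of the "-" character that appears between some of the numbers?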

Answer 1

import numpy as np

inp = "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73-100,100-51,51,51,51-100,100-52,52,52,52,52,52,52,52,52-100,100-71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0"
arr = np.array(inp.split(","))

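As written, arr is an array of strings. If you want to work with the values as numbers, pass dtype=np.uint8, but the entries containing "-" must be preprocessed first (with replace() or similar).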
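The answer does not say what the preprocessing should look like. One possible sketch, which is an assumption and not the answerer's method, is to read each "a-b" token as a (start, end) pair and each plain token as a pair with equal start and end. That keeps both numbers around the "-" and yields a two-column array. The function name encode_tokens is hypothetical, and the code assumes every value fits in 0..255 so that np.uint8 is safe.

```python
import numpy as np

def encode_tokens(text):
    """Encode each comma-separated token as a (start, end) pair."""
    pairs = []
    for token in text.split(","):
        # "73-100" -> ("73", "-", "100"); "51" -> ("51", "", "")
        start, _, end = token.partition("-")
        # A plain token has no "-", so its start and end coincide.
        pairs.append((int(start), int(end or start)))
    # Assumption: all values are in 0..255, otherwise use a wider dtype.
    return np.array(pairs, dtype=np.uint8)

sample = "0,0,73-100,100-51,51,51-100,0"
arr = encode_tokens(sample)
print(arr.shape)        # (7, 2)
print(arr[2].tolist())  # [73, 100]
```

With this layout the two columns could be treated as two input channels of a 1D convolution, though whether that suits the model is a design decision the original question leaves open.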

Answer 2

Would this be acceptable?

I am using negative codes to make sure they do not clash with any of your location codes. You get the idea:

locations = '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73-100,100-51,51,51,51-100,100-52,52,52,52,52,52,52,52,52-100,100-71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'

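import numpy as np

code = 0          # last negative code handed out
mappings = {}     # "small-large" key -> negative code
mapped_locations = []
for location in locations.split(','):
    if '-' in location:
        # Normalise "100-51" and "51-100" to the same key, "51-100"
        parts = [int(part) for part in location.split('-')]
        small, large = min(parts), max(parts)
        key = f'{small}-{large}'
        if key not in mappings:
            # First time this pair appears: assign the next negative code
            code -= 1
            mappings[key] = code
        mapped_locations.append(mappings[key])
    else:
        # Plain location codes are kept as they are
        mapped_locations.append(int(location))
print(np.array(mapped_locations))
print()
print(mappings)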

Output:

[ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0 -1 -2 51 51 -2 -3 52 52 52 52 52 52 52 -3 -4  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]

{'73-100': -1, '51-100': -2, '52-100': -3, '71-100': -4}
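Since the question is about a whole DataFrame column and not a single string, the same idea can be sketched for many rows with one shared mapping, so that a given pair gets the same negative code in every row. This is only a sketch under assumptions: encode_rows is a hypothetical helper, rows can be any iterable of strings (a DataFrame column would work), every "-" token has exactly two parts, and every row has the same number of tokens. Like the code above, it treats "73-100" and "100-73" as the same key, so the direction is discarded.

```python
import numpy as np

def encode_rows(rows):
    """Encode several location strings with one shared mapping of negative codes."""
    mappings = {}
    encoded = []
    for row in rows:
        codes = []
        for location in row.split(","):
            if "-" in location:
                # Order the two parts so "100-51" and "51-100" share a key.
                small, large = sorted(int(part) for part in location.split("-"))
                key = f"{small}-{large}"
                # A new key receives the next unused negative code.
                codes.append(mappings.setdefault(key, -(len(mappings) + 1)))
            else:
                codes.append(int(location))
        encoded.append(codes)
    # Assumption: all rows have equal length, so the result is a 2-D array.
    return np.array(encoded, dtype=np.int16), mappings

rows = ["0,73-100,100-51,51", "51-100,0,100-73,52"]
X, mappings = encode_rows(rows)
print(X.tolist())   # [[0, -1, -2, 51], [-2, 0, -1, 52]]
print(mappings)     # {'73-100': -1, '51-100': -2}
```

1D convolution layers usually expect a trailing channel axis, which can be added with X[..., np.newaxis], giving shape (2, 4, 1) for the example rows.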

Converting a string to a NumPy array
