Question

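I'm new to Python. Is there a function I can use to normalize data?

For example, I have a list of values in the range 0 - 1, e.g. [0.92323, 0.7232322, 0.93832, 0.4344433]

I want to normalize all of the values to the range 0.25 - 0.50

Thanks,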

Answer 1

You can do something along the following lines:

>>> l = [0.92323, 0.7232322, 0.93832, 0.4344433]
>>> lower, upper = 0.25, 0.5
>>> l_norm = [lower + (upper - lower) * x for x in l]
>>> l_norm
[0.4808075, 0.43080805, 0.48458, 0.35861082499999997]

Answer 2

You can use sklearn.preprocessing for many kinds of preprocessing tasks, including normalization.
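As one possible illustration, `MinMaxScaler` from that module accepts a `feature_range` argument. Note that it rescales using the minimum and maximum found in the data itself, so the smallest value maps to 0.25 and the largest to 0.5; that is not the same as assuming the input spans 0 to 1. A minimal sketch, assuming scikit-learn and NumPy are installed:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = [0.92323, 0.7232322, 0.93832, 0.4344433]

# MinMaxScaler expects a 2-D array of shape (n_samples, n_features),
# so the flat list is reshaped into a single column.
column = np.array(values).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0.25, 0.5))
# fit_transform learns the data's min/max and rescales in one step;
# ravel() flattens the result back to one dimension.
scaled = scaler.fit_transform(column).ravel()
print(scaled)
```

Here the smallest input (0.4344433) ends up at 0.25 and the largest (0.93832) at 0.5.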

Answer 3

The following function handles the general case:

def normalize(values, bounds):
    return [bounds['desired']['lower'] + (x - bounds['actual']['lower']) * (bounds['desired']['upper'] - bounds['desired']['lower']) / (bounds['actual']['upper'] - bounds['actual']['lower']) for x in values]

Usage:

normalize(
    [0.92323, 0.7232322, 0.93832, 0.4344433],
    {'actual': {'lower': 0, 'upper': 1}, 'desired': {'lower': 0.25, 'upper': 0.5}}
) # [0.4808075, 0.43080805, 0.48458, 0.35861082499999997]

normalize(
    [5, 7.5, 10, 12.5, 15],
    {'actual':{'lower':5,'upper':15},'desired':{'lower':1,'upper':2}}
) # [1.0, 1.25, 1.5, 1.75, 2.0]

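I chose a two-level dict as the argument, but you could pass the bounds in other ways, for example as two separate tuples, one for the actual bounds and one for the desired bounds, with the lower bound as the first element and the upper bound as the second: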

def normalize(values, actual_bounds, desired_bounds):
    return [desired_bounds[0] + (x - actual_bounds[0]) * (desired_bounds[1] - desired_bounds[0]) / (actual_bounds[1] - actual_bounds[0]) for x in values]

Usage:

normalize(
    [0.92323, 0.7232322, 0.93832, 0.4344433],
    (0,1),
    (0.25,0.5)
) # [0.4808075, 0.43080805, 0.48458, 0.35861082499999997]

normalize(
    [5, 7.5, 10, 12.5, 15],
    (5,15),
    (1,2)
) # [1.0, 1.25, 1.5, 1.75, 2.0]

Answer 4

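The approaches suggested in the other posts do not work in many cases. To normalize the values in a list, you need to determine the equation of the line connecting the points (a, b) and (c, d), where "a" is the largest number in the list, "b" is the larger of the two range values, "c" is the smallest number in the list, and "d" is the smaller of the two range values. Once you have the equation of the line, you can plug the list's values into it and solve.

I have written code for the above transformation: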

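import numpy as np

def normalize(values, bounds):  # bounds should be (lower_bound, upper_bound)
    l = np.array(values)
    a = np.max(l)  # largest value in the data
    c = np.min(l)  # smallest value in the data
    b = bounds[1]  # upper end of the target range
    d = bounds[0]  # lower end of the target range

    m = (b - d) / (a - c)  # slope of the line through (c, d) and (a, b)
    pslope = (m * (l - c)) + d
    return pslope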
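
One caveat with the line-through-two-points approach: if every value in the list is the same, a - c is zero and the slope is undefined. Below is a minimal pure-Python sketch of the same idea with a guard for that case. The function name `rescale` is hypothetical, and mapping a constant list to the midpoint of the target range is my own assumption; other conventions (such as returning the lower bound) are just as defensible.

```python
def rescale(values, lower, upper):
    """Linearly map values so that min(values) -> lower and max(values) -> upper."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values are equal, so the slope is undefined.
        # Assumption: place every value at the midpoint of the target range.
        return [(lower + upper) / 2 for _ in values]
    slope = (upper - lower) / (hi - lo)
    return [lower + slope * (x - lo) for x in values]


scaled = rescale([5, 7.5, 10, 12.5, 15], 1, 2)
print(scaled)
```

With the sample input, the smallest value (5) maps to 1 and the largest (15) maps to 2, with the rest spaced evenly in between.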

Normalizing data to a specific range of values

4 answers: