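Random walk series between start and end values and within minimum/maximum limits

Time: 2017-10-26 12:35:27

Tags: python numpy random random-walk

How can I generate random walk data between a start value and an end value while never going above a maximum or below a minimum?

Here is my attempt, but for some reason the series sometimes goes above the maximum or below the minimum. The start and end values seem to be respected, but not the minimum and maximum. How can this be fixed? I would also like to specify the standard deviation of the fluctuations, but I don't know how. I use randomPerc for the fluctuation, but that is wrong because I want to specify the std.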

import numpy as np
import matplotlib.pyplot as plt

def generateRandomData(length,randomPerc, min,max,start, end):
    data_np = (np.random.random(length) - randomPerc).cumsum()
    data_np *= (max - min) / (data_np.max() - data_np.min())
    data_np += np.linspace(start - data_np[0], end - data_np[-1], len(data_np))
    return data_np

randomData=generateRandomData(length = 1000, randomPerc = 0.5, min = 50, max = 100, start = 66, end = 80)

## print values
print("Max Value",randomData.max())
print("Min Value",randomData.min())
print("Start Value",randomData[0])
print("End Value",randomData[-1])
print("Standard deviation",np.std(randomData))

## plot values
plt.figure()
plt.plot(range(randomData.shape[0]), randomData)
plt.show()
plt.close()

Here is a simple loop that checks for series that fall below the minimum or exceed the maximum. This is exactly what I want to avoid. The series should stay within the given minimum and maximum limits.

## generate 1000 series and check if there are any values over the maximum limit or under the minimum limit
for i in range(1000):
    randomData = generateRandomData(length = 1000, randomPerc = 0.5, min = 50, max = 100, start = 66, end = 80)
    if(randomData.min() < 50):
        print(i, "Value Lower than Min limit")
    if(randomData.max() > 100):
        print(i, "Value Higher than Max limit")

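4 Answers:

Answer 0 (score: 3)

Once you impose conditions on your walk, it can no longer be considered purely random. Either way, one approach is to generate the walk iteratively and check the bounds at each iteration. But if you want a vectorized solution, here it is: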

def bounded_random_walk(length, lower_bound,  upper_bound, start, end, std):
    assert (lower_bound <= start and lower_bound <= end)
    assert (start <= upper_bound and end <= upper_bound)

    bounds = upper_bound - lower_bound

    rand = (std * (np.random.random(length) - 0.5)).cumsum()
    rand_trend = np.linspace(rand[0], rand[-1], length)
    rand_deltas = (rand - rand_trend)
    rand_deltas /= np.max([1, (rand_deltas.max()-rand_deltas.min())/bounds])

    trend_line = np.linspace(start, end, length)
    upper_bound_delta = upper_bound - trend_line
    lower_bound_delta = lower_bound - trend_line

    upper_slips_mask = (rand_deltas-upper_bound_delta) >= 0
    upper_deltas =  rand_deltas - upper_bound_delta
    rand_deltas[upper_slips_mask] = (upper_bound_delta - upper_deltas)[upper_slips_mask]

    lower_slips_mask = (lower_bound_delta-rand_deltas) >= 0
    lower_deltas =  lower_bound_delta - rand_deltas
    rand_deltas[lower_slips_mask] = (lower_bound_delta + lower_deltas)[lower_slips_mask]

    return trend_line + rand_deltas

randomData = bounded_random_walk(1000, lower_bound=50, upper_bound =100, start=50, end=100, std=10)

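You can think of it as the solution to a geometric problem. trend_line connects your start and end points, and lower_bound and upper_bound define the margins around it. rand is the random walk, rand_trend is its trend line, and rand_deltas is its deviation from the rand trend line. We align the two trend lines and want to make sure that the deltas do not exceed the margins. When rand_deltas exceeds the allowed margin, we "fold" the excess back inside the bounds.

Finally, the resulting random deltas are added to the start=>end trend line, which gives the desired bounded random walk.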
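The "fold" step can be illustrated on its own. The sketch below is not part of the function above; fold_into_bounds is a hypothetical helper that mirrors any overshoot back across the bound it crossed, working on absolute values rather than on deltas from the trend line. It assumes the overshoot is smaller than the width of the band, so a single reflection is enough.

```python
import numpy as np

def fold_into_bounds(values, lower, upper):
    """Reflect values that overshoot [lower, upper] back inside (one reflection)."""
    values = np.asarray(values, dtype=float).copy()
    over = values > upper
    values[over] = 2 * upper - values[over]    # mirror the excess below the upper bound
    under = values < lower
    values[under] = 2 * lower - values[under]  # mirror the excess above the lower bound
    return values

folded = fold_into_bounds([48.0, 60.0, 103.0], lower=50.0, upper=100.0)
print(folded)  # [52. 60. 97.]
```

A value 3 above the upper bound ends up 3 below it, and a value 2 below the lower bound ends up 2 above it; values already inside are left alone.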

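The std parameter scales the random steps and therefore controls how much the random walk fluctuates.

Update: fixed the assertions.

In this version, "std" is not guaranteed to be the "interval".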
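For comparison, the iterative approach mentioned at the top of this answer might look roughly like the following. This is only a sketch under assumptions of my own: iterative_bounded_walk, step_std and seed are hypothetical names, the noise is Gaussian, each step is nudged toward the end value so the walk arrives there, and overshoots are reflected back inside the bounds. Note that step_std is the standard deviation of the per-step noise, not of the finished series.

```python
import numpy as np

def iterative_bounded_walk(length, lower, upper, start, end, step_std, seed=None):
    """Build the walk one step at a time, checking the bounds at every step."""
    assert lower < upper
    assert lower <= start <= upper and lower <= end <= upper
    rng = np.random.default_rng(seed)
    walk = np.empty(length)
    walk[0] = start
    for i in range(1, length):
        remaining = length - i  # steps left, including this one
        drift = (end - walk[i - 1]) / remaining  # gentle pull toward the end value
        candidate = walk[i - 1] + drift + rng.normal(0.0, step_std)
        # Reflect the candidate until it lies inside [lower, upper].
        while candidate < lower or candidate > upper:
            if candidate > upper:
                candidate = 2 * upper - candidate
            else:
                candidate = 2 * lower - candidate
        walk[i] = candidate
    walk[-1] = end  # pin the last value exactly
    return walk

walk = iterative_bounded_walk(1000, lower=50, upper=100, start=66, end=80,
                              step_std=1.0, seed=0)
print(walk.min(), walk.max(), walk[0], walk[-1])
```

The loop is slower than the vectorized version, but the bounds hold by construction because every value is checked before it is stored.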

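Answer 1 (score: 2)

I noticed that you use the names of built-in functions as parameters (min and max), which is not recommended (I renamed them to max_1 and min_1). Other than that, your code should work as expected: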

def generateRandomData(length,randomPerc, min_1,max_1,start, end):
    data_np = (np.random.random(length) - randomPerc).cumsum()
    data_np *= (max_1 - min_1) / (data_np.max() - data_np.min())
    data_np += np.linspace(start - data_np[0], end - data_np[-1],len(data_np))
    return data_np
randomData=generateRandomData(1000, 0.5, 50, 100, 66, 80)

If you are willing to modify your code, this will work:

import random
for_fill=[]
# generate 1000 samples within the specified range and save them in for_fill
for x in range(1000):
    generate_rnd_df=random.uniform(50,100)
    for_fill.append(generate_rnd_df)
#set starting and end point manually
for_fill[0]=60
for_fill[999]=80

Answer 2 (score: 1)

Here is one way, expressed very crudely in code.

>>> import random
>>> steps = 1000
>>> start = 66
>>> end = 80
>>> step_size = (50,100)

Generate 1,000 steps that are guaranteed to lie within the required range.

>>> crude_walk_steps = [random.uniform(*step_size) for _ in range(steps)]
>>> import numpy as np

Turn these steps into a walk, but note that it does not meet the requirements.

>>> crude_walk = np.cumsum(crude_walk_steps)
>>> min(crude_walk)
57.099056617839288
>>> max(crude_walk)
75048.948693623403

Compute a simple linear transformation that maps the walk's minimum and maximum onto the start and end values.

>>> from sympy import *
>>> var('a b')
(a, b)
>>> solve([57.099056617839288*a+b-66,75048.948693623403*a+b-80])
{b: 65.9893403510312, a: 0.000186686954219243}

Apply the transformation to the walk.

>>> walk = [0.000186686954219243*_+65.9893403510312 for _ in crude_walk]

Verify where the walk now starts and stops.

>>> min(walk)
65.999999999999986
>>> max(walk)
79.999999999999986
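The two coefficients can also be written in closed form, without a symbolic solver. The sketch below repeats the same steps with NumPy only; the seeded generator is an assumption added here for reproducibility, so the numbers will differ from the session above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
start, end = 66, 80

# Same crude walk as above: cumulative sum of 1,000 steps drawn from [50, 100).
crude_walk = np.cumsum(rng.uniform(50, 100, size=1000))

# Solve a*x0 + b = start and a*x1 + b = end for the first and last values.
a = (end - start) / (crude_walk[-1] - crude_walk[0])
b = start - a * crude_walk[0]
walk = a * crude_walk + b

print(walk[0], walk[-1])
```

Because every step is positive, the crude walk only ever increases, so its minimum and maximum are its first and last values and the rescaled walk rises steadily from 66 to 80.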

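Answer 3 (score: 1)

You can also generate a stream of random walks and filter out those that do not meet your constraints. Just be aware that after filtering they are not really "random" anymore.

The code below creates an infinite stream of "valid" random walks. Be careful with very tight constraints: the "next" call might take a while ;).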

import itertools
import numpy as np


def make_random_walk(first, last, min_val, max_val, size):
    # Generate a sequence of random steps of length `size-2`
    # that will be taken between the start and stop values.
    steps = np.random.normal(size=size-2)

    # The walk is the cumsum of those steps
    walk = steps.cumsum()

    # Performing the walk from the start value gives you your series.
    series = walk + first

    # Compare the target min and max values with the observed ones.
    target_min_max = np.array([min_val, max_val])
    observed_min_max = np.array([series.min(), series.max()])

    # Calculate the absolute 'overshoot' for min and max values
    f = np.array([-1, 1])
    overshoot = (observed_min_max*f - target_min_max*f)

    # Calculate the scale factor to constrain the walk within the
    # target min/max values.
    # Don't upscale.
    correction_base = [walk.min(), walk.max()][np.argmax(overshoot)]
    scale = min(1, (correction_base - overshoot.max()) / correction_base)

    # Generate the scaled series
    new_steps = steps * scale
    new_walk = new_steps.cumsum()
    new_series = new_walk + first

    # Check the size of the final step necessary to reach the target endpoint.
    last_step_size = abs(last - new_series[-1]) # step needed to reach desired end

    # Is it larger than the largest previously observed step?
    if last_step_size > np.abs(new_steps).max():
        # If so, consider this series invalid.
        return None
    else:
        # Else, we found a valid series that meets the constraints.
        return np.concatenate((np.array([first]), new_series, np.array([last])))


start = 66
stop = 80
max_val = 100
min_val = 50
size = 1000

# Create an infinite stream of candidate series
candidate_walks = (
    (i, make_random_walk(first=start, last=stop, min_val=min_val, max_val=max_val, size=size))
    for i in itertools.count()
)
# Filter out the invalid ones.
valid_walks = ((i, w) for i, w in candidate_walks if w is not None)

idx, walk = next(valid_walks)  # Get the next valid series
print(
    "Walk #{}: min/max({:.2f}/{:.2f})"
    .format(idx, walk.min(), walk.max())
)
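The same generate-and-filter idea can be combined with a per-step standard deviation. The sketch below is a variant under my own assumptions, not the function above: bridge_candidates, within and step_std are hypothetical names, and each candidate is pinned to the start and end values by subtracting its linear drift before the bounds are checked. itertools.islice then takes as many valid walks as needed from the infinite stream.

```python
import itertools
import numpy as np

def bridge_candidates(first, last, size, step_std, rng):
    """Yield an endless stream of random walks that start at `first` and end at `last`."""
    while True:
        steps = rng.normal(0.0, step_std, size=size - 1)
        walk = np.concatenate(([0.0], steps.cumsum()))
        # Remove the linear drift so the walk returns to 0 at the end,
        # then add the straight line from `first` to `last`.
        walk -= np.linspace(0.0, walk[-1], size)
        yield walk + np.linspace(first, last, size)

def within(walk, min_val, max_val):
    """True if the whole walk stays inside [min_val, max_val]."""
    return walk.min() >= min_val and walk.max() <= max_val

rng = np.random.default_rng(0)
candidates = bridge_candidates(first=66, last=80, size=1000, step_std=0.5, rng=rng)
valid = (w for w in candidates if within(w, min_val=50, max_val=100))

# Take the first three walks that satisfy the constraints.
first_three = list(itertools.islice(valid, 3))
for w in first_three:
    print("min/max({:.2f}/{:.2f})".format(w.min(), w.max()))
```

As with the stream above, a large step_std relative to the gap between the limits means most candidates are rejected and each draw takes longer.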