How can I generate random-walk data between a start and an end value without going above a maximum or below a minimum?
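Here is my attempt, but for some reason the series sometimes goes above the maximum or below the minimum. The start and end values seem to be respected, but the minimum and maximum are not. How can this be fixed? I would also like to specify the standard deviation of the fluctuations, but I don't know how. I am using randomPerc for the fluctuations, which is wrong, because what I want to specify is the std.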
import numpy as np
import matplotlib.pyplot as plt
def generateRandomData(length, randomPerc, min, max, start, end):
    data_np = (np.random.random(length) - randomPerc).cumsum()
    data_np *= (max - min) / (data_np.max() - data_np.min())
    data_np += np.linspace(start - data_np[0], end - data_np[-1], len(data_np))
    return data_np
randomData=generateRandomData(length = 1000, randomPerc = 0.5, min = 50, max = 100, start = 66, end = 80)
## print values
print("Max Value",randomData.max())
print("Min Value",randomData.min())
print("Start Value",randomData[0])
print("End Value",randomData[-1])
print("Standard deviation",np.std(randomData))
## plot values
plt.figure()
plt.plot(range(randomData.shape[0]), randomData)
plt.show()
plt.close()
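Here is a simple loop that checks for series that fall below the minimum or rise above the maximum. That is exactly what I want to avoid. The series should stay between the given minimum and maximum limits.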
## generate 1000 series and check if there are any values over the maximum limit or under the minimum limit
for i in range(1000):
    randomData = generateRandomData(length = 1000, randomPerc = 0.5, min = 50, max = 100, start = 66, end = 80)
    if(randomData.min() < 50):
        print(i, "Value Lower than Min limit")
    if(randomData.max() > 100):
        print(i, "Value Higher than Max limit")
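Answer 0 (score: 3)
Once you impose conditions on your walk, it can no longer be considered purely random. Anyway, one way is to generate the walk iteratively and check the bounds at each iteration. But if you want a vectorized solution, here it is: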
import numpy as np

def bounded_random_walk(length, lower_bound, upper_bound, start, end, std):
    assert (lower_bound <= start and lower_bound <= end)
    assert (start <= upper_bound and end <= upper_bound)

    bounds = upper_bound - lower_bound

    # Random walk, its own trend line, and the deviations from that trend
    rand = (std * (np.random.random(length) - 0.5)).cumsum()
    rand_trend = np.linspace(rand[0], rand[-1], length)
    rand_deltas = (rand - rand_trend)
    # Shrink the deviations if their range is wider than the allowed band
    rand_deltas /= np.max([1, (rand_deltas.max()-rand_deltas.min())/bounds])

    # Margins between the start=>end trend line and each bound
    trend_line = np.linspace(start, end, length)
    upper_bound_delta = upper_bound - trend_line
    lower_bound_delta = lower_bound - trend_line

    # Fold anything above the upper margin back inside
    upper_slips_mask = (rand_deltas-upper_bound_delta) >= 0
    upper_deltas = rand_deltas - upper_bound_delta
    rand_deltas[upper_slips_mask] = (upper_bound_delta - upper_deltas)[upper_slips_mask]

    # Fold anything below the lower margin back inside
    lower_slips_mask = (lower_bound_delta-rand_deltas) >= 0
    lower_deltas = lower_bound_delta - rand_deltas
    rand_deltas[lower_slips_mask] = (lower_bound_delta + lower_deltas)[lower_slips_mask]

    return trend_line + rand_deltas

randomData = bounded_random_walk(1000, lower_bound=50, upper_bound=100, start=50, end=100, std=10)
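You can see it as the solution to a geometric problem. trend_line connects your start and end points, and lower_bound and upper_bound define the margins around it. rand is the random walk, rand_trend is its trend line, and rand_deltas is the deviation of rand from that trend line. We align the two trend lines and want to make sure the deltas do not exceed the margins. When rand_deltas exceeds the allowed margin, we "fold" the excess back inside the bounds.
Finally, the resulting random deltas are added to the start=>end trend line, which gives the desired bounded random walk.
The std parameter corresponds to the amount of variance in the random walk.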
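To make the role of std more concrete: in the function above each step is std * (u - 0.5) with u uniform on [0, 1), so the standard deviation of a single step is std / sqrt(12), not std itself. The short check below is only a sketch; the seeded np.random.default_rng generator is an assumption added for reproducibility and is not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator (assumption, for reproducibility)
std = 10

# Same step construction as in bounded_random_walk: a centered uniform scaled by std
steps = std * (rng.random(1_000_000) - 0.5)

empirical = steps.std()
theoretical = std / np.sqrt(12)  # a uniform variable on [0, 1) has variance 1/12
print(empirical, theoretical)
```

If steps whose standard deviation is exactly std are wanted, one option is to draw them with rng.normal(0, std, length) instead. The later rescaling and folding would still change the spread of the final series.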
Update: fixed the assertions.
In this version, "std" is not guaranteed to be the "interval".
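The iterative alternative mentioned at the start of this answer could look roughly like the sketch below. It is not the answerer's code: the name iterative_bounded_walk, the Gaussian steps, and the reflect-at-the-bound rule are all assumptions chosen for illustration, and this version does not steer the walk toward a particular end value.

```python
import numpy as np

def iterative_bounded_walk(length, lower_bound, upper_bound, start, std, rng=None):
    """Step-by-step walk with Gaussian steps, reflected back at the bounds."""
    assert lower_bound < upper_bound
    assert lower_bound <= start <= upper_bound
    rng = np.random.default_rng() if rng is None else rng

    walk = np.empty(length)
    walk[0] = start
    for i in range(1, length):
        value = walk[i - 1] + rng.normal(0.0, std)
        # Reflect the excess back inside; repeat in case a very large
        # step overshoots the opposite bound after one reflection.
        while value < lower_bound or value > upper_bound:
            if value > upper_bound:
                value = 2 * upper_bound - value
            else:
                value = 2 * lower_bound - value
        walk[i] = value
    return walk

walk = iterative_bounded_walk(1000, 50, 100, 66, std=2.0, rng=np.random.default_rng(1))
print(walk.min(), walk.max())
```

Here std is the standard deviation of each individual step before reflection, so it can be set directly, at the cost of a Python-level loop.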
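Answer 1 (score: 2)
I noticed that you use the names of built-in functions as parameters (min and max), which is not recommended (I changed them to max_1 and min_1). Apart from that, your code should work as expected: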
def generateRandomData(length, randomPerc, min_1, max_1, start, end):
    data_np = (np.random.random(length) - randomPerc).cumsum()
    data_np *= (max_1 - min_1) / (data_np.max() - data_np.min())
    data_np += np.linspace(start - data_np[0], end - data_np[-1], len(data_np))
    return data_np
randomData=generateRandomData(1000, 0.5, 50, 100, 66, 80)
If you are willing to modify your code, this will work:
import random
for_fill=[]
# generate 1000 samples within the specified range and save them in for_fill
for x in range(1000):
    generate_rnd_df=random.uniform(50,100)
    for_fill.append(generate_rnd_df)
#set starting and end point manually
for_fill[0]=60
for_fill[999]=80
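The loop above can also be written in vectorized form. This is a minimal sketch that assumes NumPy is acceptable here (the answer itself uses the standard random module). The samples are drawn independently of one another, so consecutive values are unrelated.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)

# Draw 1000 independent samples in [50, 100), as the loop does with random.uniform
for_fill = rng.uniform(50, 100, size=1000)

# Set the start and end points manually
for_fill[0] = 60
for_fill[-1] = 80
```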
Answer 2 (score: 1)
Here is one way, expressed very crudely in code.
>>> import random
>>> steps = 1000
>>> start = 66
>>> end = 80
>>> step_size = (50,100)
Generate 1,000 steps, each guaranteed to be within the required range.
>>> crude_walk_steps = [random.uniform(*step_size) for _ in range(steps)]
>>> import numpy as np
Turn these steps into a walk, but note that it does not meet the requirements.
>>> crude_walk = np.cumsum(crude_walk_steps)
>>> min(crude_walk)
57.099056617839288
>>> max(crude_walk)
75048.948693623403
Work out a simple linear transformation to scale the steps.
>>> from sympy import *
>>> var('a b')
(a, b)
>>> solve([57.099056617839288*a+b-66,75048.948693623403*a+b-80])
{b: 65.9893403510312, a: 0.000186686954219243}
Scale the steps.
>>> walk = [0.000186686954219243*_+65.9893403510312 for _ in crude_walk]
Verify that the walk now starts and stops where intended.
>>> min(walk)
65.999999999999986
>>> max(walk)
79.999999999999986
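The two-equation system solved with sympy above has a closed-form solution, so the same rescaling can be sketched without a symbolic solver. The helper name rescale_walk is hypothetical. As in the answer, it maps the smallest value of the crude walk to start and the largest to end, which coincide with the first and last points here because every step is positive.

```python
import numpy as np

def rescale_walk(crude_walk, start, end):
    """Linearly map min(crude_walk) -> start and max(crude_walk) -> end."""
    crude_walk = np.asarray(crude_walk, dtype=float)
    lo, hi = crude_walk.min(), crude_walk.max()
    a = (end - start) / (hi - lo)  # slope of the linear map
    b = start - a * lo             # intercept
    return a * crude_walk + b

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
crude_walk = np.cumsum(rng.uniform(50, 100, size=1000))
walk = rescale_walk(crude_walk, start=66, end=80)
print(walk[0], walk[-1])
```

Because all steps are positive, the rescaled series rises monotonically from start to end.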
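Answer 3 (score: 1)
You can also generate a stream of random walks and filter out those that do not meet your constraints. Just be aware that, once filtered, they are no longer really "random".
The code below creates an infinite stream of "valid" random walks. Be careful with very tight constraints: the "next" call might take a while ;).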
import itertools
import numpy as np
def make_random_walk(first, last, min_val, max_val, size):
    # Generate a sequence of random steps of length `size-2`
    # that will be taken between the start and stop values.
    steps = np.random.normal(size=size-2)

    # The walk is the cumsum of those steps
    walk = steps.cumsum()

    # Calculate the scale factor to constrain the walk within the
    # target min/max values. Each side is checked separately and the
    # smaller factor is used, so that both bounds hold.
    # Don't upscale.
    scale = 1.0
    if walk.max() > max_val - first:
        scale = min(scale, (max_val - first) / walk.max())
    if walk.min() < min_val - first:
        scale = min(scale, (min_val - first) / walk.min())

    # Generate the scaled series
    new_steps = steps * scale
    new_walk = new_steps.cumsum()
    new_series = new_walk + first

    # Check the size of the final step necessary to reach the target endpoint.
    last_step_size = abs(last - new_series[-1])  # step needed to reach desired end

    # Is it larger than the largest previously observed step?
    if last_step_size > np.abs(new_steps).max():
        # If so, consider this series invalid.
        return None
    else:
        # Else, we found a valid series that meets the constraints.
        return np.concatenate((np.array([first]), new_series, np.array([last])))
start = 66
stop = 80
max_val = 100
min_val = 50
size = 1000
# Create an infinite stream of candidate series
candidate_walks = (
    (i, make_random_walk(first=start, last=stop, min_val=min_val, max_val=max_val, size=size))
    for i in itertools.count()
)
# Filter out the invalid ones.
valid_walks = ((i, w) for i, w in candidate_walks if w is not None)
idx, walk = next(valid_walks)  # Get the next valid series
print(
    "Walk #{}: min/max({:.2f}/{:.2f})"
    .format(idx, walk.min(), walk.max())
)
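The filtering idea can also be reduced to its core: generate candidates lazily and keep only those that satisfy a predicate. The sketch below is a simplified variant, not the function above. candidate_walk and in_bounds are hypothetical names, the step standard deviation of 0.5 is an arbitrary assumption, and the end value is left unconstrained.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)

def candidate_walk(first, size, step_std):
    """Unconstrained Gaussian random walk of `size` points starting at `first`."""
    steps = rng.normal(0.0, step_std, size=size - 1)
    return np.concatenate(([first], first + steps.cumsum()))

def in_bounds(walk, min_val, max_val):
    """True if every point of the walk lies within [min_val, max_val]."""
    return walk.min() >= min_val and walk.max() <= max_val

# Infinite lazy stream of candidates, then a lazy filter on top of it
candidates = (candidate_walk(66, 1000, 0.5) for _ in itertools.count())
valid = (w for w in candidates if in_bounds(w, 50, 100))

# Take the first three accepted walks from the stream
accepted = list(itertools.islice(valid, 3))
for w in accepted:
    print("min/max({:.2f}/{:.2f})".format(w.min(), w.max()))
```

Rejecting whole walks leaves the accepted ones unmodified, but the acceptance rate drops quickly as the bounds tighten or the steps grow.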