Question

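For context: I am using a for loop to generate some random data (with the logic shown below) and writing it to a key ('server_avg_response_time') in a dictionary ('data_dict'). Each dictionary is copied into a list of dictionaries ('data_rows'), and the whole list is finally written to a CSV.

Code snippet used to generate the random data: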

import random

no_of_rows = 300000  # the question mentions roughly 300,000 rows
data_dict = {}       # other keys omitted in this snippet
data_rows = []

server_avg_response_time_alert = "low"
for i in range(no_of_rows):
    # Normal reading unless the 1-in-11 trigger fires or an alert is already active.
    if random.randint(0, 10) != 7 and server_avg_response_time_alert != "high":
        data_dict['server_avg_response_time'] = random.randint(1, 800000)
    else:
        if server_avg_response_time_alert == "low":
            print("***ALERT***")
            server_avg_response_time_alert = "high"

        # While the alert is active, readings stay in the high band.
        data_dict['server_avg_response_time'] = random.randint(600000, 800000)
        server_avg_response_time_period = random.randint(1, 1000)
        if server_avg_response_time_period > 980:
            print("***ALERT OFF***")
            server_avg_response_time_alert = "low"

    data_rows.append(data_dict.copy())

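This takes a lot of time (I am generating about 300,000 rows), so I was asked to look into Pandas to generate the data faster. Now I am trying to use the same logic to fill a data frame.

Question: if I put the code above in a function, can I use that function to fill a column of the data frame? What is the best way to get this data into a single data frame column? I believe that if I put the data straight into the data frame right after generating it, I do not need the dictionary (key) at all, but I do not know the syntax for that.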

Answer 1

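Try wrapping your logic (everything inside the for loop) in a function, then use the apply method to run it over an empty pandas df with a single column named 'server_avg_response_time' (300,000 rows), like this: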

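import numpy as np
import pandas as pd

def randomLogic(value):
    random_value = 0 # logic goes here
    return random_value

# Start from a single zero-filled column with 300,000 rows.
df = pd.DataFrame(np.zeros(300000), columns=['server_avg_response_time'])

# apply calls randomLogic once per element and stores the returned value.
df['server_avg_response_time'] = df.server_avg_response_time.apply(randomLogic)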
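
To make the placeholder above concrete, here is a rough sketch of how the alert logic from the question might be filled in. Because each value depends on whether an alert is currently active, the function has to remember state between calls; this sketch keeps that state in a closure. The names make_response_time_generator, next_value and build_frame, and the seed argument, are my own assumptions for illustration and are not part of the question. The "***ALERT***" prints are left out to keep it short.

```python
import random

import pandas as pd


def make_response_time_generator(rng):
    """Return a function that produces one value per call and remembers
    whether an alert is currently active (hypothetical helper)."""
    state = {"alert": "low"}

    def next_value(_=None):
        # Same test as in the question: a normal reading unless the
        # 1-in-11 trigger fires or an alert is already active.
        if rng.randint(0, 10) != 7 and state["alert"] != "high":
            return rng.randint(1, 800000)

        # Alert branch: switch the alert on and stay in the high band.
        state["alert"] = "high"
        value = rng.randint(600000, 800000)
        # Roughly a 2% chance per row of switching the alert off again.
        if rng.randint(1, 1000) > 980:
            state["alert"] = "low"
        return value

    return next_value


def build_frame(no_of_rows, seed=None):
    """Generate the values into a list and build the column in one step."""
    rng = random.Random(seed)  # seed is optional; used here for repeatability
    next_value = make_response_time_generator(rng)
    values = [next_value() for _ in range(no_of_rows)]
    # No dictionary per row is needed: one list becomes one column.
    return pd.DataFrame({"server_avg_response_time": values})


df = build_frame(300000, seed=42)
# df.to_csv("data.csv", index=False)  # hypothetical output path
```

Since next_value accepts (and ignores) one argument, it could also be passed to df.server_avg_response_time.apply(next_value) as in the snippet above. As far as I know, apply with a plain Python function still calls it once per element, so I would not expect it to be faster than the list comprehension; any saving probably comes from not copying a dictionary for every row.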

Generating programmatic data & creating a data frame from it (generating data into a single column)
