Question

I have this situation: a variable named treat_conv holds the probability of A, computed as 0.1348.

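Now I am trying to use this probability to create a dataframe from the original dataframe, with one specific column drawn according to it. Is that possible? I tried using weights, but without success. Maybe I am using it wrong?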

Here is my code:

weights = np.array(treat_conv)  # creating an array with treat_conv
new_page_converted = df2.sample(n = treat_group.shape[0], weights=df2.converted(weights))
# creating new dataframe with the number of rows of treat_group; the column converted must have a 0.13 chance of bringing value 1

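So, if I use n alone, the code works: it creates a new dataframe with the correct number of rows. But I cannot get the right probability for the converted column to bring the value 1.

I hope my explanation is understandable. Thanks!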

Answer 1

You can do it like this:

import pandas as pd
import numpy as np


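# "df" stands in for your original DataFrame
df = pd.DataFrame(data=np.arange(0, 100, 1), columns=["SomeValue"])
# Randomly pick 13% of the values, without replacement
selected = pd.DataFrame(data=np.random.choice(df["SomeValue"], int(len(df["SomeValue"]) * 0.13), replace=False),
                        columns=["SomeValue"])
# Flag the picked rows
selected["Trigger"] = 1
# Merge the flag back; rows that were not picked end up with 0
df = df.merge(selected, how="left", on="SomeValue")
df["Trigger"] = df["Trigger"].fillna(0).astype(int)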

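Here "df" is your original DataFrame. The code randomly selects 13% of the values and adds a column marking them as selected. Finally, it merges everything back into the original DataFrame. Note that this flags exactly int(0.13 * len(df)) rows, rather than giving each row an independent 13% chance.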
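If the goal is instead for each sampled row to carry converted == 1 with probability 0.1348, two other approaches may be closer to what the question asks. The sketch below is only an illustration under assumptions: the toy df2, the sample size n (standing in for treat_group.shape[0]) and the seed are hypothetical, and only the value 0.1348 comes from the question. The first option draws the 0/1 column directly; the second shows how the weights argument of DataFrame.sample expects one weight per row, so the weights are spread so that all rows with converted == 1 together receive total probability p.

```python
import numpy as np
import pandas as pd

p = 0.1348   # conversion probability from the question (treat_conv)
n = 10_000   # hypothetical size, stands in for treat_group.shape[0]
rng = np.random.default_rng(42)  # seeded only for reproducibility

# Option 1: simulate the column directly, each row is 1 with probability p
simulated = pd.DataFrame({"converted": rng.choice([0, 1], size=n, p=[1 - p, p])})

# Hypothetical stand-in for the original data: 50 converted rows, 150 not
df2 = pd.DataFrame({
    "user_id": np.arange(200),
    "converted": (np.arange(200) % 4 == 0).astype(int),
})

# Option 2: resample existing rows with per-row weights.
# Rows with converted == 1 share probability p, the rest share 1 - p.
is_one = df2["converted"] == 1
weights = pd.Series(
    np.where(is_one, p / is_one.sum(), (1 - p) / (~is_one).sum()),
    index=df2.index,
)
resampled = df2.sample(n=n, replace=True, weights=weights, random_state=42)

print(simulated["converted"].mean(), resampled["converted"].mean())
```

Both means should land close to 0.1348, with some random variation. Sampling with replace=True matters in the second option: without replacement, n cannot exceed the number of rows and the share of 1s would drift as rows are used up.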

Dataframe.sample - weights - how to use?
