Does pandas use NumPy as its random number generator?

Time: 2017-12-04 12:21:29

Tags: python pandas numpy random random-seed

I want to draw reproducible samples from my data. A quick experiment suggests that numpy.random.seed does affect pandas.DataFrame.sample, but this is not documented.

Does anyone know whether pandas really relies on NumPy's random number generator?

What I tried

I ran the following script several times and always got the same result:

#!/usr/bin/env python

import pandas as pd
import numpy as np


df = pd.DataFrame([(1, 2, 1),
                   (1, 2, 2),
                   (1, 2, 3),
                   (4, 1, 612),
                   (4, 1, 612),
                   (4, 1, 1),
                   (3, 2, 1),
                   ],
                  columns=['groupid', 'a', 'b'],
                  index=['India', 'France', 'England', 'Germany', 'UK', 'USA',
                         'Indonesia'])
np.random.seed(0)
print(df.sample(n=1))
print(df.sample(n=1))
print(df.sample(n=1))
print(df.sample(n=1))
print(df.sample(n=1))

The sampled rows were, in order:

  • Indonesia
  • France
  • Indonesia
  • USA
  • UK

1 answer:

Answer 0 (score: 1)

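pandas uses the _random_state helper function, which returns an np.random.RandomState, or the np.random module itself when random_state is None. That last case is why np.random.seed affects sample: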

def _random_state(state=None):
    """
    Helper function for processing random_state arguments.
    Parameters
    ----------
    state : int, np.random.RandomState, None.
        If receives an int, passes to np.random.RandomState() as seed.
        If receives an np.random.RandomState object, just returns object.
        If receives `None`, returns np.random.
        If receives anything else, raises an informative ValueError.
        Default None.
    Returns
    -------
    np.random.RandomState
    """

    if types.is_integer(state):
        return np.random.RandomState(state)
    elif isinstance(state, np.random.RandomState):
        return state
    elif state is None:
        return np.random
    else:
        raise ValueError("random_state must be an integer, a numpy "
                         "RandomState, or None")

This function is called inside sample:

    # Process random_state argument
    rs = com._random_state(random_state)
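
As a rough sketch of what this means in practice (this is not taken from the pandas source, and the small DataFrame here is made up for illustration), the three branches of _random_state correspond to three ways of getting reproducible samples. It assumes a pandas version where sample accepts an int or an np.random.RandomState for random_state, as the docstring above describes:

```python
import numpy as np
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame(
    {"b": range(7)},
    index=["India", "France", "England", "Germany", "UK", "USA", "Indonesia"],
)

# 1) Integer seed: a fresh RandomState is built on every call,
#    so the result does not depend on the global NumPy state.
first = df.sample(n=3, random_state=42)
np.random.seed(1)  # changing the global seed should not matter here
second = df.sample(n=3, random_state=42)
print(first.equals(second))

# 2) RandomState object: it is used as-is, so successive calls
#    continue drawing from the same generator.
rs = np.random.RandomState(0)
a = df.sample(n=1, random_state=rs)
b = df.sample(n=1, random_state=rs)

# 3) random_state=None: the global np.random state is used,
#    so re-seeding it replays the same draw.
np.random.seed(0)
x = df.sample(n=1)
np.random.seed(0)
y = df.sample(n=1)
print(x.equals(y))
```

Passing random_state explicitly is probably the safer choice, since the result then does not depend on other code that touches the global NumPy seed.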