Equivalent of numpy.put() in pandas

Asked: 2020-02-21 14:07:29

Tags: pandas numpy

I want to create an array that is NaN everywhere except at certain indices, where I place some strings.

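import numpy as np
import pandas as pd

def to_padded_array(vals, idxs, length, fill_na):
    # Build an array of `length` fill values, then write `vals` at positions `idxs`.
    arr = np.array([fill_na] * length)
    np.put(arr, idxs, vals)
    ser = pd.Series(arr)

    return tuple(ser.tolist())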

I have some sample vals and idxs like these:

idxs = np.array([0,4,5]) # this was made to be a numpy array
vals = pd.Series(['a', 'b', np.nan], name='city') # this actually would come from a pd.agg function

Note that the initial input vals contains a NaN. If I try to set fill_na=np.nan, I get an error message:

could not convert string to float: 'a'

If I use fill_na=None, I get a mix of None and NaN, which is not what I want:

>>> to_padded_array(vals, idxs, length=6,fill_na=None)
('a', None, None, None, 'b', nan)

I was thinking of using pandas to get around this, but I have not found a pandas counterpart to numpy.put. What should I do?

1 Answer:

Answer 0 (score: 2)

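You can use Series.reindex here: build a Series whose index is idxs, then reindex it to the full range of positions. Labels that were not present are filled with NaN.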
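To see what reindex does in isolation, here is a minimal sketch with made-up data (the names s, padded and filled are only for illustration). It also shows the fill_value parameter, which could stand in for the fill_na argument from the question; note that fill_value only applies to newly introduced labels, not to NaN values already in the data.

```python
import numpy as np
import pandas as pd

# A Series that only has values at positions 0 and 4.
s = pd.Series(["a", "b"], index=[0, 4])

# Reindexing to 0..5 introduces the missing labels and fills them with NaN.
padded = s.reindex(np.arange(6))

# fill_value replaces the default NaN for the newly introduced labels.
filled = s.reindex(np.arange(6), fill_value="-")

print(padded.tolist())
print(filled.tolist())
```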

Example

def to_padded_array(vals, idxs, length):
    # Note that `vals` is a pd.Series object.
    ser = pd.Series(vals.values, index=idxs).reindex(np.arange(length))
    # if vals is an array, then vals can be used instead of vals.values 
    return tuple(ser.tolist())

to_padded_array(vals,idxs, 6)

[Out]

('a', nan, nan, nan, 'b', nan)
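If you would rather stay with numpy.put, the error in the question comes from the array's dtype: np.array([np.nan] * length) is a float64 array, so np.put tries to cast 'a' to float. A possible workaround, sketched here under the assumption that an object-dtype array is acceptable for your use case, is to create the array with dtype=object. The function name to_padded_tuple_np is hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

def to_padded_tuple_np(vals, idxs, length, fill_na=np.nan):
    # dtype=object lets one array hold both strings and float NaN,
    # so np.put no longer tries to convert 'a' to a float.
    arr = np.full(length, fill_na, dtype=object)
    np.put(arr, idxs, np.asarray(vals, dtype=object))
    return tuple(arr.tolist())

# Sample inputs from the question.
idxs = np.array([0, 4, 5])
vals = pd.Series(["a", "b", np.nan], name="city")

result = to_padded_tuple_np(vals, idxs, length=6)
print(result)
```

With the sample inputs, every slot that is not 'a' or 'b' ends up as NaN rather than None. The trade-off is that object arrays lose NumPy's vectorized numeric speed, which probably does not matter for short padded tuples like this one.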