Question

I have a DataFrame like the one below:

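import pandas as pd

d = {'one': [1., 2., 3., 4., 5., 6], 'two': [4., 3., 2., 1., -1, -2]}
df = pd.DataFrame(d, index=['201305', '201305', '201307', '201307', '201307', '201308'])

When I receive a string such as '201307', I want the last value whose index is less than that string, that is, the one at index '201305'.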

How should I write the code?

Answer 1

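First, don't keep numbers as strings. Numeric operations are much faster than string operations. Second, this is an easy problem to solve: sort the index, then filter on it: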

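given_value = '201307'                        # the query string from the question
df.index = df.index.astype(int)               # compare as integers rather than strings
df.sort_index(inplace=True)                   # make sure the index is in ascending order
df[df.index < int(given_value)].iloc[-1, :]   # last row whose index is below given_value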
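
A minimal self-contained sketch of the approach above, using the data from the question. The helper name `last_row_before` is hypothetical and not part of the original answer. As illustrative assumptions, it works on a copy so the caller's frame keeps its string index, uses a stable sort so rows with equal labels keep their original order, and returns None when no index is smaller than the query.

```python
import pandas as pd

d = {'one': [1., 2., 3., 4., 5., 6], 'two': [4., 3., 2., 1., -1, -2]}
df = pd.DataFrame(d, index=['201305', '201305', '201307', '201307', '201307', '201308'])


def last_row_before(frame, given_value):
    """Return the last row whose integer index is strictly below given_value, or None."""
    out = frame.copy()                       # leave the caller's frame untouched
    out.index = out.index.astype(int)        # compare as numbers, not strings
    out = out.sort_index(kind="mergesort")   # stable sort: ties keep their original order
    below = out[out.index < int(given_value)]
    return below.iloc[-1] if len(below) else None


row = last_row_before(df, '201307')
print(row)
```

For the query '201307' this selects the second of the two rows labelled 201305.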

Answer 2

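Use Index.drop_duplicates to remove duplicated labels, keeping only the first occurrence of each, together with Index.get_loc to obtain the integer position of the given label. Subtract 1 from that position to get the previous unique label.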

>>> idx = df.index.drop_duplicates()
>>> val = idx[idx.get_loc('201307') - 1]    # <------ Insert query here
>>> val
'201305'

To get the last row before the given index string:

>>> df.loc[val].iloc[-1]
one    2.0
two    3.0
Name: 201305, dtype: float64

Pass the argument method='bfill' (or 'backfill') to handle a query that has no exact match. In that case, get_loc uses the next index value that does exist.

>>> val = idx[idx.get_loc('201306', method='bfill') - 1]   # Here, '201307' is selected 
>>> val
'201305'
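
The `method` argument of `Index.get_loc` was deprecated in pandas 1.4 and removed in 2.0, so the call above fails on current versions. As a hedged alternative, not part of the original answer, the sketch below uses `Index.searchsorted`, which covers both existing and missing labels in one call. The helper name `last_label_before` is hypothetical. It also guards the case where no smaller label exists, in which subtracting 1 from position 0 would otherwise wrap around to the last label.

```python
import pandas as pd

d = {'one': [1., 2., 3., 4., 5., 6], 'two': [4., 3., 2., 1., -1, -2]}
df = pd.DataFrame(d, index=['201305', '201305', '201307', '201307', '201307', '201308'])


def last_label_before(index, key):
    """Return the largest unique label strictly below key, or None if there is none."""
    idx = index.drop_duplicates().sort_values()   # unique labels in ascending order
    pos = idx.searchsorted(key, side='left')      # position of the first label >= key
    return idx[pos - 1] if pos > 0 else None


print(last_label_before(df.index, '201307'))   # label exists in the index
print(last_label_before(df.index, '201306'))   # label is missing from the index
print(last_label_before(df.index, '201305'))   # nothing smaller exists
```

The returned label can then be passed to `df.loc[...]` as in the answer above to fetch the matching rows.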

How to get the last value whose index is less than a specified value