Given the following table:
   vals
0    20
1     3
2     2
3    10
4    20
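I'm trying to find a clean solution in pandas to subtract a value, for example 30, working down from the first row and carrying whatever is left over to the next row, so that I end up with the following result:
   vals
0     0
1     0
2     0
3     5
4    20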
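I'm wondering whether pandas has a way to do this that doesn't require looping over every row of the DataFrame and that takes advantage of pandas' bulk (vectorized) operations.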
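To pin down the intended behaviour, here is a row-by-row reference version, i.e. exactly the kind of loop the question wants to avoid. It is only a sketch for checking vectorized answers against; the helper name `subtract_sequentially` is made up for illustration, and the values are assumed to be non-negative.

```python
import pandas as pd

df = pd.DataFrame({"vals": [20, 3, 2, 10, 20]})

def subtract_sequentially(values, amount):
    """Hypothetical reference helper: consume `amount` row by row, top to bottom."""
    remaining = amount
    out = []
    for v in values:
        taken = min(v, remaining)   # take as much as this row can give
        out.append(v - taken)       # what is left in this row
        remaining -= taken          # what still has to be subtracted further down
    return out

expected = subtract_sequentially(df["vals"].tolist(), 30)
print(expected)  # [0, 0, 0, 5, 20]
```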
Answer 0 (score: 6)
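c = df.vals.cumsum()          # running total of vals
m = c.ge(30)                  # True from the first row where the running total reaches 30
i = m.idxmax()                # index label of that first True row
n = df.vals.where(m, 0)       # keep values from that row on, zero out the rows before it
n.loc[i] = c.loc[i] - 30      # the boundary row keeps only what is left after subtracting 30
df.assign(vals=n)             # returns a new DataFrame; df itself is not modified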
   vals
0     0
1     0
2     0
3     5
4    20
Same thing, but numpyfied:
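import numpy as np

v = df.vals.values            # NumPy array behind the column
c = v.cumsum()                # running total
m = c >= 30                   # True from the first row where the running total reaches 30
i = m.argmax()                # position of that first True row
n = np.where(m, v, 0)         # new array: zeros before the boundary, original values after
n[i] = c[i] - 30              # the boundary row keeps only the remainder
df.assign(vals=n)             # returns a new DataFrame; df itself is not modified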
   vals
0     0
1     0
2     0
3     5
4    20
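One caveat with both snippets above: if the column sums to less than 30, the mask is all False, `idxmax`/`argmax` falls back to the first row, and that row ends up with a negative value. A possible alternative, not taken from the answer, is to clip the shifted running total at zero and difference it back into per-row values. The sketch below assumes all values are non-negative, and the helper name `subtract_from_top` is hypothetical.

```python
import numpy as np
import pandas as pd

def subtract_from_top(vals, amount):
    """Hypothetical helper: consume `amount` from the leading rows of `vals`.

    Assumes every value is non-negative.
    """
    c = np.cumsum(np.asarray(vals))        # running total
    left = np.clip(c - amount, 0, None)    # running total that survives the subtraction
    return np.diff(left, prepend=0)        # turn the running total back into per-row values

df = pd.DataFrame({"vals": [20, 3, 2, 10, 20]})
result = df.assign(vals=subtract_from_top(df["vals"], 30))
print(result["vals"].tolist())  # [0, 0, 0, 5, 20]
```

When the total is smaller than the amount, every row simply becomes 0 rather than going negative.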
Timing
%%timeit
v = df.vals.values
c = v.cumsum()
m = c >= 30
i = m.argmax()
n = np.where(m, v, 0)
n[i] = c[i] - 30
df.assign(vals=n)
10000 loops, best of 3: 168 µs per loop
%%timeit
c = df.vals.cumsum()
m = c.ge(30)
i = m.idxmax()
n = df.vals.where(m, 0)
n.loc[i] = c.loc[i] - 30
df.assign(vals=n)
1000 loops, best of 3: 853 µs per loop
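The numbers above come from IPython's `%%timeit` on the 5-row example, so they mostly reflect per-call overhead. To re-measure outside IPython, something like the following sketch with the standard-library `timeit` module should work; the wrapper names `pandas_version` and `numpy_version` are made up here, and the absolute figures will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({"vals": [20, 3, 2, 10, 20]})

def pandas_version(df, amount=30):
    # pandas approach from the answer, wrapped in a function (hypothetical name)
    c = df.vals.cumsum()
    m = c.ge(amount)
    i = m.idxmax()
    n = df.vals.where(m, 0)
    n.loc[i] = c.loc[i] - amount
    return df.assign(vals=n)

def numpy_version(df, amount=30):
    # NumPy approach from the answer, wrapped in a function (hypothetical name)
    v = df.vals.values
    c = v.cumsum()
    m = c >= amount
    i = m.argmax()
    n = np.where(m, v, 0)
    n[i] = c[i] - amount
    return df.assign(vals=n)

for fn in (pandas_version, numpy_version):
    seconds = timeit.timeit(lambda: fn(df), number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```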
Answer 1 (score: 4)
Here's a four-liner using NumPy:
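v = df.vals.values               # array behind the column; in the sample run, writing to it changes df
a = v.cumsum()-30                # running total minus the amount to subtract
idx = (a>0).argmax()+1           # one past the first row whose running total exceeds 30
v[:idx] = a.clip(min=0)[:idx]    # rows before the boundary become 0, the boundary row keeps the remainder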
Sample run:
In [274]: df # Original df
Out[274]:
   vals
0    20
1     3
2     2
3    10
4    20
In [275]: df.iloc[3,0] = 7 # Bringing in some variety
In [276]: df
Out[276]:
   vals
0    20
1     3
2     2
3     7
4    20
In [277]: v = df.vals.values
...: a = v.cumsum()-30
...: idx = (a>0).argmax()+1
...: v[:idx] = a.clip(min=0)[:idx]
...:
In [278]: df
Out[278]:
   vals
0     0
1     0
2     0
3     2
4    20
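Note that the four-liner edits `df` in place, because it writes into the array returned by `df.vals.values`. That relies on the array being a writable view of the column, which newer pandas versions with Copy-on-Write enabled do not provide. A variant that works on an explicit copy and leaves the original frame alone might look like this (a sketch, using the modified sample data from the run above):

```python
import pandas as pd

df = pd.DataFrame({"vals": [20, 3, 2, 7, 20]})

v = df["vals"].to_numpy(copy=True)   # explicit copy, so df is never written to
a = v.cumsum() - 30                  # running total minus the amount to subtract
idx = (a > 0).argmax() + 1           # one past the first row whose running total exceeds 30
v[:idx] = a.clip(min=0)[:idx]        # zeros up to the boundary, remainder on the boundary row
out = df.assign(vals=v)              # new DataFrame holding the result

print(out["vals"].tolist())  # [0, 0, 0, 2, 20]
```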