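I really like chaining Array.prototype.map, filter and reduce to define data transformations. Unfortunately, in a recent project involving large log files, I could no longer get away with looping over my data multiple times...
I want to create a function that chains .filter and .map methods but, instead of mapping over the array immediately, composes a function that loops over the data once. I.e.: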
const DataTransformation = () => ({
map: fn => (/* ... */),
filter: fn => (/* ... */),
run: arr => (/* ... */)
});
const someTransformation = DataTransformation()
.map(x => x + 1)
.filter(x => x > 3)
.map(x => x / 2);
// returns [ 2, 2.5 ] without creating [ 2, 3, 4, 5] and [4, 5] in between
const myData = someTransformation.run([ 1, 2, 3, 4]);
Inspired by this answer and this blogpost, I started writing a Transduce function.
const filterer = pred => reducer => (acc, x) =>
pred(x) ? reducer(acc, x) : acc;
const mapper = map => reducer => (acc, x) =>
reducer(acc, map(x));
const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
map: map => Transduce(mapper(map)(reducer)),
filter: pred => Transduce(filterer(pred)(reducer)),
run: arr => arr.reduce(reducer, [])
});
The problem with the Transduce snippet above is that it runs "backwards": the last method I chain is the first one to be executed:
const someTransformation = Transduce()
.map(x => x + 1)
.filter(x => x > 3)
.map(x => x / 2);
// Instead of [ 2, 2.5 ] this returns []
// starts with (x / 2) -> [0.5, 1, 1.5, 2]
// then filters (x > 3) -> []
const myData = someTransformation.run([ 1, 2, 3, 4]);
Or, in more abstract terms:
Go from:
Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, f(g(x)))
To:
Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, g(f(x)))
Which is similar to:
mapper(f) (mapper(g) (concat))
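I think I understand why this happens, but I can't figure out how to fix it without changing the "interface" of my function.
How can I make my Transduce method chain filter and map operations in the correct order?
If I've misused the term Transduce, or if there are better ways to describe the problem, please let me know. I'm aware that I can do the same thing with a for loop: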
const push = (acc, x) => (acc.push(x), acc);
const ActionChain = (actions = []) => {
const run = arr =>
arr.reduce((acc, x) => {
for (let i = 0, action; i < actions.length; i += 1) {
action = actions[i];
if (action.type === "FILTER") {
if (action.fn(x)) {
continue;
}
return acc;
} else if (action.type === "MAP") {
x = action.fn(x);
}
}
acc.push(x);
return acc;
}, []);
const addAction = type => fn =>
ActionChain(push(actions, { type, fn }));
return {
map: addAction("MAP"),
filter: addAction("FILTER"),
run
};
};
// Compare to regular chain to check if
// there's a performance gain
// Admittedly, in this example, it's quite small...
const naiveApproach = {
run: arr =>
arr
.map(x => x + 3)
.filter(x => x % 3 === 0)
.map(x => x / 3)
.filter(x => x < 40)
};
const actionChain = ActionChain()
.map(x => x + 3)
.filter(x => x % 3 === 0)
.map(x => x / 3)
.filter(x => x < 40)
const testData = Array.from(Array(100000), (x, i) => i);
console.time("naive");
const result1 = naiveApproach.run(testData);
console.timeEnd("naive");
console.time("chain");
const result2 = actionChain.run(testData);
console.timeEnd("chain");
console.log("equal:", JSON.stringify(result1) === JSON.stringify(result2));
const filterer = pred => reducer => (acc, x) =>
pred(x) ? reducer(acc, x) : acc;
const mapper = map => reducer => (acc, x) => reducer(acc, map(x));
const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
map: map => Transduce(mapper(map)(reducer)),
filter: pred => Transduce(filterer(pred)(reducer)),
run: arr => arr.reduce(reducer, [])
});
const sameDataTransformation = Transduce()
.map(x => x + 5)
.filter(x => x % 2 === 0)
.map(x => x / 2)
.filter(x => x < 4);
// It's backwards:
// [-1, 0, 1, 2, 3]
// [-0.5, 0, 0.5, 1, 1.5]
// [0]
// [5]
console.log(sameDataTransformation.run([-1, 0, 1, 2, 3, 4, 5]));
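Answer 0 (score: 7)
before we know better
"I really like chaining ..."
I see, and I'll appease you, but you'll come to understand that forcing your program through a chaining API is unnatural, and in most cases more trouble than it's worth.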
const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({ map: map => Transduce(mapper(map)(reducer)), filter: pred => Transduce(filterer(pred)(reducer)), run: arr => arr.reduce(reducer, []) });
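"I think I understand why it happens, but I can't figure out how to fix it without changing the "interface" of my function."
The problem is indeed with your Transduce constructor. Your map and filter methods stack map and pred on the outside of the transducer chain, instead of nesting them inside.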
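To make the difference concrete, here is a small sketch that builds the two-step chain .map(f).map(g) both ways and records the order in which the steps fire. The names outer, inner, and order are mine and only serve the illustration; mapper and push are the helpers from the question.

```javascript
const push = (acc, x) => (acc.push(x), acc);
const mapper = map => reducer => (acc, x) => reducer(acc, map(x));

const order = []; // records which step ran, in sequence
const f = x => (order.push('f'), x + 1);
const g = x => (order.push('g'), x * 10);

// What the question's Transduce builds for .map(f).map(g):
// every new step wraps the reducer built so far, so the newest step runs first.
const outer = mapper(g)(mapper(f)(push));

// What is wanted: the first step is outermost and later steps nest inside it.
const inner = mapper(f)(mapper(g)(push));

const outerResult = [1].reduce(outer, []);
const outerOrder = order.splice(0); // take the log and reset it
const innerResult = [1].reduce(inner, []);
const innerOrder = order.splice(0);

console.log(outerOrder, outerResult); // [ 'g', 'f' ] [ 11 ]  i.e. f(g(1))
console.log(innerOrder, innerResult); // [ 'f', 'g' ] [ 20 ]  i.e. g(f(1))
```

So the fix has to make each newly chained step end up innermost, which is what passing a continuation around achieves.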
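Below, I've implemented your Transduce API so that it evaluates the maps and filters in the correct order. I've also added a log method so that we can see how Transduce behaves: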
const Transduce = (f = k => k) => ({
map: g =>
Transduce(k =>
f ((acc, x) => k(acc, g(x)))),
filter: g =>
Transduce(k =>
f ((acc, x) => g(x) ? k(acc, x) : acc)),
log: s =>
Transduce(k =>
f ((acc, x) => (console.log(s, x), k(acc, x)))),
run: xs =>
xs.reduce(f((acc, x) => acc.concat(x)), [])
})
const foo = nums => {
return Transduce()
.log('greater than 2?')
.filter(x => x > 2)
.log('\tsquare:')
.map(x => x * x)
.log('\t\tless than 30?')
.filter(x => x < 30)
.log('\t\t\tpass')
.run(nums)
}
// keep square(n), forall n of nums
// where n > 2
// where square(n) < 30
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
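As a sanity check, the sketch below re-runs the example from the question against the same implementation, with log left out for brevity. The calls counter is something I added purely to observe that each input element enters the chain exactly once, i.e. the data is traversed in a single pass.

```javascript
// Same shape as the Transduce above, minus log
const Transduce = (f = k => k) => ({
  map: g => Transduce(k => f((acc, x) => k(acc, g(x)))),
  filter: g => Transduce(k => f((acc, x) => g(x) ? k(acc, x) : acc)),
  run: xs => xs.reduce(f((acc, x) => acc.concat(x)), [])
});

let calls = 0; // hypothetical counter, only here to observe the single pass

const someTransformation = Transduce()
  .map(x => (calls += 1, x + 1))
  .filter(x => x > 3)
  .map(x => x / 2);

const myData = someTransformation.run([1, 2, 3, 4]);

console.log(myData); // [ 2, 2.5 ]
console.log(calls);  // 4
```

Note that acc.concat(x) returns a new array on every step and flattens elements that are themselves arrays; for very large inputs, a mutating push like the one in the question may be the better final reducer.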
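untapped potential
"Inspired by this answer ..."
In reading that answer I wrote, you overlooked the generic quality of Trans. Here, our Transduce only attempts to work with Arrays, but really it can work with any type that has an empty value ([]) and a concat method. These two properties make up a category called Monoids, and we'd be doing ourselves a disservice if we didn't take advantage of a transducer's ability to work with any type in this category.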
Above, we hard-coded the initial accumulator [] in the run method, but this should probably be supplied as an argument, much like we do with iterable.reduce(reducer, initialAcc).
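A minimal sketch of that change, assuming we simply let run take the initial accumulator as its first argument. This two-argument run is my own variation for illustration, not the signature used above.

```javascript
const Transduce = (f = k => k) => ({
  map: g => Transduce(k => f((acc, x) => k(acc, g(x)))),
  filter: g => Transduce(k => f((acc, x) => g(x) ? k(acc, x) : acc)),
  // init can be any value that has a concat method
  run: (init, xs) => xs.reduce(f((acc, x) => acc.concat(x)), init)
});

const shout = Transduce()
  .filter(s => s.length > 1)
  .map(s => s.toUpperCase());

const words = ['a', 'bc', 'def'];

const asArray = shout.run([], words);
const asString = shout.run('', words);

console.log(asArray);  // [ 'BC', 'DEF' ]
console.log(asString); // BCDEF
```

The same chain produces an array or a string depending only on the accumulator it is started with.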
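Other than that, both implementations are essentially equivalent. The biggest difference is that the Trans implementation provided in the linked answer is itself a monoid, but Transduce here is not. Trans neatly implements composition of transducers in its concat method, whereas Transduce (above) has composition mixed into each method. Making it a monoid lets us reason about Trans the same way we reason about every other monoid, rather than having to understand it as some specialized chaining interface with unique map, filter, and run methods.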
I would advise building from Trans instead of making your own custom API.
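For reference, here is a sketch of what working with Trans directly looks like, using the Trans definition given further down in this answer. The steps inc, even, and half and the helper runArray are placeholders of mine. The last few lines spot-check the identity and associativity properties expected of a monoid, for one input only, so treat it as an illustration rather than a proof.

```javascript
// Trans monoid, as defined in this answer
const Trans = f => ({
  runTrans: f,
  concat: ({ runTrans: g }) => Trans(k => f(g(k)))
});
Trans.empty = () => Trans(k => k);

const mapper = f => Trans(k => (acc, x) => k(acc, f(x)));
const filterer = f => Trans(k => (acc, x) => f(x) ? k(acc, x) : acc);

// hypothetical helper: run a Trans over an array, collecting into an array
const runArray = (t, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), []);

const inc = mapper(x => x + 1);
const even = filterer(x => x % 2 === 0);
const half = mapper(x => x / 2);
const xs = [1, 2, 3, 4, 5];

// composition is just concat, and steps run left to right
const leftGrouped = runArray(inc.concat(even).concat(half), xs);
console.log(leftGrouped); // [ 1, 2, 3 ]

// identity: concatenating empty on either side changes nothing
const leftId = runArray(Trans.empty().concat(inc), xs);
const rightId = runArray(inc.concat(Trans.empty()), xs);
console.log(leftId, rightId); // [ 2, 3, 4, 5, 6 ] [ 2, 3, 4, 5, 6 ]

// associativity: grouping does not matter
const rightGrouped = runArray(inc.concat(even.concat(half)), xs);
console.log(rightGrouped); // [ 1, 2, 3 ]
```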
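have your cake and eat it too
So we learned the valuable lesson of uniform interfaces, and we understand that Trans is inherently simple. But you still want that sweet chaining API. OK, ok...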
We're going to implement Transduce one more time, but this time we'll do so using the Trans monoid. Here, Transduce holds a Trans value instead of a continuation (Function).
Everything else stays the same: foo takes 1 tiny change and produces identical output.
// generic transducers
const mapper = f =>
Trans(k => (acc, x) => k(acc, f(x)))
const filterer = f =>
Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)
const logger = label =>
Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))
// magic chaining api made with Trans monoid
const Transduce = (t = Trans.empty()) => ({
map: f =>
Transduce(t.concat(mapper(f))),
filter: f =>
Transduce(t.concat(filterer(f))),
log: s =>
Transduce(t.concat(logger(s))),
run: (m, xs) =>
transduce(t, m, xs)
})
// when we run, we must specify the type to transduce
// .run(Array, nums)
// instead of
// .run(nums)
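The final implementation is listed in full further below. Of course, you could skip defining a separate mapper, filterer, and logger, and instead define them directly on Transduce. I think this reads nicer, though.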
wrapping up
So we started with a mess of lambdas and then made things simpler using a monoid.
// Trans monoid
const Trans = f => ({
runTrans: f,
concat: ({runTrans: g}) =>
Trans(k => f(g(k)))
})
Trans.empty = () =>
Trans(k => k)
const transduce = (t, m, xs) =>
xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())
// complete Array monoid implementation
Array.empty = () => []
// generic transducers
const mapper = f =>
Trans(k => (acc, x) => k(acc, f(x)))
const filterer = f =>
Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)
const logger = label =>
Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))
// now implemented with Trans monoid
const Transduce = (t = Trans.empty()) => ({
map: f =>
Transduce(t.concat(mapper(f))),
filter: f =>
Transduce(t.concat(filterer(f))),
log: s =>
Transduce(t.concat(logger(s))),
run: (m, xs) =>
transduce(t, m, xs)
})
// this stays exactly the same
const foo = nums => {
return Transduce()
.log('greater than 2?')
.filter(x => x > 2)
.log('\tsquare:')
.map(x => x * x)
.log('\t\tless than 30?')
.filter(x => x < 30)
.log('\t\t\tpass')
.run(Array, nums)
}
// output is exactly the same
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
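The Trans monoid provides distinct advantages in that the monoid interface is known and the generic implementation is extremely simple. But we're stubborn, or maybe we have goals to fulfill that were not set by us: we decide to build the magic Transduce chaining API, but we do so using our rock-solid Trans monoid, which gives us all the power of Trans while keeping the complexity nicely compartmentalized.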
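As one illustration of that, the same chain can be pointed at a different target type just by changing the first argument to run. The Sum type and the List stand-in below are made-up examples of mine; like Array in the code above, each provides empty, and the value it returns has a concat that accepts a single element.

```javascript
const Trans = f => ({
  runTrans: f,
  concat: ({ runTrans: g }) => Trans(k => f(g(k)))
});
Trans.empty = () => Trans(k => k);

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty());

const mapper = f => Trans(k => (acc, x) => k(acc, f(x)));
const filterer = f => Trans(k => (acc, x) => f(x) ? k(acc, x) : acc);

const Transduce = (t = Trans.empty()) => ({
  map: f => Transduce(t.concat(mapper(f))),
  filter: f => Transduce(t.concat(filterer(f))),
  run: (m, xs) => transduce(t, m, xs)
});

// hypothetical target type: adds up whatever reaches the end of the chain
const Sum = n => ({ value: n, concat: x => Sum(n + x) });
Sum.empty = () => Sum(0);

// plain object standing in for Array, to avoid patching the built-in
const List = { empty: () => [] };

const squaresUnder30 = Transduce()
  .filter(x => x > 2)
  .map(x => x * x)
  .filter(x => x < 30);

const nums = [1, 2, 3, 4, 5, 6, 7];

const asList = squaresUnder30.run(List, nums);
const asSum = squaresUnder30.run(Sum, nums).value;

console.log(asList); // [ 9, 16, 25 ]
console.log(asSum);  // 50
```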
dot chaining fetishists anonymous
Here are a couple of other answers I've written about method chaining
Answer 1 (score: 0)
I think you need to change the order of your implementations:
const filterer = pred => reducer => (x) => {
// run the earlier steps first, then test their result
const a = reducer(x);
return pred(a) ? a : undefined;
};
const mapper = map => reducer => (x) => map(reducer(x));
Then you need to change the run method to:
run: arr => arr.reduce((a,b)=>a.concat([reducer(b)]), []);
The default reducer must be
x=>x
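However, this way the filter doesn't work, because rejected items still end up in the result as undefined. You could throw undefined in the filter function and catch it in the run function: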
run: arr => arr.reduce((a,b)=>{
try{
a.push(reducer(b));
}catch(e){}
return a;
}, []);
const filterer = pred => reducer => (x) =>{
// run the earlier steps first, then test their result
const a = reducer(x);
if(!pred(a)){
throw undefined;
}
return a;
};
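Putting the pieces of this answer together, a runnable sketch could look like the following. Instead of throwing, it marks rejected items with a SKIP sentinel; that is my own substitution, made to avoid using exceptions for control flow, and the parameter name step is mine as well. Everything else follows the forward-composition idea above.

```javascript
const SKIP = Symbol('skip'); // hypothetical marker for filtered-out items

// Each step first runs the steps chained before it (step), then itself
const mapper = map => step => x => {
  const a = step(x);
  return a === SKIP ? SKIP : map(a);
};
const filterer = pred => step => x => {
  const a = step(x);
  return a !== SKIP && pred(a) ? a : SKIP;
};

const Transduce = (step = x => x) => ({
  map: map => Transduce(mapper(map)(step)),
  filter: pred => Transduce(filterer(pred)(step)),
  run: arr => arr.reduce((acc, x) => {
    const y = step(x);
    if (y !== SKIP) acc.push(y); // drop items a filter rejected
    return acc;
  }, [])
});

const result = Transduce()
  .map(x => x + 1)
  .filter(x => x > 3)
  .map(x => x / 2)
  .run([1, 2, 3, 4]);

console.log(result); // [ 2, 2.5 ]
```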
All in all, though, I think a for loop is much more elegant in this case...