The list of items to group:
const arr = [
{
"Global Id": "1231",
"TypeID": "FD1",
"Size": 160,
"Flöde": 55,
},
{
"Global Id": "5433",
"TypeID": "FD1",
"Size": 160,
"Flöde": 100,
},
{
"Global Id": "50433",
"TypeID": "FD1",
"Size": 120,
"Flöde": 100,
},
{
"Global Id": "452",
"TypeID": "FD2",
"Size": 120,
"Flöde": 100,
},
]
Input to the function, specifying which keys to group by:
const columns = [
{
"dataField": "TypeID",
"summarize": false,
},
{
"dataField": "Size",
"summarize": false,
},
{
"dataField": "Flöde",
"summarize": true,
},
]
Expected output:
const output = [
{
"TypeID": "FD1",
"Size": 160,
"Flöde": 155, // 55 + 100
"nrOfItems": 2
},
{
"TypeID": "FD1",
"Size": 120,
"Flöde": 100,
"nrOfItems": 1
},
{
"TypeID": "FD2",
"Size": 120,
"Flöde": 100,
"nrOfItems": 1
}
]
// nrOfItems adds up to 4 (2 + 1 + 1), the total number of items.
The function:
const groupArr = (columns) => R.pipe(...);
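The "summarize" property indicates whether that property should be summed.
The dataset is very large, over 100k items, so I don't want to iterate over it more often than necessary.
I have looked at R.groupBy, but I'm not sure whether it can be applied here.
Maybe something with R.reduce? Store each group in the accumulator, summing the values and incrementing the count if the group already exists? The group has to be found quickly, so should the groups be stored under a key?
Or is it better to use vanilla JavaScript in this case?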
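Answer 0 (score: 2)
Here is an answer in vanilla JavaScript, since I'm not very familiar with the Ramda API. I'm fairly sure the approach would be very similar with Ramda.
The code has comments explaining each step. I will try to rewrite it in Ramda.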
const arr=[{"Global Id":"1231",TypeID:"FD1",Size:160,"Flöde":55},{"Global Id":"5433",TypeID:"FD1",Size:160,"Flöde":100},{"Global Id":"50433",TypeID:"FD1",Size:120,"Flöde":100},{"Global Id":"452",TypeID:"FD2",Size:120,"Flöde":100}],columns=[{dataField:"TypeID",summarize:!1},{dataField:"Size",summarize:!1},{dataField:"Flöde",summarize:!0}];
// The columns that don't summarize
// give us the keys we need to group on
const groupKeys = columns
.filter(c => c.summarize === false)
.map(g => g.dataField);
// We compose a hash function that creates
// a hash out of all the items' properties
// that are in our groupKeys
const groupHash = groupKeys
.map(k => x => x[k])
.reduce(
(f, g) => x => `${f(x)}___${g(x)}`,
() => "GROUPKEY"
);
// The columns that summarize tell us which
// properties to sum for the items within the
// same group
const sumKeys = columns
.filter(c => c.summarize === true)
.map(c => c.dataField);
// Again, we compose into a single function.
// This function concats two items: it keeps the
// "last" item's properties and applies the sum
// logic only to the keys in sumKeys
const concats = sumKeys
.reduce(
(f, k) => (a, b) => Object.assign(f(a, b), {
[k]: (a[k] || 0) + b[k]
}),
(a, b) => Object.assign({}, a, b)
)
// Now, we take our data and group by the groupHash
const groups = arr.reduce(
(groups, x) => {
const k = groupHash(x);
if (!groups[k]) groups[k] = [x];
else groups[k].push(x);
return groups;
},
{}
);
// These are the keys we want our final objects to have...
const allKeys = ["nrTotal"]
.concat(groupKeys)
.concat(sumKeys);
// ...baked in to a helper to remove other keys
const cleanKeys = obj => Object.assign(
...allKeys.map(k => ({ [k]: obj[k] }))
);
// With the items neatly grouped, we can reduce each
// group using the composed concatenator
const items = Object
.values(groups)
.flatMap(
xs => cleanKeys(
xs.reduce(concats, { nrTotal: xs.length })
),
);
console.log(items);
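The `reduce`-based composition of `groupHash` can be hard to read at first. As a minimal sketch, assuming the same two group keys as in the example data, the composed function behaves like a plain join over the selected values. The names `composed` and `unrolled` are made up for this illustration.

```javascript
// Sample item and group keys mirroring the question's data.
const item = { "Global Id": "1231", TypeID: "FD1", Size: 160, "Flöde": 55 };
const keys = ["TypeID", "Size"];

// Composed form: each reduce step wraps the previous function
// and appends one more property value to the key.
const composed = keys
  .map(k => x => x[k])
  .reduce((f, g) => x => `${f(x)}___${g(x)}`, () => "GROUPKEY");

// Unrolled form (hypothetical): join the prefix and the values directly.
const unrolled = x => ["GROUPKEY", ...keys.map(k => x[k])].join("___");

console.log(composed(item)); // "GROUPKEY___FD1___160"
console.log(unrolled(item)); // "GROUPKEY___FD1___160"
```

Both forms produce the same string for every item, so which one to use is a matter of taste.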
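Here's an attempt at porting it to Ramda, but I didn't get much further than replacing the vanilla JS methods with their Ramda equivalents. I'm curious to see which great utilities and functional concepts I missed! I'm sure someone with deeper knowledge of Ramda's specifics will chime in.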
const arr=[{"Global Id":"1231",TypeID:"FD1",Size:160,"Flöde":55},{"Global Id":"5433",TypeID:"FD1",Size:160,"Flöde":100},{"Global Id":"50433",TypeID:"FD1",Size:120,"Flöde":100},{"Global Id":"452",TypeID:"FD2",Size:120,"Flöde":100}],columns=[{dataField:"TypeID",summarize:!1},{dataField:"Size",summarize:!1},{dataField:"Flöde",summarize:!0}];
const [ sumCols, groupCols ] = R.partition(
R.prop("summarize"),
columns
);
const groupKeys = R.pluck("dataField", groupCols);
const sumKeys = R.pluck("dataField", sumCols);
const grouper = R.reduce(
(f, g) => x => `${f(x)}___${g(x)}`,
R.always("GROUPKEY"),
R.map(R.prop, groupKeys)
);
const reducer = R.reduce(
(f, k) => (a, b) => R.mergeRight(
f(a, b),
{ [k]: (a[k] || 0) + b[k] }
),
R.mergeRight,
sumKeys
);
const allowedKeys = new Set(
[ "nrTotal" ].concat(sumKeys).concat(groupKeys)
);
const cleanKeys = R.pipe(
R.toPairs,
R.filter(([k, v]) => allowedKeys.has(k)),
R.fromPairs
);
const items = R.flatten(
R.values(
R.map(
xs => cleanKeys(
R.reduce(
reducer,
{ nrTotal: xs.length },
xs
)
),
R.groupBy(grouper, arr)
)
)
);
console.log(items);
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.26.1/ramda.min.js"></script>
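For readers who don't know `R.partition`: it splits a list into two lists, first the elements that satisfy the predicate, then the rest. That is why the destructuring above reads `[ sumCols, groupCols ]`. A rough vanilla equivalent, where the local `partition` helper is only an illustration and not Ramda's implementation, might look like this:

```javascript
const columns = [
  { dataField: "TypeID", summarize: false },
  { dataField: "Size", summarize: false },
  { dataField: "Flöde", summarize: true },
];

// Illustrative helper: returns [matching, nonMatching], keeping input order.
const partition = (pred, xs) => xs.reduce(
  ([yes, no], x) => (pred(x) ? [[...yes, x], no] : [yes, [...no, x]]),
  [[], []]
);

const [sumCols, groupCols] = partition(c => c.summarize, columns);

console.log(sumCols.map(c => c.dataField));   // [ 'Flöde' ]
console.log(groupCols.map(c => c.dataField)); // [ 'TypeID', 'Size' ]
```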
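Answer 1 (score: 2)
Here is my initial approach. Everything other than summarize is a helper function, which I suppose could be inlined if you really wanted. I find it cleaner with this separation.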
const getKeys = (val) => pipe (
filter (propEq ('summarize', val) ),
pluck ('dataField')
)
const keyMaker = (columns, keys = getKeys (false) (columns)) => pipe (
pick (keys),
JSON .stringify
)
const makeReducer = (
columns,
toSum = getKeys (true) (columns),
toInclude = getKeys (false) (columns),
) => (a, b) => ({
...mergeAll (map (k => ({ [k]: b[k] }), toInclude ) ),
...mergeAll (map (k => ({ [k]: (a[k] || 0) + b[k] }), toSum ) ),
nrOfItems: (a .nrOfItems || 0) + 1
})
const summarize = (columns) => pipe (
groupBy (keyMaker (columns) ),
values,
map (reduce (makeReducer (columns), {} ))
)
const arr = [{"Flöde": 55, "Global Id": "1231", "Size": 160, "TypeID": "FD1"}, {"Flöde": 100, "Global Id": "5433", "Size": 160, "TypeID": "FD1"}, {"Flöde": 100, "Global Id": "50433", "Size": 120, "TypeID": "FD1"}, {"Flöde": 100, "Global Id": "452", "Size": 120, "TypeID": "FD2"}]
const columns = [{"dataField": "TypeID", "summarize": false}, {"dataField": "Size", "summarize": false}, {"dataField": "Flöde", "summarize": true}]
console .log (
summarize (columns) (arr)
)
<script src="https://bundle.run/ramda@0.26.1"></script><script>
const {pipe, filter, propEq, pluck, pick, mergeAll, map, groupBy, values, reduce, assoc} = ramda</script>
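There is a great deal of overlap with Joe's solution, but also some real differences. His was already posted when I saw the question, but I wanted my own approach to be unaffected by it, so I didn't look until I had written the above. Note the difference in our hash functions. Mine does JSON.stringify on values like {TypeID: "FD1", Size: 160}, while Joe's creates "GROUPKEY___FD1___160". I think I prefer mine for its simplicity. On the other hand, Joe's solution is definitely better than mine at handling nrOfItems. I update it on every reduce iteration and have to use || 0 to handle the initial case. Joe simply starts the fold with the already-known value. Overall, though, the solutions are very similar.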
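You mention wanting to reduce the number of passes through the data. The way I write Ramda code tends not to help with this. This code iterates over the whole list to group it into like items, then iterates over each of those groups to fold it down to a single value. (There is also a minor iteration in values.) These could certainly be changed to combine the two iterations. It might even make the code shorter. But to my mind it would become harder to understand.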
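I was curious about a single-pass approach, and found that I could reuse all the infrastructure I had built for the multi-pass one, rewriting only the main function: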
const summarize2 = (columns) => (
arr,
makeKey = keyMaker (columns),
reducer = makeReducer (columns)
) => values (reduce (
(a, item, key = makeKey (item) ) => assoc (key, reducer (key in a ? a[key]: {}, item), a),
{},
arr
))
console .log (
summarize2 (columns) (arr)
)
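I wouldn't choose this over the original unless testing showed that this code was a bottleneck in my application. But it is not as complex as I thought it would be, and it does everything in one iteration (well, except for values.) Interestingly, it changes my mind a little about the handling of the item count. My helper code just worked in this version, and I never had to know the total size of the group. That would not have happened if I had used Joe's approach.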
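If the single pass ever does turn out to matter for 100k+ items, a mutable accumulator may be worth trying, because Ramda's assoc returns a new object on every step instead of updating the existing one. The following is only a sketch in vanilla JavaScript. `summarizeWithMap` is a hypothetical name, and the code assumes every summed property is numeric and present on each item. Like the version above, it increments the count per item and never needs the group size up front.

```javascript
const arr = [
  { "Global Id": "1231", TypeID: "FD1", Size: 160, "Flöde": 55 },
  { "Global Id": "5433", TypeID: "FD1", Size: 160, "Flöde": 100 },
  { "Global Id": "50433", TypeID: "FD1", Size: 120, "Flöde": 100 },
  { "Global Id": "452", TypeID: "FD2", Size: 120, "Flöde": 100 },
];
const columns = [
  { dataField: "TypeID", summarize: false },
  { dataField: "Size", summarize: false },
  { dataField: "Flöde", summarize: true },
];

// Hypothetical single-pass variant using a Map as a mutable accumulator.
const summarizeWithMap = (columns) => {
  const groupKeys = columns.filter(c => !c.summarize).map(c => c.dataField);
  const sumKeys = columns.filter(c => c.summarize).map(c => c.dataField);

  return (items) => {
    const groups = new Map();
    for (const item of items) {
      // Key built from the grouping values only.
      const key = JSON.stringify(groupKeys.map(k => item[k]));
      let group = groups.get(key);
      if (!group) {
        // First item of this group: copy group values, start sums at 0.
        group = {};
        for (const k of groupKeys) group[k] = item[k];
        for (const k of sumKeys) group[k] = 0;
        group.nrOfItems = 0;
        groups.set(key, group);
      }
      for (const k of sumKeys) group[k] += item[k];
      group.nrOfItems += 1;
    }
    // A Map iterates in insertion order, so groups appear in first-seen order.
    return [...groups.values()];
  };
};

const result = summarizeWithMap(columns)(arr);
console.log(result);
```

For the sample data this yields the three groups from the expected output in the question, with Flöde summed to 155 for the first group. Whether it is actually faster than the Ramda versions on real data would have to be measured.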