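I am working on presenting two views of the same data, which is stored in a DataTable with these fields (irrelevant fields not shown):
The first view is the DataTable itself: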
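The second view is a pivot-like table in which each row is defined by a distinct combination of set_id and user_id, and the columns come in sets of three per date, each storing a different value: rows.Count, rows.Sum(value) and rows.Sum(volumes). So I wrote this code: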
    Sub Test()
        ' Simulated Data
        Dim dtDA = New DataTable
        dtDA.Columns.Add("id", GetType(Integer))
        dtDA.Columns.Add("set_id", GetType(Integer))
        dtDA.Columns.Add("user_id", GetType(Integer))
        dtDA.Columns.Add("date", GetType(Integer))
        dtDA.Columns.Add("value", GetType(Integer))
        dtDA.Columns.Add("volumes", GetType(Integer))
        dtDA.Rows.Add(1001, 1, 11, 20160505, 1, 1)
        dtDA.Rows.Add(1002, 1, 12, 20160505, 1, 2)
        dtDA.Rows.Add(1003, 1, 13, 20160505, 1, 1)
        dtDA.Rows.Add(1004, 1, 11, 20160505, 1, 1)
        dtDA.Rows.Add(1005, 2, 14, 20160505, 1, 1)
        dtDA.Rows.Add(1006, 2, 15, 20160505, 1, 2)
        dtDA.Rows.Add(1007, 2, 16, 20160505, 1, 1)
        dtDA.Rows.Add(1008, 2, 14, 20160505, 1, 1)
        dtDA.Rows.Add(1009, 1, 12, 20160512, 1, 1)
        dtDA.Rows.Add(1010, 1, 13, 20160512, 1, 2)
        dtDA.Rows.Add(1011, 1, 11, 20160512, 1, 1)
        dtDA.Rows.Add(1012, 1, 12, 20160512, 1, 1)
        dtDA.Rows.Add(1013, 2, 15, 20160512, 1, 1)
        dtDA.Rows.Add(1014, 2, 16, 20160512, 1, 2)
        dtDA.Rows.Add(1015, 2, 14, 20160512, 1, 1)
        dtDA.Rows.Add(1016, 2, 15, 20160512, 1, 1)

        ' Analysis
        Dim DS = dtDA.Select.GroupBy(
            Function(dr) New With {
                .set_id = dr.Field(Of Integer?)("set_id").GetValueOrDefault,
                .user_id = dr.Field(Of Integer?)("user_id").GetValueOrDefault,
                .date = dr.Field(Of Integer?)("date").GetValueOrDefault}
            ).GroupBy(
            Function(gSRD) New With {
                .set_id = gSRD.Key.set_id,
                .user_id = gSRD.Key.user_id})

        Dim dtDS As New DataTable, drDS As DataRow
        dtDS.Columns.Add("set_id", GetType(Integer))
        dtDS.Columns.Add("user_id", GetType(String))
        dtDS.Columns.Add("date", GetType(Integer))
        If DS.Any Then
            For Each d In DS.SelectMany(Function(g) g.Select(Function(gg) gg.Key.date)).Distinct.OrderBy(Function(i) i)
                dtDS.Columns.Add("c" & d, GetType(Integer))
                dtDS.Columns.Add("p" & d, GetType(Integer))
                dtDS.Columns.Add("v" & d, GetType(Integer))
            Next
            For Each gSR In DS
                drDS = dtDS.NewRow
                drDS.SetField("set_id", gSR.Key.set_id)
                drDS.SetField("user_id", gSR.Key.user_id)
                For Each gSRD In gSR
                    drDS.SetField("c" & gSRD.Key.date, gSRD.Count)
                    drDS.SetField("p" & gSRD.Key.date, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("value").GetValueOrDefault))
                    drDS.SetField("v" & gSRD.Key.date, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("volumes").GetValueOrDefault))
                Next
                dtDS.Rows.Add(drDS)
            Next
        End If
    End Sub
I expected to get this:
set_id|user_id|c20160505|p20160505|v20160505|c20160512|p20160512|v20160512
------+-------+---------+---------+---------+---------+---------+---------
1| 11| 2| 2| 2| 1| 1| 1
1| 12| 1| 1| 2| 2| 2| 2
1| 13| 1| 1| 1| 1| 1| 2
2| 14| 2| 2| 2| 1| 1| 1
2| 15| 1| 1| 2| 2| 2| 2
2| 16| 1| 1| 1| 1| 1| 2
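Instead, I got this:
Can anyone help me and tell me what I am doing wrong here? Thank you very much!
EDIT
The problem seems to be that my code rests on the premise that LINQ's GroupBy, when given an anonymous type as the key, behaves as described in https://stackoverflow.com/a/12124777/3718031, but unfortunately that does not seem to be the case. I noticed that if I replace the anonymous type with a concatenated string, my code does what I expect: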
    Dim DS = dtDA.Select.GroupBy(
        Function(dr) dr.Field(Of Integer?)("set_id").GetValueOrDefault & "," & dr.Field(Of Integer?)("user_id").GetValueOrDefault & "," & dr.Field(Of Integer?)("date").GetValueOrDefault
        ).GroupBy(
        Function(gSRD) gSRD.First.Field(Of Integer?)("set_id").GetValueOrDefault & "," & gSRD.First.Field(Of Integer?)("user_id").GetValueOrDefault)

    Dim dtDS As New DataTable, drDS As DataRow
    dtDS.Columns.Add("set_id", GetType(Integer))
    dtDS.Columns.Add("user_id", GetType(String))
    If DS.Any Then
        For Each d In DS.SelectMany(Function(g) g.Select(Function(gg) gg.First.Field(Of Integer?)("date").GetValueOrDefault)).Distinct.OrderBy(Function(i) i)
            dtDS.Columns.Add("c" & d, GetType(Integer))
            dtDS.Columns.Add("p" & d, GetType(Integer))
            dtDS.Columns.Add("v" & d, GetType(Integer))
        Next
        For Each gSR In DS
            drDS = dtDS.NewRow
            drDS.SetField("set_id", gSR.First.First.Field(Of Integer?)("set_id").GetValueOrDefault)
            drDS.SetField("user_id", gSR.First.First.Field(Of Integer?)("user_id").GetValueOrDefault)
            For Each gSRD In gSR
                drDS.SetField("c" & gSRD.First.Field(Of Integer?)("date").GetValueOrDefault, gSRD.Count)
                drDS.SetField("p" & gSRD.First.Field(Of Integer?)("date").GetValueOrDefault, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("value").GetValueOrDefault))
                drDS.SetField("v" & gSRD.First.Field(Of Integer?)("date").GetValueOrDefault, gSRD.Sum(Function(dr) dr.Field(Of Integer?)("volumes").GetValueOrDefault))
            Next
            dtDS.Rows.Add(drDS)
        Next
    End If
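So the remaining question is: why doesn't the anonymous type used as the grouping key compare equal based on the equality of each of its properties?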
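For reference, here is a minimal sketch of the behaviour in question, reduced to a console module. It relies on the documented Visual Basic rule that only properties declared with the `Key` modifier take part in an anonymous type's `Equals` and `GetHashCode`; an instance with no key properties is equal only to itself. The module name `KeyDemo`, the helper functions and the sample pairs are hypothetical and not taken from the code above.

```vb
Imports System
Imports System.Linq

Module KeyDemo
    ' Hypothetical (set_id, user_id) pairs; the first two are duplicates.
    Private ReadOnly Pairs As Tuple(Of Integer, Integer)() = {
        Tuple.Create(1, 11), Tuple.Create(1, 11), Tuple.Create(2, 14)}

    ' No Key modifier: the properties are ignored by Equals/GetHashCode,
    ' so every key instance ends up in a group of its own.
    Function CountGroupsWithoutKey() As Integer
        Return Pairs.GroupBy(
            Function(p) New With {.set_id = p.Item1, .user_id = p.Item2}).Count()
    End Function

    ' Key modifier on every property: two keys are equal when all of their
    ' Key properties are equal, so the duplicate pair is grouped together.
    Function CountGroupsWithKey() As Integer
        Return Pairs.GroupBy(
            Function(p) New With {Key .set_id = p.Item1, Key .user_id = p.Item2}).Count()
    End Function

    Sub Main()
        Console.WriteLine(CountGroupsWithoutKey()) ' prints 3
        Console.WriteLine(CountGroupsWithKey())    ' prints 2
    End Sub
End Module
```

If this is indeed the cause, the corresponding change in the code above would be to prefix each property in both `New With { ... }` grouping keys with `Key`. The linked answer is written for C#, where anonymous-type properties are always read-only and always compared by value, which may explain the difference in behaviour.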