How to design a DynamoDB table for a given set of relational tables

Asked: 2019-02-13 15:10:16

Tags: amazon-dynamodb dynamodb-queries

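I have a set of 3 relational tables that I want to convert into a single table in DynamoDB. Each table holds data for a different tranType. Each table has ID and tranDate as its key. For a given ID, tranDate and tranType there are multiple rows.

My access pattern is to fetch the data for a given ID and tranDate, which should return the data for all tranTypes.

Within each table, the rows for a given ID and tranDate stay under 400KB, but if I add up the rows for a given ID and tranDate across all 3 tables, the total exceeds 400KB.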

Definitions
Table1
Id, tranDate,tranType,col1,col2,col3,col4
Table2
Id, tranDate,tranType,col1,col2,col3,col4,col5
Table3
Id, tranDate,tranType,col1,col2 

Table1 (Sample Data)
1, 2018-12-01,'DETAIL',12,13,14,'A'
1, 2018-12-01,'DETAIL',15,23,11,'B'
1, 2018-12-01,'DETAIL',17,33,24,'C'
1, 2018-12-01,'DETAIL',19,43,14,'D'
2, 2018-12-01,'DETAIL',11,13,14,'A1'
2, 2018-12-01,'DETAIL',12,23,11,'B1' 
1, 2018-11-01,'DETAIL',42,13,14,'X'
1, 2018-11-01,'DETAIL',45,23,11,'Y'
1, 2018-11-01,'DETAIL',47,33,24,'Z'

Table2 (Sample Data)
1, 2018-12-01,'SUMMARY',12,13,14,'A','S'
1, 2018-12-01,'SUMMARY',15,23,11,'B','B1'
2, 2018-12-01,'SUMMARY',17,33,24,'C','D1'
2, 2018-12-01,'SUMMARY',22,43,14,'D','D2'
2, 2018-12-01,'SUMMARY',33,13,14,'A1' ,'D3'

Table3 (Sample Data)
1, 2018-12-01,'GEO',11,'MI'
1, 2018-12-01,'GEO',12,'NY'
1, 2018-12-01,'GEO',11,'AL'
2, 2018-12-01,'GEO',14,'DE'
2, 2018-12-01,'GEO',15,'PA'

Expected result for Id = 1, tranDate = '2018-12-01':

1, 2018-12-01,'DETAIL',12,13,14,'A'
1, 2018-12-01,'DETAIL',15,23,11,'B'
1, 2018-12-01,'DETAIL',17,33,24,'C'
1, 2018-12-01,'DETAIL',19,43,14,'D' 

1, 2018-12-01,'SUMMARY',12,13,14,'A','S'
1, 2018-12-01,'SUMMARY',15,23,11,'B','B1' 

1, 2018-12-01,'GEO',11,'MI'
1, 2018-12-01,'GEO',12,'NY'
1, 2018-12-01,'GEO',11,'AL'  

1 Answer:

Answer 0 (score: 0)

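Based on what you have described, one possible design is to use a concatenation of id + date as the partition key and the transaction type as the sort key, possibly combining the sort key with one of the other ids.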


So your table might look like this:

 PK             |  TranType         |       data        
----------------+-------------------+------------------------------------------
"1:2018-12-01"  |  "DETAIL"         | ["12,13,14,A", "15,23,11,B",...]
"1:2018-12-01"  |  "SUMMARY"        | ["12,13,14,A,S", "15,23,11,B,B1"]
"1:2018-12-01"  |  "GEO"            | ["11,MI", "12,NY", "11,AL"]
"2:2018-12-01"  |  "DETAIL"         | [...] 
"2:2018-12-01"  |  "SUMMARY"        | [...] 
"2:2018-12-01"  |  "GEO"            | [...] 

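This works fine as long as the data payload of each item does not grow too large (a single DynamoDB item is limited to 400KB).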
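As a minimal sketch of this first layout, the rows of the three relational tables could be grouped into one item per (id + date, tranType). The helper names `build_grouped_items` and `approx_item_bytes` are hypothetical, and the size check uses the JSON length only as a rough proxy, since DynamoDB computes item size by its own rules.

```python
import json
from collections import defaultdict

MAX_ITEM_BYTES = 400 * 1024  # DynamoDB's per-item size limit


def build_grouped_items(rows):
    """Group rows into one item per (Id:tranDate, tranType) pair."""
    grouped = defaultdict(list)
    for item_id, tran_date, tran_type, *cols in rows:
        key = (f"{item_id}:{tran_date}", tran_type)
        # Store each original row as a comma-joined string, as in the table above.
        grouped[key].append(",".join(str(c) for c in cols))
    return [
        {"PK": pk, "TranType": tran_type, "data": data}
        for (pk, tran_type), data in grouped.items()
    ]


def approx_item_bytes(item):
    """Rough size estimate only; not DynamoDB's exact size calculation."""
    return len(json.dumps(item).encode("utf-8"))


# A few rows taken from the sample data in the question.
rows = [
    (1, "2018-12-01", "DETAIL", 12, 13, 14, "A"),
    (1, "2018-12-01", "DETAIL", 15, 23, 11, "B"),
    (1, "2018-12-01", "SUMMARY", 12, 13, 14, "A", "S"),
    (1, "2018-12-01", "GEO", 11, "MI"),
    (1, "2018-12-01", "GEO", 12, "NY"),
    (2, "2018-12-01", "GEO", 14, "DE"),
]

items = build_grouped_items(rows)
oversized = [i for i in items if approx_item_bytes(i) > MAX_ITEM_BYTES]
```

Because each tranType lands in its own item, the 400KB limit applies per tranType rather than to the combined rows for an id and date, which matches the sizes described in the question.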

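Another possibility is to break the data down further into discrete attributes and create a composite sort key made up of the transaction type as a prefix plus either one of the ids from the data or just a numeric index (this really depends on what the other columns in your example actually mean).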

An example, assuming that col1 is unique within DETAIL and within SUMMARY, could look like this:

 PK             |  TTID             | states      | c2 | c3 | c4  | c5 |     
----------------+-------------------+-------------+----+----+-----+----+--
"1:2018-12-01"  |  "DETAIL:12"      |             | 13 | 14 | 'A' 
"1:2018-12-01"  |  "DETAIL:15"      |             | 23 | 11 | 'B' 
"1:2018-12-01"  |  "DETAIL:17"      |             | 33 | 24 | 'C' 
"1:2018-12-01"  |  "DETAIL:19"      |             | 43 | 14 | 'D' 
"1:2018-12-01"  |  "SUMMARY:12"     |             | 13 | 14 | 'A' | 'S'
"1:2018-12-01"  |  "SUMMARY:15"     |             | 23 | 11 | 'B' | 'B1'
"1:2018-12-01"  |  "GEO:11"         | ["MI","AL"] |
"1:2018-12-01"  |  "GEO:12"         | ["NY"]      |
"2:2018-12-01"  |  "DETAIL:11"      |             | 13 |14  | A1 
"2:2018-12-01"  |  "DETAIL:.."      |             |    ...  
"2:2018-12-01"  |  "SUMMARY:17"     |             |    ...  
"2:2018-12-01"  |  "SUMMARY:.."     |             |    ... 
"2:2018-12-01"  |  "GEO:.."         |  ...        |
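The composite sort key layout above could be sketched as follows. This is an illustration under the same assumption that col1 is unique within DETAIL and SUMMARY; `build_row_items` and `query_kwargs` are hypothetical helper names, and the table name is a placeholder. The second helper only builds keyword arguments in the shape accepted by the low-level boto3 `client.query` call and does not contact AWS.

```python
def build_row_items(rows):
    """Build one item per row, keyed by PK and a 'tranType:col1' sort key."""
    items = {}
    for item_id, tran_date, tran_type, col1, *rest in rows:
        pk = f"{item_id}:{tran_date}"
        ttid = f"{tran_type}:{col1}"
        if tran_type == "GEO":
            # GEO rows sharing the same col1 are merged into a list of states.
            item = items.setdefault((pk, ttid), {"PK": pk, "TTID": ttid, "states": []})
            item["states"].append(rest[0])
        else:
            # Assumes col1 is unique per type; a duplicate would overwrite the item.
            item = {"PK": pk, "TTID": ttid}
            for n, value in enumerate(rest, start=2):
                item[f"c{n}"] = value
            items[(pk, ttid)] = item
    return list(items.values())


def query_kwargs(table_name, item_id, tran_date, tran_type=None):
    """Build arguments for a Query on PK, optionally narrowed to one tranType."""
    kwargs = {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "PK"},
        "ExpressionAttributeValues": {":pk": {"S": f"{item_id}:{tran_date}"}},
    }
    if tran_type is not None:
        kwargs["KeyConditionExpression"] += " AND begins_with(#sk, :prefix)"
        kwargs["ExpressionAttributeNames"]["#sk"] = "TTID"
        kwargs["ExpressionAttributeValues"][":prefix"] = {"S": f"{tran_type}:"}
    return kwargs


rows = [
    (1, "2018-12-01", "DETAIL", 12, 13, 14, "A"),
    (1, "2018-12-01", "SUMMARY", 12, 13, 14, "A", "S"),
    (1, "2018-12-01", "GEO", 11, "MI"),
    (1, "2018-12-01", "GEO", 12, "NY"),
    (1, "2018-12-01", "GEO", 11, "AL"),
]
items = build_row_items(rows)
all_types = query_kwargs("transactions", 1, "2018-12-01")
geo_only = query_kwargs("transactions", 1, "2018-12-01", "GEO")
```

Querying on the partition key alone covers the access pattern from the question (all tranTypes for an id and date), while the `begins_with` condition narrows the result to a single tranType. Query results are paginated, so a caller would keep requesting pages while the response contains `LastEvaluatedKey`.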

There is no single answer to this question. You design your schema based on the data you have and how you are going to access it.