Integer to UUID conversion using padded 0's

Time: 2016-02-12 20:56:15

Tags: java mysql cassandra uuid nosql

I have a question regarding UUID generation.

Typically, when I'm generating a UUID I use a random or time-based generation method.

HOWEVER, I'm migrating legacy data from MySQL over to a C* datastore and I need to change the legacy (auto-incrementing) integer IDs to UUIDs. Instead of creating another denormalized table with the legacy integer IDs as the primary key and all the data duplicated, I was wondering what folks thought about padding 0's onto the front of the integer ID to form a UUID. Example below.

*Something important to note is that the legacy IDs' highest values will never top 1 million, so overflow isn't really an issue.

The idea would look like this:

Legacy ID: 123456 ---> UUID: 00000000-0000-0000-0000-000000123456

This would be done using some string concats and the UUID.fromString("00000000-0000-0000-0000-000000123456") method.
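For reference, a minimal sketch of that padding idea in Java. The class and method names (LegacyIdToUuid, fromLegacyId, toLegacyId) are made up for illustration, and it assumes legacy IDs are non-negative ints:

```java
import java.util.Locale;
import java.util.UUID;

final class LegacyIdToUuid {
    // Zero-pads the decimal digits of a non-negative legacy ID into the last UUID group.
    static UUID fromLegacyId(int legacyId) {
        if (legacyId < 0) {
            throw new IllegalArgumentException("legacy ID must be non-negative: " + legacyId);
        }
        // The last group is 12 characters wide; an int has at most 10 decimal digits.
        String lastGroup = String.format(Locale.ROOT, "%012d", legacyId);
        return UUID.fromString("00000000-0000-0000-0000-" + lastGroup);
    }

    // Reverses the mapping by reading the last group back as a decimal number.
    // Assumes the UUID was produced by fromLegacyId; hex letters would make parseInt throw.
    static int toLegacyId(UUID uuid) {
        String text = uuid.toString();
        return Integer.parseInt(text.substring(text.length() - 12));
    }

    public static void main(String[] args) {
        UUID uuid = fromLegacyId(123456);
        System.out.println(uuid);             // 00000000-0000-0000-0000-000000123456
        System.out.println(toLegacyId(uuid)); // 123456
    }
}
```

One thing to be aware of: UUID.fromString reads the groups as hexadecimal, so the digits 123456 end up stored as 0x123456 (1193046 in decimal). The string form round-trips fine, but getLeastSignificantBits() will not return the legacy ID.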

Does this seem like a bad pattern to anyone? I'm not a huge fan of the idea, it leaves a bad taste in my mouth, but I don't have a technical reason why, haha.

As far as collisions go, the probability of a collision occurring is still ridiculously low. So I'm not worried about increasing collisions. I suppose it just seems like bad practice to me, that it's "too easy".
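On the collision point, a small sketch that may help: a zero-padded UUID has version 0, while UUID.randomUUID() always sets version 4 and the RFC 4122 variant, and time-based UUIDs carry version 1. So as long as every other ID comes from a standard generator, the padded IDs should not be able to collide with them at all. The helper name looksRandom is hypothetical:

```java
import java.util.UUID;

final class PaddedUuidVersion {
    // True if the UUID carries the version-4 and RFC 4122 variant bits that UUID.randomUUID() sets.
    static boolean looksRandom(UUID uuid) {
        return uuid.version() == 4 && uuid.variant() == 2;
    }

    public static void main(String[] args) {
        UUID padded = UUID.fromString("00000000-0000-0000-0000-000000123456");
        UUID random = UUID.randomUUID();

        System.out.println(padded.version());    // 0
        System.out.println(looksRandom(padded)); // false
        System.out.println(looksRandom(random)); // true
    }
}
```

The flip side is that these padded values are not valid RFC 4122 UUIDs, so any code that inspects version() or variant() will see something unusual.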

2 answers:

Answer 0 (score: 2)

We faced the same kind of issue before when migrating from Oracle with ids generated by sequence to Cassandra with generated UUIDs.

We had to design a type to support both old data coming from Oracle with type long and new data with type uuid.

The obvious solution is to use type blob to store the id. A blob can encode a uuid or a long.

This solution only works for partition keys, because you query them using =. It won't work for clustering columns queried with operators like < or >, because those need an ordering on their values.

There was a small objection at the time, which was that using a blob to store the id makes it opaque to the user. For example, in cqlsh, when you're doing a SELECT and need to provide the id, how would you make a blob?

Fortunately, CQL's native blob conversion functions, bigIntAsBlob(), blobAsBigInt() and uuidAsBlob(), come in very handy.
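To make the blob idea in this answer concrete, here is a rough client-side sketch in Java. It is not the answerer's actual code: the class BlobIdCodec and its methods are hypothetical, and it assumes a long id is stored as 8 big-endian bytes and a uuid as 16 bytes, which is what the CQL conversion functions are expected to produce. The two kinds of id can then be told apart by length:

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class BlobIdCodec {
    // Encodes a legacy long id as an 8-byte big-endian blob.
    static ByteBuffer fromLong(long legacyId) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        buf.putLong(legacyId);
        buf.flip();
        return buf;
    }

    // Encodes a UUID as a 16-byte blob: most significant bits first, then least significant.
    static ByteBuffer fromUuid(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(2 * Long.BYTES);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        buf.flip();
        return buf;
    }

    // A blob of exactly 8 bytes is treated as a legacy long id.
    static boolean isLegacy(ByteBuffer blob) {
        return blob.remaining() == Long.BYTES;
    }

    // Reads a long without moving the caller's buffer position.
    static long toLong(ByteBuffer blob) {
        return blob.duplicate().getLong();
    }

    // Reads a UUID without moving the caller's buffer position.
    static UUID toUuid(ByteBuffer blob) {
        ByteBuffer view = blob.duplicate();
        long mostSignificant = view.getLong();
        long leastSignificant = view.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

Reading code would check isLegacy(blob) first and then call toLong or toUuid accordingly.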

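Answer 1 (score: 0)

I decided to go in a different direction from doanduyhai's answer.

To keep the data consistent, we decided to fully denormalize the data and create another table in C* keyed on our legacy IDs. When objects are migrated from the legacy system to C*, they are assigned a new randomly generated UUID, which will be their primary ID going forward. The legacy IDs will stay in place until we decide we no longer need them. At that point, we can cleanly drop the legacy ID table and be done with them.

This solution gives us a cleaner break from the legacy ID system in the future, and it keeps us from using odd custom UUIDs. I'm also not a big fan of making the ID field a blob type that can hold multiple kinds of data, because in the future we plan for only UUIDs to exist.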
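A rough sketch of how the migration step in this approach could look in Java. The in-memory map only stands in for the legacy ID table in C*, and the names (LegacyIdIndex, migrate, lookup, dropLegacyIds) are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

final class LegacyIdIndex {
    // Stand-in for the separate table keyed on the legacy integer ID.
    private final Map<Integer, UUID> byLegacyId = new HashMap<>();

    // Assigns a random UUID the first time a legacy ID is migrated;
    // migrating the same legacy ID again returns the UUID already assigned.
    UUID migrate(int legacyId) {
        return byLegacyId.computeIfAbsent(legacyId, id -> UUID.randomUUID());
    }

    // Resolves a legacy ID to its new primary ID, if it has been migrated.
    Optional<UUID> lookup(int legacyId) {
        return Optional.ofNullable(byLegacyId.get(legacyId));
    }

    // Mirrors dropping the legacy ID table once the old IDs are no longer needed.
    void dropLegacyIds() {
        byLegacyId.clear();
    }
}
```

The new UUIDs are ordinary version-4 values, so nothing downstream needs to know that some rows started life with an integer ID.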