Question

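Given a UUID (v4) without dashes, how can I shorten it to a string of 15 characters or fewer? I should also be able to get back to the original UUID from the 15-character string.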

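I am trying to shorten it so I can send it in a flat file, and the file format specifies this field as a 15-character alphanumeric field. Given the shortened UUID, I should be able to map it back to the original UUID.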

Here is what I tried, but it is definitely not what I want.

export function shortenUUID(uuidToShorten: string, length: number) {
  const uuidWithoutDashes = uuidToShorten.replace(/-/g , '');
  const radix = uuidWithoutDashes.length;
  const randomId = [];

  for (let i = 0; i < length; i++) {
    randomId[i] = uuidWithoutDashes[ 0 | Math.random() * radix];
  }
  return randomId.join('');
}

Answer 1

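As AuxTaco pointed out, if you actually mean "alphanumeric" as in matching /^[A-Za-z0-9]{0,15}/ (giving 26 + 26 + 10 = 62 possible characters), then it is truly impossible. You can't fit 3 gallons of water into a one-gallon bucket without losing something. A UUID is 128 bits, so to convert it to a character space of 62 you need at least 22 characters (log[base 62](2^128) is about 21.5, which rounds up to 22).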
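
The character-count argument can be checked mechanically. The sketch below is not part of the original answer: `minChars` is a hypothetical helper, and it assumes a runtime with BigInt so that the comparison against 2^128 is exact rather than a floating-point approximation.

```javascript
// Smallest string length whose capacity (alphabetSize^length) covers 2^bits values.
function minChars(bits, alphabetSize) {
  const target = 2n ** BigInt(bits);
  let capacity = 1n;
  let length = 0;
  while (capacity < target) {
    capacity *= BigInt(alphabetSize);
    length += 1;
  }
  return length;
}

console.log(minChars(128, 62));  // 22 (A-Z, a-z, 0-9)
console.log(minChars(128, 36));  // 25 (0-9, a-z)
console.log(minChars(128, 256)); // 16 (one character per byte)
```

Even with 256 distinct symbols the result is 16 characters, so fitting into 15 needs an alphabet larger than 256.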

If you are more flexible with your character set and just need 15 Unicode characters that you can put in a text document, then my answer will help.

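Note: for the first part of this answer, I assumed the length was 16, not 15. The simpler answer won't work for 15. The more complex version further down still will.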

To do this, you need some kind of two-way (reversible) compression algorithm, similar to the algorithms used for zipping files.

However, the problem with trying to compress something like a UUID is that you would probably get a lot of collisions.

A UUID v4 is 32 characters long (without dashes). It is hexadecimal, so its character space is 16 characters (0123456789ABCDEF).

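That gives you 16^32 possible combinations, approximately 3.4028237e+38 or 340,282,370,000,000,000,000,000,000,000,000,000,000. To make the value recoverable after compression, you have to make sure there are no collisions (i.e., no 2 UUIDs turn into the same value). That is a lot of possible values (which is exactly why UUIDs use that many: the chance of 2 random UUIDs being the same is only 1 out of that big number).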

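To crunch that many possibilities into 16 characters, you need at least as many possible values in the output. With 16 characters, that means 256 possible characters per position (the 16th root of that big number: 256^16 == 16^32). And that assumes you have an algorithm that never creates a collision.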

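One way to ensure you never have collisions is to convert the value from a base-16 number to a base-256 number. That gives you a one-to-one relationship, which guarantees no collisions and makes the conversion perfectly reversible. Normally, switching bases is easy in JavaScript: parseInt(someStr, radix).toString(otherRadix) (e.g., parseInt('00FF', 16).toString(20)). Unfortunately, JavaScript only supports radixes up to 36, so we have to do the conversion ourselves.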
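
As a quick illustration of that limit, here is a minimal sketch. The `convertBase` name is made up for this example; the only built-ins used are `parseInt` and `Number.prototype.toString`, both of which accept radixes from 2 to 36.

```javascript
// Convert a numeric string between two bases that JavaScript supports natively (2 to 36).
const convertBase = (str, fromRadix, toRadix) =>
  parseInt(str, fromRadix).toString(toRadix);

console.log(convertBase('00FF', 16, 20)); // "cf" (12 * 20 + 15 = 255)

// A radix above 36 is rejected with a RangeError.
let rejected = false;
try {
  convertBase('00FF', 16, 256);
} catch (err) {
  rejected = err instanceof RangeError;
}
console.log(rejected); // true
```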

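The catch with such a large base is representing it. You could arbitrarily pick 256 different characters, put them in a string, and use that for a manual conversion. However, I don't think there are 256 different symbols on a standard US keyboard, even if you count uppercase and lowercase as different glyphs.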

A simpler solution is to use the arbitrary character codes from 0 to 255 with String.fromCharCode().

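Another small catch is that if we tried to treat the whole thing as one big number, we would run into trouble, because it is a really big number and JavaScript can't represent it exactly.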

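Instead, since we already have hexadecimal, we can split it into pairs of digits, convert those, and spit them out. 32 hexadecimal digits = 16 pairs, so that will (coincidentally) be perfect. (If you had to solve this for an arbitrary size, you would have to do some extra math to split the number into pieces, convert them, then reassemble.)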

const uuid = '1234567890ABCDEF1234567890ABCDEF';
const letters = uuid.match(/.{2}/g).map(pair => String.fromCharCode(parseInt(pair, 16)));
const str = letters.join('');
console.log(str);

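Notice that there are some odd characters in the output, because not every character code maps to a "normal" symbol. If whatever you are sending this to can't handle them, you will need to use the array approach instead: find 256 characters it can handle, make an array of them, and use charset[num] instead of String.fromCharCode(num).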

To convert it back, you just do the reverse: get each character code, convert it to hex, and join the results together:

const uuid = '1234567890ABCDEF1234567890ABCDEF';

const compress = uuid => 
  uuid.match(/.{2}/g).map(pair => String.fromCharCode(parseInt(pair, 16))).join('');
  
const expand = str =>
  str.split('').map(letter => ('0' + letter.charCodeAt(0).toString(16)).substr(-2)).join('');
  
const str = compress(uuid);
const original = expand(str);

console.log(str, original, original.toUpperCase() === uuid.toUpperCase());
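
The charset[num] variant mentioned earlier could look roughly like this. It is only a sketch: the 256 characters used here (consecutive code units starting at U+0100) are an arbitrary choice for illustration, not something the target file format is known to accept, and the function names are hypothetical.

```javascript
// Hypothetical charset: 256 consecutive code units starting at U+0100,
// chosen only for illustration. Swap in characters your target accepts.
const charset = Array.from({ length: 256 }, (_, i) => String.fromCharCode(0x100 + i));

// Each pair of hex digits (one byte) becomes one character from the charset.
const compressWithCharset = uuid =>
  uuid.match(/.{2}/g).map(pair => charset[parseInt(pair, 16)]).join('');

// Each character is looked up in the charset and turned back into two hex digits.
const expandWithCharset = str =>
  [...str].map(ch => charset.indexOf(ch).toString(16).padStart(2, '0')).join('');

const sample = '1234567890ABCDEF1234567890ABCDEF';
const packed = compressWithCharset(sample);
console.log(packed.length); // 16
console.log(expandWithCharset(packed).toUpperCase() === sample); // true
```

Because every byte maps to exactly one character and back, the mapping stays one-to-one as long as the charset contains no duplicates.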

Just for fun, here is how you could do this for any arbitrary input base and output base.

This code is a bit messy because it is deliberately expanded to make it more self-explanatory, but it basically does what I described above.

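Since JavaScript doesn't have unlimited precision, if you end up converting a really big number (one that is displayed like 2.00000000e+10), every digit that is not shown before the e has essentially been chopped off and replaced with a zero. To account for that, you have to break the value up in some way.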
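
To make the precision loss concrete, here is a small check (added for illustration, not part of the original answer). It only relies on `parseInt`, `Number.isSafeInteger`, and `Number.prototype.toString`.

```javascript
const hex = '1234567890ABCDEF1234567890ABCDEF';

// Parsing all 32 hex digits at once yields a Number far beyond the exactly representable range.
const asNumber = parseInt(hex, 16);
const roundTripped = asNumber.toString(16).toUpperCase();

console.log(Number.isSafeInteger(asNumber)); // false
console.log(roundTripped === hex);           // false: the low-order digits were lost

// A short string stays within the safe range and round-trips exactly.
const small = parseInt('1234', 16);
console.log(Number.isSafeInteger(small), small.toString(16)); // true "1234"
```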

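In the full listing below, there is a "simple" way that doesn't account for this, so it only works for smaller strings, followed by a proper way that breaks the string up. I chose a simple but somewhat inefficient approach: breaking the string up based on how many decimal digits it turns into. This isn't the best way to do it (since the math doesn't really work like that), but it does the trick (at the cost of needing a larger character set).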

If you really need to keep your character set as small as possible, you could use a smarter splitting mechanism.

const smallStr = '1234';
const str = '1234567890ABCDEF1234567890ABCDEF';
const hexCharset = '0123456789ABCDEF'; // could also be an array
const compressedLength = 16;
const maxDigits = 16; // this may be a bit browser specific. You can make it smaller to be safer.

const logBaseN = (num, n) => Math.log(num) / Math.log(n);
const nthRoot = (num, n) => Math.pow(num, 1/n);
const digitsInNumber = num => Math.log(num) * Math.LOG10E + 1 | 0;
const partitionString = (str, numPartitions) => {
  const partsSize = Math.ceil(str.length / numPartitions);
  let partitions = [];
  for (let i = 0; i < numPartitions; i++) {
    partitions.push(str.substr(i * partsSize, partsSize));
  }
  return partitions;
}

console.log('logBaseN test:', logBaseN(256, 16) === 2);
console.log('nthRoot test:', nthRoot(256, 2) === 16);
console.log('partitionString test:', partitionString('ABCDEFG', 3));

// charset.length should equal radix
const toDecimalFromCharset = (str, charset) => 
    str.split('')
      .reverse()
      .map((char, index) => charset.indexOf(char) * Math.pow(charset.length, index))
      .reduce((sum, num) => (sum + num), 0);
      
const fromDecimalToCharset = (dec, charset) => {
  const radix = charset.length;
  let str = '';
  
  for (let i = Math.ceil(logBaseN(dec + 1, radix)) - 1; i >= 0; i--) {
    const part = Math.floor(dec / Math.pow(radix, i));
    dec -= part * Math.pow(radix, i);
    str += charset[part];
  }
  
  return str;
};

console.log('toDecimalFromCharset test 1:', toDecimalFromCharset('01000101', '01') === 69);
console.log('toDecimalFromCharset test 2:', toDecimalFromCharset('FF', hexCharset) === 255);
console.log('fromDecimalToCharset test:', fromDecimalToCharset(255, hexCharset) === 'FF');

const arbitraryCharset = length => new Array(length).fill(1).map((a, i) => String.fromCharCode(i));

// the Math.pow() bit is the possible number of values in the original
const simpleDetermineRadix = (strLength, originalCharsetSize, compressedLength) => nthRoot(Math.pow(originalCharsetSize, strLength), compressedLength);

// the simple versions only work while the decimal value is small enough that lack of precision doesn't mess things up
// compressedCharset.length must be >= compressedLength
const simpleCompress = (str, originalCharset, compressedCharset, compressedLength) =>
  fromDecimalToCharset(toDecimalFromCharset(str, originalCharset), compressedCharset);

const simpleExpand = (compressedStr, originalCharset, compressedCharset) =>
  fromDecimalToCharset(toDecimalFromCharset(compressedStr, compressedCharset), originalCharset);

const simpleNeededRadix = simpleDetermineRadix(str.length, hexCharset.length, compressedLength);
const simpleCompressedCharset = arbitraryCharset(simpleNeededRadix);
const simpleCompressed = simpleCompress(str, hexCharset, simpleCompressedCharset, compressedLength);
const simpleExpanded = simpleExpand(simpleCompressed, hexCharset, simpleCompressedCharset);

// Notice, it gets a little confused because of a lack of precision in the really big number.
console.log('Original string:', str, toDecimalFromCharset(str, hexCharset));
console.log('Simple Compressed:', simpleCompressed, toDecimalFromCharset(simpleCompressed, simpleCompressedCharset));
console.log('Simple Expanded:', simpleExpanded, toDecimalFromCharset(simpleExpanded, hexCharset));
console.log('Simple test:', simpleExpanded === str);

// Notice it works fine for smaller strings and/or charsets
const smallCompressed = simpleCompress(smallStr, hexCharset, simpleCompressedCharset, compressedLength);
const smallExpanded = simpleExpand(smallCompressed, hexCharset, simpleCompressedCharset);
console.log('Small string:', smallStr, toDecimalFromCharset(smallStr, hexCharset));
console.log('Small simple compressed:', smallCompressed, toDecimalFromCharset(smallCompressed, simpleCompressedCharset));
console.log('Small expanded:', smallExpanded, toDecimalFromCharset(smallExpanded, hexCharset));
console.log('Small test:', smallExpanded === smallStr);

// these will break the decimal up into smaller numbers with a max length of maxDigits
// it's a bit browser specific where the lack of precision is, so a smaller maxDigits
//  may make it safer
//
// note: charset may need to be a little bit bigger than what determineRadix decides, since we're
//  breaking the string up
// also note: we're breaking the string into parts based on the number of digits in it as a decimal
//  this will actually make each individual parts decimal length smaller, because of how numbers work,
//  but that's okay. If you have a charset just barely big enough because of other constraints, you'll
//  need to make this even more complicated to make sure it's perfect.
const partitionStringForCompress = (str, originalCharset) => {
    const numDigits = digitsInNumber(toDecimalFromCharset(str, originalCharset));
    const numParts = Math.ceil(numDigits / maxDigits);
    return partitionString(str, numParts);
}

const partitionedPartSize = (str, originalCharset) => {
    const parts = partitionStringForCompress(str, originalCharset);
    return Math.floor((compressedLength - parts.length - 1) / parts.length) + 1;
}

const determineRadix = (str, originalCharset, compressedLength) => {
    const parts = partitionStringForCompress(str, originalCharset);
    return Math.ceil(nthRoot(Math.pow(originalCharset.length, parts[0].length), partitionedPartSize(str, originalCharset)));
}

const compress = (str, originalCharset, compressedCharset, compressedLength) => {
    const parts = partitionStringForCompress(str, originalCharset);
    const partSize = partitionedPartSize(str, originalCharset);
    return parts.map(part => simpleCompress(part, originalCharset, compressedCharset, partSize)).join(compressedCharset[compressedCharset.length-1]);
}

const expand = (compressedStr, originalCharset, compressedCharset) =>
    compressedStr.split(compressedCharset[compressedCharset.length-1])
     .map(part => simpleExpand(part, originalCharset, compressedCharset))
     .join('');

const neededRadix = determineRadix(str, hexCharset, compressedLength);
const compressedCharset = arbitraryCharset(neededRadix);
const compressed = compress(str, hexCharset, compressedCharset, compressedLength);
const expanded = expand(compressed, hexCharset, compressedCharset);

console.log('String:', str, toDecimalFromCharset(str, hexCharset));
console.log('Needed radix size:', neededRadix); // bigger than normal because of how we're breaking it up... this could be improved if needed
console.log('Compressed:', compressed);
console.log('Expanded:', expanded);
console.log('Final test:', expanded === str);

To use the above to answer the question specifically, you would use:

const hexCharset = '0123456789ABCDEF';
const compressedCharset = arbitraryCharset(determineRadix(uuid, hexCharset));

// UUID to 15 characters
const compressed = compress(uuid, hexCharset, compressedCharset, 15);

// 15 characters to UUID
const expanded = expand(compressed, hexCharset, compressedCharset);

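If there are problematic characters in the arbitrary character set, you will have to do something to filter them out, or hard-code a specific set of characters. Just make sure all of the functions are deterministic (i.e., they give the same result every time).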
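
As an alternative to splitting the string, a runtime with BigInt can treat the whole UUID as one exact integer. The sketch below is not from the original answer and rests on several assumptions: BigInt is available, the helper names are hypothetical, and the 371-symbol charset (consecutive code units starting at U+0100) is an arbitrary illustration. 371 is used because 371^15 is just above 2^128, so 15 characters are enough for any 128-bit value.

```javascript
// Interpret a string of digits over `charset` as one exact BigInt.
const toBigInt = (str, charset) =>
  [...str].reduce(
    (acc, ch) => acc * BigInt(charset.length) + BigInt(charset.indexOf(ch)),
    0n
  );

// Write a BigInt as a fixed-length string over `charset` (left-padded with charset[0]).
const fromBigInt = (value, charset, length) => {
  const radix = BigInt(charset.length);
  let out = '';
  for (let i = 0; i < length; i++) {
    out = charset[Number(value % radix)] + out;
    value /= radix;
  }
  return out;
};

const hexDigits = '0123456789ABCDEF';
// Hypothetical 371-symbol charset, chosen only for illustration.
const wideCharset = Array.from({ length: 371 }, (_, i) => String.fromCharCode(0x100 + i));

const shorten = uuid => fromBigInt(toBigInt(uuid.toUpperCase(), hexDigits), wideCharset, 15);
const restore = short => fromBigInt(toBigInt(short, wideCharset), hexDigits, 32);

const id = '1234567890ABCDEF1234567890ABCDEF';
const short = shorten(id);
console.log(short.length);          // 15
console.log(restore(short) === id); // true
```

The fixed output lengths keep leading zeros intact in both directions. Whether these characters count as "alphanumeric" for a given flat-file format is a separate question that this sketch does not settle.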

How to shorten a UUID to 15 or fewer characters and expand it back