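Split a string into character-limited chunks with pagination/chunk numbers

Time: 2018-06-22 09:33:32

Tags: javascript split iteration

I am trying to split a string on spaces into character-limited chunks, where each chunk also contains a pagination (chunk number) part.

For example, if the chunk character limit is 30 and the input string is This is a string which should be split into two., the string is 48 characters long, so it should be split into two parts (the pagination also counts toward the character limit), e.g. ['This is a string which 1/2', 'should be split into two. 2/2']

Here is my code so far: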
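As a quick sanity check of the example above, the sketch below measures the input and the two expected chunks. The names `limit`, `input`, `expected`, and `lengths` are illustrative only and do not appear in the code that follows.

```javascript
// Illustrative check of the example; all variable names are assumptions.
const limit = 30;
const input = 'This is a string which should be split into two.';
const expected = ['This is a string which 1/2', 'should be split into two. 2/2'];

console.log(input.length); // 48, so it cannot fit in one 30-character chunk

// Each chunk, including its " x/n" suffix, must stay within the limit.
const lengths = expected.map((chunk) => chunk.length);
console.log(lengths); // [26, 29]
console.log(lengths.every((n) => n <= limit)); // true
```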

function chunkify(message, characterLimit = 30) {
  if (message.length <= characterLimit) {
    return [message];
  }

  const words = message.split(' ');

  // Error if a word is longer than the character limit.
  if (words.some((word) => word.length > characterLimit)) {
    return 'Word is too long yo.';
  }

  const chunks = [];
  let currentChunk = [];
  
  // Get the chunks first to estimate the number of parts.
  // (Not sure if a separate loop is needed)
  words.forEach((word, i) => {
    if (currentChunk.join(' ').length + word.length > characterLimit) {
      chunks.push(currentChunk);
      currentChunk = [];
    }

    currentChunk.push(word);

    if (i === words.length - 1) {
      chunks.push(currentChunk);
    }
  });

  // Add the part number per chunk.
  for (let i = 0, length = chunks.length; i < length; i += 1) {
    const chunk = chunks[i];

    chunk[chunk.length] = `${i + 1}/${length}`;

    let itemsToMove = [];
    let isOverCharacterLimit = chunk.join(' ').length > characterLimit;

    // Check if words in the chunk need to be moved to the next chunk.
    while (isOverCharacterLimit) {
      itemsToMove = [...chunk.splice(chunk.length - 2, 1), ...itemsToMove];
      isOverCharacterLimit = chunk.join(' ').length > characterLimit;
    }

    if (itemsToMove.length) {
      // Modify the chunks array
      if (!chunks[i + 1]) {
        chunks[i + 1] = [];
        length = chunks.length;
      }

      chunks[i + 1] = [...itemsToMove, ...chunks[i + 1]];
    }
  }

  const output = chunks.map((chunk) => {
    return chunk.join(' ');
  });

  return output;
}

console.log(chunkify('Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris eu vestibulum purus. Praesent viverra, augue eu dapibus pulvinar, purus quam consequat neque, at euismod purus nunc ut diam. Sed in lectus vel lectus sodales ullamcorper. Pellentesque malesuada mi ut neque euismod, ac facilisis ligula malesuada. Nullam finibus suscipit enim nec laoreet. Vestibulum ornare, leo id dapibus semper, quam risus rutrum enim, vel suscipit odio felis consequat felis. Mauris et dolor nisl. Praesent sollicitudin auctor ultrices. Praesent libero sapien, ultrices vel purus et, feugiat bibendum nibh. Sed a luctus mi. Vivamus interdum posuere tellus nec cursus. Integer ut urna rutrum, sodales orci vel, fermentum nulla. Sed massa nibh, efficitur et tortor non, efficitur tristique sem.'));

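The output is fine up to the original length of the array. Is looping over the array recursively the only solution? Another problem is what happens when the updated part number gains a character, e.g. going from 1/9 to 1/10.

4 Answers:

Answer 0: (score: 1)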
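To make the second problem concrete: the suffix that each chunk must reserve grows whenever the page number or the total gains a digit. A minimal sketch, using a hypothetical helper `suffixLength` that is not part of the code above:

```javascript
// Hypothetical helper: characters taken by " page/total", including the leading space.
function suffixLength(page, total) {
  return ` ${page}/${total}`.length;
}

console.log(suffixLength(1, 9));   // 4 -> " 1/9"
console.log(suffixLength(1, 10));  // 5 -> " 1/10"
console.log(suffixLength(10, 10)); // 6 -> " 10/10"
```

A chunk that was exactly full with " 1/9" no longer fits once the total becomes 10, which is why earlier chunks may have to be recomputed.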

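Here is a solution based on the following assumptions:

  • We have to split the string into chunks with an upper limit on length.
  • We cannot split a word across 2 chunks.
  • The chunk size should also include the pagination count.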

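Logic:

  • A chunk can contain:
    • words separated by spaces
    • the pagination, in the format x/n
  • The maximum length of a chunk is fixed.
  • So first get the list of words. You can do this by splitting on whitespace. In the solution below I used /\s+/, which treats a run of multiple spaces as a single separator.
  • Compute the likely number of pages. You can do this by dividing the string length by the chunk size. It is a rough number, but it tells you how many digits the page count will take.
  • Now, for each word, check whether adding it would exceed the allowed length.
    • If yes, append the page number to the current string (without the final count yet, so it looks like ... 1/), push it, then reset the current string to the word and continue processing from there.
    • If no, check whether this is the last iteration. If it is, append all the parts and push the result to the array.
    • Otherwise, just append the word to the string and move on to the next iteration.
  • Once you have all the chunks, loop over them again, append the rest of the pagination (the total count) to each string, and return the chunks.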
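Two of the steps in the logic above can be tried in isolation. The sketch below is only an illustration: the sample string and the names `text`, `size`, and `roughPages` are assumptions, and 780 is just an example length.

```javascript
// Splitting on ' ' keeps empty strings for repeated spaces; /\s+/ does not.
const text = 'two  spaces here';
console.log(text.split(' '));   // ['two', '', 'spaces', 'here']
console.log(text.split(/\s+/)); // ['two', 'spaces', 'here']

// Rough page estimate: only its digit count matters when reserving room
// for the total in the "x/n" suffix.
const size = 30;
const roughPages = Math.ceil(780 / size); // for a 780-character string
console.log(roughPages, String(roughPages).length); // 26 2
```

Because the pagination itself takes up space, the real page count is normally higher than this estimate, so the digit count can be off near powers of ten.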

Here is an example:

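function chunkify(str, size) {
  var words = str.split(/\s+/);
  var chunks = [];
  // Rough estimate of the page count; only its number of digits is used.
  var possiblePages = Math.ceil(str.length / size).toString();
  words.reduce((chunkStr, word, i, a) => {
    var isLast = i === a.length - 1;
    var pageIndex = ' ' + (chunks.length + 1) + '/';
    if ((chunkStr.length + word.length + pageIndex.length + possiblePages.length) + 1 > size) {
      // The word does not fit: close the current chunk and start a new one.
      chunks.push(chunkStr + pageIndex);
      chunkStr = word;
      // If that word was the last one, push it as its own chunk so it is not lost.
      if (isLast) {
        chunks.push(chunkStr + ' ' + (chunks.length + 1) + '/');
      }
    } else if (isLast) {
      chunks.push((chunkStr ? chunkStr + ' ' : '') + word + pageIndex);
    } else {
      // Avoid a leading space when the chunk is still empty.
      chunkStr = chunkStr ? chunkStr + ' ' + word : word;
    }
    return chunkStr;
  }, '');
  // Second pass: append the total page count to every chunk.
  return chunks.map(chunk => chunk + chunks.length.toString());
}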

var sampleStr = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris eu vestibulum purus. Praesent viverra, augue eu dapibus pulvinar, purus quam consequat neque, at euismod purus nunc ut diam. Sed in lectus vel lectus sodales ullamcorper. Pellentesque malesuada mi ut neque euismod, ac facilisis ligula malesuada. Nullam finibus suscipit enim nec laoreet. Vestibulum ornare, leo id dapibus semper, quam risus rutrum enim, vel suscipit odio felis consequat felis. Mauris et dolor nisl. Praesent sollicitudin auctor ultrices. Praesent libero sapien, ultrices vel purus et, feugiat bibendum nibh. Sed a luctus mi. Vivamus interdum posuere tellus nec cursus. Integer ut urna rutrum, sodales orci vel, fermentum nulla. Sed massa nibh, efficitur et tortor non, efficitur tristique sem.'
console.log(chunkify(sampleStr, 30));

console.log(chunkify(sampleStr, 50));

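Answer 1: (score: 0)

OK, this problem is trickier than it looks on first reading. That is because of some edge cases: the page numbers themselves can cause the text to reflow.

So one idea is to try different page-count widths, starting with /9, then /99, and so on.

Below is a rewrite of how I think it can be done. I have added comments, so hopefully it is easy to follow.

If you run the snippet, you will notice that the first example passes on the first attempt: it assumes 1/9 first, finds that this works, and breaks out of the loop.

The second example fails with 1/9 because it needs another character for the numbering, so it tries 1/99, 2/99, and so on. This works and it returns, so the second example needs 2 passes.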

const tests = [
  "This is a string which should be split into two.",
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris eu vestibulum purus. Praesent viverra, augue eu dapibus pulvinar, purus quam consequat neque, at euismod purus nunc ut diam. Sed in lectus vel lectus sodales"
];



function chunkify(message, characterLimit = 30) {
  // Start by assuming a one-digit page count (1/9 etc.).
  let tmpPage = "9";
  while (true) {
    console.log(`Trying ${tmpPage} page size`);
    // Split the message into words.
    let chunks = message.split(" ");
    // Keep track of the line number.
    let lineNo = 1;
    // Build one line, no longer than the maximum line length.
    function getLine () {
      let ret = "";
      let anydone = false;
      var pg = `${lineNo}/${tmpPage}`;
      while (chunks.length) {
        let newline = ret + 
          (ret === "" ? chunks[0] : " " + chunks[0]);
        if (newline.length + pg.length +1 > 
          characterLimit) {
          break;
        }
        ret = newline;
        anydone = true;
        chunks.splice(0, 1);
      }
      // For safety, make sure something was consumed, or we
      // would loop forever; throw an error instead. This can
      // happen, for instance, if a word is so long that it is
      // impossible to keep within the line length.
      if (!anydone) throw new Error("Can't do it");
      lineNo += 1;
      return ret;
    }
    const ret = [];
    // While any words remain, get another line.
    while (chunks.length) ret.push(getLine());
    if (ret.length.toString().length === tmpPage.length) {
      // Everything should be OK, so add the real
      // page numbers and return.
      return ret.map((i,ix) => 
        `${i} ${ix+1}/${ret.length}`);
    }
    // Add another digit and try again.
    tmpPage += "9";
  }
}


for (const t of tests) {
  console.log(chunkify(t, 30));
}

Answer 2: (score: 0)

You could store only the line number for now and postpone the length part (or omit it as well)

//chunk[chunk.length] = i + 1;
// better
chunk.push(i + 1);

and then add the length in the map step where the arrays are joined.

const output = chunks.map((chunk, _, { length }) => {
    return chunk.join(' ') + '/' + length;
});
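
The reason `{ length }` works here is that the third argument passed to a `map` callback is the array being mapped, so destructuring it yields the total number of chunks. A minimal sketch with made-up sample data (the contents of `chunks` are an assumption for illustration):

```javascript
// Chunks that already carry their own page number as the last item (sample data).
const chunks = [['This', 'is', 1], ['a', 'test', 2]];

// The third callback argument of map is the array itself, so `{ length }`
// destructures the total number of chunks.
const output = chunks.map((chunk, _, { length }) => chunk.join(' ') + '/' + length);

console.log(output); // ['This is 1/2', 'a test 2/2']
```

Note that this only defers writing the total; the chunks still have to be sized with enough room for it.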

Answer 3: (score: 0)

Here is the solution I came up with:

function chunkify(message, characterLimit = 30, errorMessage = 'Word is too long yo.') {
  if (message.length <= characterLimit) {
    return [message];
  }

  const words = message.split(' ');
  
  if (words.some((word) => word.length > characterLimit)) {
    throw(errorMessage);
  }

  let chunks = [];
  let currentChunk = [];
  let wordsCopy = [...words];
  let currentPage = 1;
  let pageCount = 1;
  let pageCountLength = 1;
  let didPageCountLengthChange = false;

  const getPaginationTemplate = (currentPage, pageCount) => {
    let currentPagePlaceholder = '';
    let pageCountPlaceholder = '';

    for (let i = 0, length = currentPage.toString().length; i < length; i += 1) {
      currentPagePlaceholder += '-';
    }

    for (let i = 0, length = pageCount.toString().length; i < length; i += 1) {
      pageCountPlaceholder += '-';
    }

    return ` ${currentPagePlaceholder}/${pageCountPlaceholder}`;
  };

  while (wordsCopy.length) {
    // Do it again T_T
    if (didPageCountLengthChange) {
      chunks = [];
      currentChunk = [];
      wordsCopy = [...words];
      currentPage = 1;
      didPageCountLengthChange = false;
    }

    const nextWord = wordsCopy.shift();

    currentChunk.push(nextWord);

    let isOverCharacterLimit = currentChunk.join(' ').length + getPaginationTemplate(currentPage, pageCount).length > characterLimit;

    // Check if a word and the pagination won't fit in a chunk.
    if (currentChunk.length === 1 && isOverCharacterLimit) {
      throw(errorMessage);
    }

    if (isOverCharacterLimit) {
      // Return the word to the words array.
      wordsCopy.unshift(currentChunk.pop());

      // Add the current chunk to the chunks array.
      chunks.push(currentChunk);
      currentChunk = [];

      // Increment page. 
      currentPage += 1;
      pageCount += 1;

      // Check if the pagination character length has changed.
      if (pageCountLength !== pageCount.toString().length) {
        pageCountLength += 1;
        didPageCountLengthChange = true;
      }
    } else if (!wordsCopy.length) { // Add the current chunk if it's the last word.
      chunks.push(currentChunk);
    }
  }

  // Replace the pagination placeholders with actual pagination.
  const output = chunks.map((chunk, i, arr) => {
    chunk.push(`${i + 1}/${arr.length}`);

    return chunk.join(' ');
  });

  return output;
}

console.log(chunkify('Hello, World!'));
console.log(chunkify('The quick brown fox jumps over the lazy dog'));
console.log(chunkify('Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi auctor sed sem nec dictum. Quisque placerat vitae ligula ac interdum. Mauris massa massa, tincidunt non erat id, faucibus consectetur dolor. Interdum et malesuada fames ac ante ipsum primis in faucibus. Suspendisse viverra justo ante, sed lacinia velit porttitor in. Etiam tincidunt magna nec odio tempus dictum. Ut a elementum quam. Nunc venenatis lacus et nisi condimentum, non mattis erat dapibus. Nam maximus tempor est, eu lobortis lectus bibendum eget. Duis sollicitudin pharetra massa, et lobortis purus fermentum sed. Praesent a ornare massa. Duis consectetur ipsum eu auctor suscipit. Curabitur sagittis enim quis faucibus finibus. Nam vel nulla in libero accumsan vulputate id et massa. Vestibulum malesuada lacus sem, sit amet gravida mi laoreet vel. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.'));
console.log(chunkify('Supercalifragilisticexpialidocious'));

Whenever the pagination character count changes (i.e. x/9 -> x/10), I restart the loop. I have not fully tested it, but it seems to work.
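
Since the solution is not fully tested, one way to gain some confidence is to check the output of any `chunkify` implementation against the requirements. The following is a minimal sketch; `verifyChunks` is a hypothetical helper, and it assumes the message uses single spaces between words and that every chunk ends with an "x/n" suffix (so it does not apply to the unpaginated single-chunk case).

```javascript
// Hypothetical checker: returns a list of problems found in `chunks`.
function verifyChunks(message, chunks, characterLimit = 30) {
  const problems = [];
  const words = [];

  chunks.forEach((chunk, i) => {
    // Requirement 1: no chunk may exceed the character limit.
    if (chunk.length > characterLimit) {
      problems.push(`chunk ${i + 1} is ${chunk.length} characters long`);
    }

    // Requirement 2: the chunk must end with the correct "x/n" suffix.
    const parts = chunk.split(' ');
    const suffix = parts.pop();
    if (suffix !== `${i + 1}/${chunks.length}`) {
      problems.push(`chunk ${i + 1} has suffix "${suffix}"`);
    }

    words.push(...parts);
  });

  // Requirement 3: the words, in order, must rebuild the original message.
  if (words.join(' ') !== message) {
    problems.push('words were lost, duplicated, or reordered');
  }

  return problems;
}

console.log(verifyChunks(
  'This is a string which should be split into two.',
  ['This is a string which 1/2', 'should be split into two. 2/2']
)); // []
```

An empty array means no problems were found for that input; it does not prove the implementation is correct for every input.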