Question

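Recently, in an interview, I was asked to convert the string "aabbbccccddddd" to "a2b3c4d5". The goal is to replace each run of a repeated character with a single occurrence followed by the repeat count. Here 'a' is repeated twice in the input, so it must be written as 'a2' in the output. I also needed to write a function that reverses the format back to the original (for example, from "a2b3c4d5" to "aabbbccccddddd"). I was free to use either C or C++. I wrote the code below, but the interviewer did not seem very happy with it. He asked me to try a smarter way than this.

In the code below, I use formatstring() to eliminate repeated characters by appending the repeat count, and reverseformatstring() to convert back to the original string.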

void formatstring(char* target, const char* source) {
  int charRepeatCount = 1;
  bool isFirstChar = true;
  while (*source != '\0') {
    if (isFirstChar) {
      // Always add the first character to the target
      isFirstChar = false;
      *target = *source;
      source++; target++;
    } else {
      // Compare the current char with previous one,
      // increment repeat count
      if (*source == *(source-1)) {
        charRepeatCount++;
        source++;
      } else {
        if (charRepeatCount > 1) {
          // Convert repeat count to string, append to the target
          char repeatStr[10];
          _snprintf(repeatStr, 10, "%i", charRepeatCount);
          int repeatCount = strlen(repeatStr);
          for (int i = 0; i < repeatCount; i++) {
            *target = repeatStr[i];
            target++;
          }
          charRepeatCount = 1; // Reset repeat count
        }
        *target = *source;
        source++; target++;
      }
    }
  }
  if (charRepeatCount > 1) {
    // Convert repeat count to string, append it to the target
    char repeatStr[10];
    _snprintf(repeatStr, 10, "%i", charRepeatCount);
    int repeatCount = strlen(repeatStr);
    for (int i = 0; i < repeatCount; i++) {
      *target = repeatStr[i];
      target++;
    }
  }
  *target = '\0';
}

void reverseformatstring(char* target, const char* source) {
  int charRepeatCount = 0;
  bool isFirstChar = true;
  while (*source != '\0') {
    if (isFirstChar) {
      // Always add the first character to the target
      isFirstChar = false;
      *target = *source;
      source++; target++;
    } else {
      // If current char is alpha, add it to the target
      if (isalpha(*source)) {
        *target = *source;
        target++; source++;
      } else {
        // Get repeat count of previous character
        while (isdigit(*source)) {
          int currentDigit = (*source) - '0';
          charRepeatCount = (charRepeatCount == 0) ?
              currentDigit : (charRepeatCount * 10 + currentDigit);
          source++;
        }
        // Decrement repeat count as we have already written
        // the first unique char to the target
        charRepeatCount--; 
        // Repeat the last char for this count
        while (charRepeatCount > 0) {
          *target = *(target - 1);
          target++;
          charRepeatCount--;
        }
      }
    }
  }
  *target = '\0';
}

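I don't see any problem with the code above. Is there a better way to do this?

Answer 1

The approach/algorithm is fine; perhaps you could refine and shrink the code a bit by doing something simpler (there is no need to solve this in an overly complicated way). And choose an indentation style that actually makes sense.

C solution: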

#include <stdio.h>

void print_transform(const char *input)
{
    for (const char *s = input; *s;) {
        char current = *s;
        size_t count = 1;
        while (*++s == current) {
            count++;
        }

        if (count > 1) {
            printf("%c%zu", current, count);
        } else {
            putc(current, stdout);
        }
    }

    putc('\n', stdout);
}

(This can easily be modified so that it returns the transformed string, or writes it into a sufficiently long buffer.)

C++ solution:
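As a rough sketch of the buffer variant mentioned above: the function name transform_to_buffer, its parameters, and the convention of returning -1 when the buffer is too small are all assumptions made for illustration, not part of the answer. It is written in C-style C++ and relies on snprintf reporting the length the output would have needed.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not from the answer): run-length encodes `input`
// into `out`, which holds `out_size` bytes. Returns the number of characters
// written (excluding the terminator), or -1 if `out` is too small.
int transform_to_buffer(const char *input, char *out, std::size_t out_size)
{
    std::size_t used = 0;
    for (const char *s = input; *s;) {
        char current = *s;
        std::size_t count = 1;
        while (*++s == current) {
            count++;
        }

        // snprintf returns the length the full text needs, so a result
        // >= the remaining room means the piece did not fit.
        std::size_t room = out_size - used;
        int n = (count > 1)
            ? std::snprintf(out + used, room, "%c%zu", current, count)
            : std::snprintf(out + used, room, "%c", current);
        if (n < 0 || static_cast<std::size_t>(n) >= room) {
            return -1;
        }
        used += static_cast<std::size_t>(n);
    }
    if (used >= out_size) {  // only possible for empty input and out_size == 0
        return -1;
    }
    out[used] = '\0';        // needed when the input was empty
    return static_cast<int>(used);
}
```

With a 32-byte buffer, the input "aabbbccccddddd" fills it with "a2b3c4d5" and the call returns 8; a 4-byte buffer is too small for "a2b3" plus its terminator, so "aabbb" makes the call return -1.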

#include <sstream>
#include <string>

std::string transform(const std::string &input)
{
    std::stringstream ss;
    std::string::const_iterator it = input.begin();

    while (it != input.end()) {
        char current = *it;
        std::size_t count = 1;
        while (++it != input.end() && *it == current) {
            count++;
        }

        if (count > 1) {
            ss << current << count;
        } else {
            ss << current;
        }
    }

    return ss.str();
}

Answer 2

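Since several others have suggested very reasonable alternatives, I'd like to offer some comments on what I think is your underlying question: "He asked me to try a smarter way than this.... Is there a better way to do this?"

When I interview a developer, I'm looking for signals that tell me how she approaches a problem:

Most important, as H₂CO₃ said, is correctness: does the code work? I'm usually happy to overlook small syntax errors (forgotten semicolons, mismatched parens or braces, and so on) if the algorithm is sound.
Proper use of the language, especially if the candidate claims expertise or has extensive experience. Does he understand and use idioms appropriately to write direct, uncomplicated code?
Can she explain her line of thought as she works out a solution? Is it logical and coherent, or is it a shotgun approach? Is she able and willing to communicate well?
Does he account for edge cases? If so, does the intrinsic algorithm handle them, or is everything a special case? Although I'm happiest when the initial algorithm "just works" for all cases, I think it's perfectly acceptable to start with a verbose approach that covers every case (or simply to add a "TODO" comment noting that more work is needed) and to simplify later, when patterns or duplicated code may be easier to spot.
Does she consider error handling? Usually, if a candidate starts by asking whether she can assume the input is valid, or with a comment like, "If this were production code, I'd check for x, y, and z problems," I'll ask what she would do, then suggest she focus on a working algorithm for now and (maybe) come back to that later. But I'm disappointed if a candidate doesn't mention it.
Testing, testing, testing! How will the candidate verify that his code works? Does he walk through the code and suggest test cases, or does he need to be reminded? Are the test cases sensible? Do they cover the edge cases?
Optimization: as a final step, after everything works and has been validated, I'll sometimes ask the candidate whether she can improve her code. Bonus points if she suggests it without my prodding; negative points if she spends a lot of effort worrying about it before the code even works.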
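To make the testing and edge-case points concrete, here is a minimal sketch of the kind of case table a candidate might propose. The encode function is only a stand-in so the sketch compiles on its own, and the names encode and run_encode_cases are hypothetical; the cases themselves (empty input, single character, run at the end, multi-digit count) are the part that matters.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in encoder for this sketch (hypothetical; any implementation
// under test could be substituted here).
std::string encode(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t j = i;
        while (j < in.size() && in[j] == in[i]) ++j;  // find the end of the run
        out += in[i];
        if (j - i > 1) out += std::to_string(j - i);  // count only for real runs
        i = j;
    }
    return out;
}

// Runs a small table of input/expected pairs and returns how many failed.
int run_encode_cases()
{
    const std::vector<std::pair<std::string, std::string>> cases = {
        {"", ""},                        // empty input
        {"a", "a"},                      // single character, no count
        {"abc", "abc"},                  // no repeats at all
        {"aabbbccccddddd", "a2b3c4d5"},  // the interview example
        {"aab", "a2b"},                  // run followed by a single character
        {"abb", "ab2"},                  // run at the very end of the string
        {std::string(12, 'a'), "a12"},   // count with more than one digit
    };
    int failures = 0;
    for (const auto &c : cases) {
        if (encode(c.first) != c.second) ++failures;
    }
    return failures;
}
```

A table like this is quick to write on a whiteboard and makes it obvious which boundary conditions have been thought about.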

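Applying these ideas to the code you wrote, I'd make these observations:

Using const appropriately is a plus, as it shows familiarity with the language. During an interview I'd probably ask a question or two about why and when to use it.

The proper use of char pointers throughout the code is a good sign. I tend to be pedantic about making data types explicit in comparisons, particularly during interviews, so I'm happy to see, for example, while (*source != '\0') rather than the (common, correct, but IMO less careful) while(*source).

isFirstChar is a bit of a red flag, based on my "edge cases" point. When you declare a boolean to keep track of the code's state, there's often a way of re-framing the problem so that the condition is handled intrinsically. In this case, you can use charRepeatCount to decide whether this is the first character in a possible series, so you don't need to test explicitly for the first character in the string.

By the same token, repeated code can also be a sign that an algorithm can be simplified. One improvement would be to move the conversion of charRepeatCount into a separate function. See below for an even better solution.

It's funny, but I've found that candidates rarely add comments to their code during interviews. Kudos for the helpful ones, negative points for those of the "Increment the counter" kind, which add verbosity without information. It's generally accepted that, unless you're doing something weird (in which case you should reconsider what you've written), you should assume that the person reading your code is familiar with the programming language. So comments should explain your thought process, not translate the code back into English.

Excessive levels of nested conditionals or loops can also be a warning. You can eliminate one level of nesting by comparing each character to the next one instead of the previous one. This works even for the last character in the string, because it is compared to the terminating null character, which won't match and can be treated like any other character.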
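A minimal sketch of that look-ahead idea, assuming the same contract as the original formatstring() (the caller supplies a target buffer at least as large as the source string including its terminator). The name format_by_lookahead is made up for this illustration.

```cpp
#include <cstdio>

// Hypothetical rewrite of formatstring(): each character is compared with
// the NEXT one, so no isFirstChar flag and no duplicated flush code is needed.
void format_by_lookahead(char *target, const char *source)
{
    int run = 1;                        // length of the current run so far
    for (; *source != '\0'; ++source) {
        if (source[0] == source[1]) {   // the run continues with the next char
            ++run;
            continue;
        }
        // The run ends here. For the last run, source[1] is the '\0'
        // terminator, which never matches a non-null character.
        *target++ = *source;
        if (run > 1) {
            target += std::sprintf(target, "%d", run);  // returns chars written
        }
        run = 1;
    }
    *target = '\0';
}
```

The encoded form is never longer than the input (a run of n >= 2 characters becomes one character plus at most n - 1 digits), which is why a target the size of the source is enough here.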

There are simpler ways to convert charRepeatCount from an int to a string. For example, _snprintf() returns the number of bytes it "prints" to the string, so you can use
target += _snprintf(target, 10, "%i", charRepeatCount);

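In your reverse function, you've used the ternary operator perfectly... but there's no need to special-case the zero value: the math is the same regardless of its value. Again, there are also standard utility functions like atoi() that convert the leading digits of a string into an integer for you.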
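Both points can be sketched side by side. The helper names below are hypothetical, and the second one uses strtol rather than atoi because strtol also reports where the digits ended, which is an assumption about what is most convenient here rather than something the answer prescribes.

```cpp
#include <cctype>
#include <cstdlib>

// Hypothetical helper: reads the decimal number at *p by hand and advances
// p past it. Starting from 0 needs no special case: 0 * 10 + d is just d.
int read_count_by_hand(const char *&p)
{
    int count = 0;
    while (std::isdigit(static_cast<unsigned char>(*p))) {
        count = count * 10 + (*p - '0');
        ++p;
    }
    return count;
}

// Hypothetical helper: same result via the standard library. strtol stores
// a pointer to the first unconverted character in `end`.
int read_count_with_strtol(const char *&p)
{
    char *end = nullptr;
    long count = std::strtol(p, &end, 10);
    p = end;
    return static_cast<int>(count);
}
```

Note that strtol (like atoi) also skips leading whitespace and accepts a sign, so the hand-written loop is the stricter of the two.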

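Experienced developers will often include the increment or decrement operation as part of the loop condition, rather than as a separate statement at the bottom: while(charRepeatCount-- > 0). I'll raise my eyebrows if you write this using the slide operator: while (charRepeatCount --> 0), but I'll give you points for humor and personality. But only if you promise not to use it in production.

Good luck with your interviews!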

Answer 3

I think your code is too complicated. Here's my approach (in C):

#include <ctype.h>
#include <stdio.h>

void format_str(char *target, char *source) {
    int count;
    char last;
    while (*source != '\0') {
        *target = *source;
        last = *target;
        target++;
        source++;
        for (count = 1; *source == last; source++, count++)
            ; /* Intentionally left blank */
        if (count > 1)
            target += sprintf(target, "%d", count);
    }
    *target = '\0';
}

void convert_back(char *target, char *source) {
    char last;
    int val;
    while (*source != '\0') {
        if (!isdigit((unsigned char) *source)) {
            last = *source;
            *target = last;
            target++;
            source++;
        }
        else {
            for (val = 0; isdigit((unsigned char) *source); val = val*10 + *source - '0', source++)
                ; /* Intentionally left blank */
            while (--val) {
                *target = last;
                target++;
            }
        }
    }
    *target = '\0';
}

format_str compresses the string, and convert_back decompresses it.

Answer 4

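Your code "works", but it doesn't follow some common patterns used in C++. You should:

use std::string instead of plain char* array(s);
pass that string as a const reference to avoid modification, since you write the result somewhere else;
use C++11 features such as range-based for loops and lambdas.

I think the interviewer's purpose was to test your ability to deal with the C++11 standard, since the algorithm itself is pretty trivial.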
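One way those three suggestions might look together, as a sketch only: the function name compress and the flush lambda are invented for this example and are not taken from the answer.

```cpp
#include <string>

// Hypothetical C++11 version: std::string passed by const reference,
// a range-based for loop, and a lambda that writes out a finished run.
std::string compress(const std::string &input)
{
    std::string out;
    char current = '\0';
    std::size_t count = 0;  // 0 means no run has been started yet

    // Appends the pending run (if any) to the output.
    auto flush = [&]() {
        if (count == 0) return;
        out += current;
        if (count > 1) out += std::to_string(count);
    };

    for (char c : input) {
        if (count > 0 && c == current) {
            ++count;        // same character: extend the run
        } else {
            flush();        // different character: emit the old run
            current = c;
            count = 1;
        }
    }
    flush();                // emit the final run
    return out;
}
```

Called as compress("aabbbccccddddd"), this returns "a2b3c4d5"; an empty input yields an empty string because flush does nothing while count is 0.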

Answer 5

Perhaps the interviewer wanted to test your knowledge of existing standard library tools. Here's my take on it in C++:

#include <string>
#include <sstream>
#include <algorithm>
#include <iostream>

typedef std::string::const_iterator Iter;

std::string foo(Iter first, Iter last)
{
    Iter it = first;
    std::ostringstream result;
    while (it != last) {
        it = std::find_if(it, last, [=](char c){ return c != *it; });
        result << *first << (it - first);
        first = it;
    }
    return result.str();    
}

int main()
{
    std::string s = "aaabbbbbbccddde";
    std::cout << foo(s.begin(), s.end());
}

Empty input needs an extra check.

Answer 6

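Try this:

// Caveat: this counts every occurrence of each character value, not
// consecutive runs. It yields a2b3c4d5 here only because each character
// forms a single run and the characters appear in ascending order.
// A character that occurs once is printed with its count too (e.g. e1).
std::string str="aabbbccccddddd";

for(int i=0;i<255;i++)
{
    int c=0;
    for(int j=0;j<str.length();j++)
    {
        if(str[j] == i)
            c++;
    }
    if(c>0)
        printf("%c%d",i,c);
}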

Answer 7

My naive approach:

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void pack( char const * SrcStr, char * DstBuf ) {

    char const * Src_Ptr = SrcStr;
    char * Dst_Ptr = DstBuf;

    char c = 0;
    int RepeatCount = 1;

    while( '\0' != *Src_Ptr ) {

        c = *Dst_Ptr = *Src_Ptr;
        ++Src_Ptr; ++Dst_Ptr;

        for( RepeatCount = 1; *Src_Ptr == c; ++RepeatCount ) {
            ++Src_Ptr;
        }

        if( RepeatCount > 1 ) {
            Dst_Ptr += sprintf( Dst_Ptr, "%i", RepeatCount );
            RepeatCount = 1;
        }
    }

    *Dst_Ptr = '\0';
};

void unpack( char const * SrcStr, char * DstBuf ) {

    char const * Src_Ptr = SrcStr;
    char * Dst_Ptr = DstBuf;

    char c = 0;

    while( '\0' != *Src_Ptr ) {

        if( !isdigit( *Src_Ptr ) ) {
            c = *Dst_Ptr = *Src_Ptr;
            ++Src_Ptr; ++Dst_Ptr;

        } else {
            int repeat_count = strtol( Src_Ptr, (char**)&Src_Ptr, 10 );
            memset( Dst_Ptr, c, repeat_count - 1 );
            Dst_Ptr += repeat_count - 1;
        }
    }

    *Dst_Ptr = '\0';
};

But if the interviewer asks for error handling, then the solution becomes much more complex (and ugly). My portable approach:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <ctype.h>

// for MSVC
#ifdef _WIN32
    #define snprintf sprintf_s
#endif

int pack( char const * SrcStr, char * DstBuf, size_t DstBuf_Size ) {

    int Err = 0;

    char const * Src_Ptr = SrcStr;
    char * Dst_Ptr = DstBuf;

    size_t SrcBuf_Size = strlen( SrcStr ) + 1;
    char const * SrcBuf_End = SrcStr + SrcBuf_Size;
    char const * DstBuf_End = DstBuf + DstBuf_Size;

    char c = 0;
    int RepeatCount = 1;

    // don't forget about buffers intercrossing
    if( !SrcStr || !DstBuf || 0 == DstBuf_Size \
        || (DstBuf < SrcBuf_End && DstBuf_End > SrcStr) ) {

        return 1;
    }

    // source string must contain no digits
    // check for destination buffer overflow
    while( '\0' != *Src_Ptr && Dst_Ptr < DstBuf_End - 1 \
        && !isdigit( *Src_Ptr ) && !Err ) {

        c = *Dst_Ptr = *Src_Ptr;
        ++Src_Ptr; ++Dst_Ptr;

        for( RepeatCount = 1; *Src_Ptr == c; ++RepeatCount ) {
            ++Src_Ptr;
        }

        if( RepeatCount > 1 ) {
            int res = snprintf( Dst_Ptr, DstBuf_End - Dst_Ptr - 1, "%i" \
                , RepeatCount );
            if( res < 0 ) {
                Err = 1;
            } else {
                Dst_Ptr += res;
                RepeatCount = 1;
            }
       }
    }

    *Dst_Ptr = '\0';

    return Err;
};

int unpack( char const * SrcStr, char * DstBuf, size_t DstBuf_Size ) {

    int Err = 0;

    char const * Src_Ptr = SrcStr;
    char * Dst_Ptr = DstBuf;

    size_t SrcBuf_Size = strlen( SrcStr ) + 1;
    char const * SrcBuf_End = SrcStr + SrcBuf_Size;
    char const * DstBuf_End = DstBuf + DstBuf_Size;

    char c = 0;

    // don't forget about buffers intercrossing
    // first character of source string must be non-digit
    if( !SrcStr || !DstBuf || 0 == DstBuf_Size \
        || (DstBuf < SrcBuf_End && DstBuf_End > SrcStr) || isdigit( SrcStr[0] ) ) {

        return 1;
    }

    // check for destination buffer overflow
    while( '\0' != *Src_Ptr && Dst_Ptr < DstBuf_End - 1 && !Err ) {

        if( !isdigit( *Src_Ptr ) ) {
            c = *Dst_Ptr = *Src_Ptr;
            ++Src_Ptr; ++Dst_Ptr;

        } else {
            int repeat_count = strtol( Src_Ptr, (char**)&Src_Ptr, 10 );
            if( !repeat_count || repeat_count - 1 > DstBuf_End - Dst_Ptr - 1 ) { 
                Err = 1;
            } else {
                memset( Dst_Ptr, c, repeat_count - 1 );
                Dst_Ptr += repeat_count - 1;
            }
        }
    }

    *Dst_Ptr = '\0';

    return Err;
};

int main() {

    char str[] = "aabbbccccddddd";
    char buf1[128] = {0};
    char buf2[128] = {0};

    pack( str, buf1, 128 );
    printf( "pack: %s -> %s\n", str, buf1 );

    unpack( buf1, buf2, 128 );
    printf( "unpack: %s -> %s\n", buf1, buf2 );

    return 0;
}

Test: http://ideone.com/Y7FNE3. It also works with MSVC.

Answer 8

An attempt with less boilerplate:

#include <iostream>
#include <iterator>
#include <sstream>
using namespace std;

template<typename in_iter,class ostream>
void torle(in_iter i, ostream &&o)
{
        while (char c = *i++) {
                size_t n = 1;
                while ( *i == c )
                        ++n, ++i;
                o<<c<<n;
        }
}

template<class istream, typename out_iter>
void fromrle(istream &&i, out_iter o)
{
        char c; size_t n;
        while (i>>c>>n)
                while (n--) *o++=c;
}

int main()
{
    typedef ostream_iterator<char> to;
    string line; stringstream converted;
    while (getline(cin,line)) {
        torle(begin(line),converted);
        cout<<converted.str()<<'\n';
        fromrle(converted,ostream_iterator<char>(cout));
        cout<<'\n';
    }
}

String formatting using C / C++
