Question

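What is a good way to replace single-line // comments with multi-line /* comments */?

I have no preference about the language used to do the job; I was thinking of Perl or sed. The source language is C (ANSI X3.159-1989).

Simple scripts like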

while(<>) {
  if (m#^(.*?)//#) {
    print $1;
  } else {
    print $_;
  }
}

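will be fooled by strings that contain //, so that won't do. Likewise, a // inside a multi-line comment should be left unchanged.

Edit: The code may be assumed to contain no trigraphs.

This is the reverse of "replace C style comments by C++ style comments". It is similar to "Replacing // comments with /* comments */ in PHP" (though the accepted answer there does not handle the special cases I mention, and so is arguably wrong).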

Answer 1

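There are lots of corner cases to consider. A stray // can appear in string literals, in character constants (yes, really), and inside both /* ... */ and // comments. Line splicing with a trailing \ character can confuse things, and \ can itself be written as the trigraph ??/. I seriously doubt that I have thought of everything.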

If you need a 100% reliable replacement, you will have to reproduce (or steal!) part of a C compiler's preprocessor.
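To give a rough idea of what reproducing that part of the preprocessor involves, here is a minimal sketch of a hand-written scanner in Python. It is not a complete C lexer: it assumes there are no trigraphs (as the question allows) and that no line splice falls in the middle of a // or /* token. The function name `convert_comments` is made up for this illustration.

```python
def convert_comments(src: str) -> str:
    """Rewrite // comments as /* ... */, copying literals and block comments as-is."""
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch in "\"'":
            # String literal or character constant: copy up to the closing
            # quote, skipping escaped characters; give up at a raw newline.
            j = i + 1
            while j < n and src[j] != ch and src[j] != "\n":
                j += 2 if src[j] == "\\" else 1
            j = min(j + 1, n)
        elif src.startswith("/*", i):
            # Block comment: copy through the closing */ (or to end of input).
            end = src.find("*/", i + 2)
            j = n if end == -1 else end + 2
        elif src.startswith("//", i):
            # Line comment: runs to the newline, but a backslash-newline
            # splice continues it onto the next physical line.
            j = i + 2
            while j < n and src[j] != "\n":
                j += 2 if src.startswith("\\\n", j) else 1
            body = src[i + 2:j]
            text = body.rstrip()
            # Defuse any */ inside the comment so it cannot close the new one early.
            out.append("/*" + text.replace("*/", "* /") + " */" + body[len(text):])
            i = j
            continue
        else:
            j = i + 1
        out.append(src[i:j])
        i = j
    return "".join(out)
```

One way to use it would be `sys.stdout.write(convert_comments(sys.stdin.read()))`. Reading the whole file at once, rather than line by line, is what lets the scanner follow block comments and spliced // comments across line boundaries.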

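If you do not need 100% reliability, you might consider doing a naive replacement, then comparing the input with the output and cleaning up any problems by hand. (For typical code there will probably be none, but you need to check.) How practical this approach is depends partly on how much code you need to translate.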
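The replace-then-review workflow could be sketched as follows in Python. Both function names (`naive_convert`, `changed_lines`) are hypothetical, and the regular expression is deliberately naive: it treats the first // on every line as the start of a comment, so the list of changed lines still has to be checked by a person.

```python
import re


def naive_convert(source: str) -> str:
    """Blindly rewrite the text after the first // on each line as /* ... */."""
    return re.sub(r"//(.*)$", r"/*\1 */", source, flags=re.MULTILINE)


def changed_lines(before: str, after: str):
    """Return (line number, old line, new line) for every line that changed.

    The naive conversion never adds or removes lines, so the two texts can
    be compared line by line.
    """
    pairs = zip(before.splitlines(), after.splitlines())
    return [(n, old, new) for n, (old, new) in enumerate(pairs, 1) if old != new]
```

Printing each tuple from `changed_lines` gives a short review list. A line where the // was inside a string literal shows up there like any other change, and it is up to the reviewer (or the compiler) to reject it.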

Most of the corner cases result in code that will not compile:

printf("Hello // world\n");

->

printf("Hello /* world\n"); */

You might also consider whether this is really necessary. Most C89/C90 compilers support // comments, at least optionally.

Answer 2

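You can use the output of the boost::wave lexer to replace all C++-style comments with C-style comments, without having to worry about the edge cases yourself.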

#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

#include <boost/wave/cpplexer/cpp_lex_token.hpp>
#include <boost/wave/cpplexer/cpp_lex_iterator.hpp>

typedef boost::wave::cpplexer::lex_token<> token_type;
typedef boost::wave::cpplexer::lex_iterator<token_type> token_iterator;
typedef token_type::position_type position_type;

int main()
{
    const char* infile = "infile.h";
    const char* outfile = "outfile.h";

    std::ifstream instream(infile);
    if (!instream.is_open()) {
        std::cerr << "Could not open file: " << infile << "\n";
        return 1;
    }
    std::ofstream outstream(outfile);
    if (!outstream.is_open()) {
        std::cerr << "Could not open file: " << outfile << "\n";
        return 1;
    }

    // Read the whole input file into a string, keeping whitespace.
    instream.unsetf(std::ios::skipws);
    std::string instr = std::string(std::istreambuf_iterator<char>(instream.rdbuf()),
                                    std::istreambuf_iterator<char>());

    position_type pos(infile);
    token_iterator it = token_iterator(instr.begin(), instr.end(), pos,
        boost::wave::language_support(boost::wave::support_cpp | boost::wave::support_option_long_long));
    token_iterator end = token_iterator();

    std::stringstream outstrm;
    // Test the iterator before dereferencing it, so the end iterator is never read.
    for (; it != end; ++it) {
        boost::wave::token_id id = *it;
        // Here you check for the C++-style comments.
        if (id == boost::wave::T_CPPCOMMENT) {
            std::cout << "Found CPP COMMENT\n";
            std::string cmt_str(it->get_value().c_str());
            // The comment token normally ends with the newline that terminates it;
            // take it off so that the closing */ stays on the same line.
            bool had_newline = !cmt_str.empty() && cmt_str[cmt_str.size() - 1] == '\n';
            if (had_newline) {
                cmt_str.erase(cmt_str.size() - 1);
            }
            cmt_str[1] = '*';   // turn the leading "//" into "/*"
            cmt_str += " */";
            // Then put the newline back at the end of the string.
            if (had_newline) {
                cmt_str.push_back('\n');
            }
            outstrm << cmt_str;
        }
        else {
            outstrm << it->get_value();
        }
    }
    // Write the converted text, not the stream object itself.
    outstream << outstrm.str();

    return 0;
}

For further documentation, see: http://www.boost.org/doc/libs/1_47_0/libs/wave/index.html

Answer 3

This will not cover 100% of the corner cases, but it covers the ones you mention in your question.

#!/usr/bin/env python3

import re
from sys import stdin

# String literals, character constants, and /* ... */ comments that open and
# close on the same line. Block comments spanning several lines are not handled.
SKIP = re.compile(r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'|/\*.*?\*/')

for line in stdin:
  line = line.rstrip('\n') # Trim the newline
  # Blank out strings and block comments, keeping every column in place
  masked = SKIP.sub(lambda m: ' ' * len(m.group()), line)
  pos = masked.find('//') # Only match actual C++-style comments
  if pos >= 0:
    content = line[:pos] # Get the original line sans comment
    comment = line[pos + 2:].strip()
    print('%s/* %s */' % (content, comment)) # Combine the two with C-style comments
  else:
    print(line)

Replace C++ single-line comments with C89 comments
