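I am trying to build a regular expression that removes all comments from JavaScript code, both single-line (// ...) and multi-line (/*..*/). This is what I came up with: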
/\"[^\"]*\"|'[^']*'|(\/\/.*$|\/\*[^\*]*\*\/)/mg
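Description: as you can see, it also searches for string literals. This is because a string literal can contain content that would otherwise match a comment pattern (for example, location.href = "http://www.domain.com"; would be matched as a single-line comment). So I put the string-literal patterns first in the alternation. After them come two patterns that capture single-line and multi-line comments respectively. They are enclosed in the same capturing group, so I can use string.replace(pattern, "") to remove the comments.
I tested the expression with several js files and it seems to work. My question is whether there are other patterns I should look for, or other things to consider (for example, limited support for regular expressions in some browsers, which would call for an alternative implementation).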
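To illustrate how the pattern behaves, here is a minimal sketch. It assumes a replacer callback instead of a plain empty-string replacement, because replacing every match with "" would also delete the string literals that the first two alternatives match. The helper name stripWithQuestionPattern and the sample inputs are made up for this example.

```javascript
// The pattern from the question, unchanged.
const pattern = /\"[^\"]*\"|'[^']*'|(\/\/.*$|\/\*[^\*]*\*\/)/mg;

// Hypothetical helper: keep string literals, drop whatever group 1 captured.
function stripWithQuestionPattern(source) {
  return source.replace(pattern, (match, comment) =>
    comment === undefined ? match : '' // group 1 is undefined for string literals
  );
}

const sample = 'location.href = "http://www.domain.com"; // go\n/* block */ var x = 1;';
const result = stripWithQuestionPattern(sample);
console.log(JSON.stringify(result));

// A block comment containing a lone '*' is not matched, because [^\*]* cannot step over it.
const missed = stripWithQuestionPattern('/* a * b */ x');
console.log(JSON.stringify(missed));
```

In this sketch the URL inside the string survives and both comments are removed, while the second input comes back unchanged. That suggests at least one more case to handle: block comments with an asterisk in the body. Escaped quotes inside string literals are another case the pattern does not cover.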
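Answer 0 (score: 1)
Use a C/C++ style comment stripper. The regex below does the job.
There are two forms of the regex, both of which preserve formatting:
Form 1 is built with the horizontal-whitespace class \h and the newline \n.
Form 2 is built with [ \t] and \r?\n.
Form 1 needs an engine that supports \h (PCRE, for example); JavaScript's RegExp does not, so use Form 2 there.
The flags are multiline and global.
The replacement is capture group 2: $2 or \2.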
Form 1:
raw: ((?:(?:^\h*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:\h*\n(?=\h*(?:\n|/\*|//)))?|//(?:[^\\]|\\\n?)*?(?:\n(?=\h*(?:\n|/\*|//))|(?=\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|[\S\s][^/"'\\\s]*)
delimited: /((?:(?:^\h*)?(?:\/\*[^*]*\*+(?:[^\/*][^*]*\*+)*\/(?:\h*\n(?=\h*(?:\n|\/\*|\/\/)))?|\/\/(?:[^\\]|\\\n?)*?(?:\n(?=\h*(?:\n|\/\*|\/\/))|(?=\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|[\S\s][^\/"'\\\s]*)/mg
Form 2:
raw: ((?:(?:^[ \t]*)?(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|/\*|//)))?|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|/\*|//))|(?=\r?\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|(?:\r?\n|[\S\s])[^/"'\\\s]*)
delimited: /((?:(?:^[ \t]*)?(?:\/\*[^*]*\*+(?:[^\/*][^*]*\*+)*\/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/)))?|\/\/(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/))|(?=\r?\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|(?:\r?\n|[\S\s])[^\/"'\\\s]*)/mg
Expanded version of Form 2:
( # (1 start), Comments
(?:
(?: ^ [ \t]* )? # <- To preserve formatting
(?:
/\* # Start /* .. */ comment
[^*]* \*+
(?: [^/*] [^*]* \*+ )*
/ # End /* .. */ comment
(?: # <- To preserve formatting
[ \t]* \r? \n
(?=
[ \t]*
(?: \r? \n | /\* | // )
)
)?
|
// # Start // comment
(?: # Possible line-continuation
[^\\]
| \\
(?: \r? \n )?
)*?
(?: # End // comment
\r? \n
(?= # <- To preserve formatting
[ \t]*
(?: \r? \n | /\* | // )
)
| (?= \r? \n )
)
)
)+ # Grab multiple comment blocks if need be
) # (1 end)
| ## OR
( # (2 start), Non - comments
"
(?: \\ [\S\s] | [^"\\] )* # Double quoted text
"
| '
(?: \\ [\S\s] | [^'\\] )* # Single quoted text
'
| (?: \r? \n | [\S\s] ) # Linebreak or Any other char
[^/"'\\\s]* # Chars which don't start a comment, string, escape,
# or line continuation (escape + newline)
) # (2 end)
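As a usage sketch, Form 2 can be dropped into JavaScript as a regex literal with '$2' as the replacement string. The wrapper name stripWithForm2 and the sample input are assumptions made for this example. Regex literals and template literals in the source are not handled by this pattern.

```javascript
// Form 2 from above, copied verbatim (flags: multiline + global).
const form2 = /((?:(?:^[ \t]*)?(?:\/\*[^*]*\*+(?:[^\/*][^*]*\*+)*\/(?:[ \t]*\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/)))?|\/\/(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n(?=[ \t]*(?:\r?\n|\/\*|\/\/))|(?=\r?\n))))+)|("(?:\\[\S\s]|[^"\\])*"|'(?:\\[\S\s]|[^'\\])*'|(?:\r?\n|[\S\s])[^\/"'\\\s]*)/mg;

// Hypothetical wrapper: every match is replaced by capture group 2.
// When a comment matched (group 1), group 2 did not participate, so '$2' is empty.
function stripWithForm2(source) {
  return source.replace(form2, '$2');
}

const sample = 'var a = 1; // note\nvar b = "http://x"; /* c */\n';
const stripped = stripWithForm2(sample);
console.log(JSON.stringify(stripped));
```

Because every non-comment token is matched by group 2 and written back, the code and the string containing "//" pass through untouched while both comments disappear.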
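Answer 1 (score: 0)
Take a look at this code. Although it is written for PHP, I think the pattern is correct. You can adapt the pattern for JavaScript.
Answer 2 (score: 0)
This can be done in plain JavaScript without regex, but with some limitations. I implemented something for you (it took 25 minutes). The approach is to parse the source file line by line. If your js file is well formed and does not hit the 3 exceptions, the result is correct.
The implementation is here: http://jsfiddle.net/ch14em6w/
Here is the key part of the code: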
//parse file input
function displayFileLineByLine(contents)
{
var lines = contents.split('\n');
var element = document.getElementById('file-content');
var output = '';
for(var line = 0; line < lines.length; line++){
var normedline = stripOut(lines[line]);
if (normedline.length > 0 )
{
output += normedline;
}
}
element.innerHTML = output;
}
// global-scope flag: true while a '/*' comment is still open
var GlobalComentOpen = false;
// recursively removes comments from a single line
function stripOut(stringline, step){
// index of block comment start
var igcS = stringline.indexOf('/*');
// index of block comment end
var igcE = stringline.indexOf('*/');
// index of inline comment position
var iicP = stringline.indexOf('//');
var gorecursive = false;
if (igcS != -1)
{
gorecursive = true;
if (igcS < igcE) {
stringline = stringline.replace(stringline.slice(igcS, igcE +2), "");
}
else if (igcS > igcE && GlobalComentOpen) {
stringline = stringline.replace(stringline.slice(0, igcE +2), "");
igcS = stringline.indexOf('/*');
stringline = stringline.replace(stringline.slice(igcS, stringline.length), "");
}
else if (igcE == -1){
GlobalComentOpen = true;
stringline = stringline.replace(stringline.slice(igcS, stringline.length), "");
}
else
{
console.log('incorrect format');
}
}
if (!gorecursive && igcE != -1)
{
gorecursive = true;
GlobalComentOpen = false;
stringline = stringline.replace(stringline.slice(0, igcE +2), "");
}
if (!gorecursive && iicP != -1)
{
gorecursive = true;
stringline = stringline.replace(stringline.slice(iicP, stringline.length), "");
}
if (!gorecursive && GlobalComentOpen && step == undefined)
{
return "";
}
if (gorecursive)
{
step = step == undefined ? 0 : step + 1; // step++ would return the old value and never advance
return stripOut(stringline, step);
}
return stringline;
}
Answer 3 (score: 0)
import prettier from 'prettier';
function decomment(jsCodeStr) {
const options = { printWidth: 160, singleQuote: true, trailingComma: 'none' };
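// Actually strip comments. Note: this relies on passing a custom parser function,
// which Prettier 1.x/2.x accepted; as far as I know, Prettier 3 removed that option.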
options.parser = (text, { babel }) => {
const ast = babel(text);
delete ast.comments;
return ast;
};
return prettier.format(jsCodeStr, options);
}
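Answer 4 (score: 0)
Update: this is C# code, and I guess this is not the right place for it. Anyway, here it is.
I have had good results with the following class.
It has not been tested with comments inside strings, e.g.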
a = "hi /* comment */ there";
a = "hi there // ";
The class detects // comments at the beginning of a line or after at least one whitespace character. So the following work.
a = "hi// there";
a = "hi//there";
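The behaviour described above can be checked with a small JavaScript sketch. The regex is a close port of the two C# patterns used by the class (\s//.* and /\*[\s\S]*?\*/), not the class itself, and the function name removeComments plus the variable names are assumptions for this example.

```javascript
// JavaScript port (an assumption) of SSingleLineComments + "|" + SMultiLineComments.
const commentPattern = /\s\/\/.*|\/\*[\s\S]*?\*\//gim;

function removeComments(text) {
  return text.replace(commentPattern, '');
}

// No whitespace directly before '//', so these strings are left alone.
const kept1 = removeComments('a = "hi// there";');
const kept2 = removeComments('a = "hi//there";');

// Comment markers inside strings that the pattern cannot tell apart from real comments.
const broken1 = removeComments('a = "hi there // ";');
const broken2 = removeComments('a = "hi /* comment */ there";');

console.log(kept1, kept2);
console.log(JSON.stringify(broken1), JSON.stringify(broken2));
```

The first two inputs come back unchanged, while the last two are truncated or altered, which matches the untested cases listed above.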
Here is the code
using System;
using System.IO;
using System.Text.RegularExpressions;

static public class CommentRemover
{
static readonly RegexOptions ROptions = RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Multiline;
const string SSingleLineComments = @"\s//.*"; // comments with // in the beginning of a line or after a space
const string SMultiLineComments = @"/\*[\s\S]*?\*/";
const string SCommentPattern = SSingleLineComments + "|" + SMultiLineComments;
const string SEmptyLinePattern = @"^\s+$[\r\n]*";
static Regex CommentRegex;
static Regex EmptyLineRegex;
static public string RemoveEmptyLines(string Text)
{
if (EmptyLineRegex == null)
EmptyLineRegex = new Regex(SEmptyLinePattern, ROptions);
return EmptyLineRegex.Replace(Text, string.Empty);
}
static public string RemoveComments(string Text)
{
if (CommentRegex == null)
CommentRegex = new Regex(SCommentPattern, ROptions);
return CommentRegex.Replace(Text, string.Empty);
}
static public string RemoveComments(string Text, string Pattern)
{
Regex R = new Regex(Pattern, ROptions);
return R.Replace(Text, string.Empty);
}
static public string Execute(string Text)
{
Text = RemoveComments(Text);
Text = RemoveEmptyLines(Text);
return Text;
}
static public void ExecuteFile(string SourceFilePth, string DestFilePath)
{
string DestFolder = Path.GetDirectoryName(DestFilePath);
Directory.CreateDirectory(DestFolder);
string Text = File.ReadAllText(SourceFilePth);
Text = Execute(Text);
File.WriteAllText(DestFilePath, Text);
}
static public void ExecuteFolder(string FilePattern, string SourcePath, string DestPath, bool Recursive = true)
{
string[] FilePathList = Directory.GetFiles(SourcePath, FilePattern, Recursive? SearchOption.AllDirectories: SearchOption.TopDirectoryOnly);
string FileName;
string DestFilePath;
foreach (string SourceFilePath in FilePathList)
{
FileName = Path.GetFileName(SourceFilePath);
DestFilePath = Path.Combine(DestPath, FileName);
ExecuteFile(SourceFilePath, DestFilePath);
}
}
static public void ExecuteCommandLine(string[] Args)
{
void DisplayCommandLineHelp()
{
string Text = @"
-h, --help Flag. Displays this message. E.g. -h
-s, --source Source folder when the -p is present. Else source filename. E.g. -s C:\app\js or -s C:\app\js\main.js
-d, --dest Dest folder when the -p is present. Else dest filename. E.g. -d C:\app\js\out or -d C:\app\js\out\main.js
-p, --pattern The pattern to use when finding files. E.g. -p *.js
-r, --recursive Flag. Search in sub-folders too. E.g. -r
EXAMPLE
CommentStripper -s .\Source -d .\Dest -p *.js
";
Console.WriteLine(Text.Trim());
}
string Pattern = null;
string Source = null;
string Dest = null;
bool Recursive = false;
bool Help = false;
string Arg;
if (Args.Length > 0)
{
try
{
for (int i = 0; i < Args.Length; i++)
{
Arg = Args[i].ToLower();
switch (Arg)
{
case "-s":
case "--source":
Source = Args[i + 1].Trim();
break;
case "-d":
case "--dest":
Dest = Args[i + 1].Trim();
break;
case "-p":
case "--pattern":
Pattern = Args[i + 1].Trim();
break;
case "-r":
case "--recursive":
Recursive = true;
break;
case "-h":
case "--help":
Help = true;
break;
}
}
if (Help)
{
DisplayCommandLineHelp();
}
else
{
if (!string.IsNullOrWhiteSpace(Pattern))
{
ExecuteFolder(Pattern, Source, Dest, Recursive);
}
else
{
ExecuteFile(Source, Dest);
}
}
// Console.ReadLine();
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
Console.WriteLine();
DisplayCommandLineHelp();
}
}
}
}
Good luck.