Parsing URLs from a Windows batch file

Date: 2012-09-07 21:35:51

Tags: vbscript batch-file

I have a text file (myurls.txt) that contains the following list of URLs:

Slides_1:   http://linux.koolsolutions.com/svn/ProjectA/tags/REL-1.0
Exercise_1: http://linux.koolsolutions.com/svn/Linux/tags/REL-1.0

Slides_2:   http://linux.koolsolutions.com/svn/oldproject/ProjectB/tags/REL-2.0
Exercise_2: http://linux.koolsolutions.com/svn/ProjectB/tags/REL-1.0

Exercise_3: http://linux.koolsolutions.com/svn/BlueBook/ProjectA/tags/REL-1.0

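I want to parse this text file in a for loop so that on each iteration (for example, for the first URL in the file above) the following pieces of information end up in separate variables: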

%i% = REL-1.0
%j% = http://linux.koolsolutions.com/svn/ProjectA
%k% = http://linux.koolsolutions.com/svn/ProjectA/tags/REL-1.0

After some experimentation I came up with the following code, but it only works when every URL has the same number of slashes:

@echo off
set FILE=myurls.txt
FOR /F "tokens=2-9 delims=/ " %%i in (%FILE%) do (
@REM <do something with variables i, j and k.>
)

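Obviously I need to make this more flexible so that it can handle URLs of arbitrary length. I am open to other solutions, such as Windows Script Host / VBScript, as long as they run on a default Windows XP / 7 installation. In other words, I know I could use awk, grep, sed, python, etc. on Windows to get the job done, but I don't want users to have to install anything beyond a standard Windows installation.

3 answers:

Answer 0: (score: 4)

I think this may be what you are looking for, though I am not entirely sure what your rules are for identifying the project.

It uses the FOR ~pnx modifiers to parse parts of the path. Run HELP FOR from the command line for more information. It appends \..\.. to reach the grandparent "directory", and prepends \ to make the "path" absolute.

The result has both // and / converted to \, so variable search and replace is used to restore the proper slash delimiters, and a substring operation strips the leading slash. For more on search and replace and substring operations, run HELP SET from the command line.

Delayed expansion is used because the code needs to expand variables that are set within the same code block.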

@echo off
setlocal enableDelayedExpansion
set "file=myurls.txt"
for /f "tokens=1*" %%A in (%file%) do (
  for /f "delims=" %%C in ("\%%B\..\..") do (
    set "project=%%~pnxC"
    set "project=!project:~1!"
    set "project=!project:\=/!"
    set "project=!project:http:/=http://!"
    echo header  = %%A
    echo url     = %%B
    echo project = !project!
    echo release = %%~nxB
    echo(
  )
)

Here are the results for your sample data:

header  = Slides_1:
url     = http://linux.koolsolutions.com/svn/ProjectA/tags/REL-1.0
project = http://linux.koolsolutions.com/svn/ProjectA
release = REL-1.0

header  = Exercise_1:
url     = http://linux.koolsolutions.com/svn/ProjectA/tags/REL-1.0
project = http://linux.koolsolutions.com/svn/ProjectA
release = REL-1.0

header  = Slides_2:
url     = http://linux.koolsolutions.com/svn/oldproject/ProjectB/tags/REL-2.0
project = http://linux.koolsolutions.com/svn/oldproject/ProjectB
release = REL-2.0

header  = Exercise_2:
url     = http://linux.koolsolutions.com/svn/ProjectB/tags/REL-1.0
project = http://linux.koolsolutions.com/svn/ProjectB
release = REL-1.0

header  = Exercise_3:
url     = http://linux.koolsolutions.com/svn/BlueBook/ProjectA/tags/REL-1.0
project = http://linux.koolsolutions.com/svn/BlueBook/ProjectA
release = REL-1.0
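
The delayed-expansion point in this answer is easy to see in isolation. The following is only a minimal sketch, not part of the original answer; the variable name value and the single-pass loop are made up for illustration. It sets a variable inside a parenthesized block and prints it with both expansion styles.

```batch
@echo off
setlocal enableDelayedExpansion
set "value=before"
for %%N in (1) do (
  set "value=after"
  rem The percent form was substituted when the whole block was parsed
  echo percent = %value%
  rem The exclamation form is substituted when this line actually runs
  echo delayed = !value!
)
```

Under these assumptions the first echo should print "percent = before" and the second "delayed = after", which is why the answer reads project with exclamation marks inside its loop.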

Answer 1: (score: 1)

@echo off

:: First separate into label, URI scheme, and internet path
for /f "tokens=1-3 delims=:" %%x in (URLs.txt) do (
  echo.

  rem Comments inside a parenthesized block use REM, because "::" is
  rem parsed as a label and can break the block
  rem Take the label
  for /f %%a in ("%%x") do set LabelNam=%%a

  rem Assemble the release URL
  set ReleaseURL=http:%%z

  rem Delayed variable expansion is required just for 'z'
  setlocal enabledelayedexpansion

    rem Take the release URL path
    set z=%%z

    rem Extract the release
    for /f "tokens=2" %%b in ("!z:/tags/= !") do set Release=%%b

    rem Split the internet path at each '/' and call ':getURL'
    call :getURL %%y !z:/= !

    rem Output the information
    echo       Label = !LabelNam!
    echo     Release = !Release!
    echo         URL = !URL!
    echo Release URL = !ReleaseURL!
  rem End variable expansion
  endlocal
)
goto :eof


:getURL
  :: Get URL type
  set URL=%1:/
  :: shift all arguments one to the left
  shift

  :URLloop
    :: Assemble URL
    set URL=%URL%/%1
    shift
  :: If we haven't found 'tags' yet, loop
  if "%1" neq "tags" goto :URLloop

goto :eof
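
One thing to watch in the :getURL routine above: it keeps shifting until it sees a "tags" argument, so a URL without a /tags/ segment would never leave the loop. The following is only a sketch of one possible guard, not the answer author's code; the two hard-coded calls are illustrative test inputs, and the check is moved to the top of the loop so that running out of arguments also ends it.

```batch
@echo off
setlocal
call :getURL http linux.koolsolutions.com svn ProjectA tags REL-1.0
echo URL = %URL%
call :getURL http linux.koolsolutions.com svn ProjectA
echo URL = %URL%
goto :eof

:getURL
  rem The first argument is the scheme; the second slash is added by the loop
  set "URL=%1:/"
  shift
:URLloop
  rem Stop when the arguments run out, so a URL without "tags" cannot loop forever
  if "%~1"=="" goto :eof
  if /i "%~1"=="tags" goto :eof
  set "URL=%URL%/%1"
  shift
  goto :URLloop
```

With these inputs both calls should print URL = http://linux.koolsolutions.com/svn/ProjectA. Whether a URL without "tags" should be kept whole or rejected is a design choice the question does not settle.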

Answer 2: (score: 1)

OK, here is my shortest and easiest to understand, though least commented, solution:

@echo off
rem Delayed expansion is needed to read variables set inside the for block
setlocal enabledelayedexpansion
for /f "tokens=1-3 delims=: " %%x in (URLs.txt) do (
  set LabelNam=%%x
  set ReleaseURL=%%y:%%z
  for /f "tokens=1-22 delims=/" %%a in ("%%y:%%z") do call :getURL %%a %%b %%c %%d %%e %%f %%g %%h %%i %%j %%k %%l %%m %%n %%o %%p %%q %%r %%s %%t %%u %%v
  echo.
  echo       Label = !LabelNam!
  echo     Release = !Release!
  echo         URL = !URL!
  echo Release URL = !ReleaseURL!
)
goto :eof

:getURL
  set URL=%1/
  shift
  :URLloop
    set URL=%URL%/%1
    shift
  if "%1" neq "tags" goto :URLloop
  Set Release=%2
goto :eof