Extracting strings from a text file

Time: 2014-12-18 12:04:35

Tags: batch-file cmd findstr

I want to extract a set of strings from a medium-sized text file (say test.txt) using a batch file on Windows. When I run this:

findstr  "ssh-rsa" test.txt >test.txt

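I get the same test.txt as output. How can I solve this?

Edit 1: Here is the link to the file on GitHub. I need line 522, or more specifically 75:d9:e3:5b:c8:17:ef:72:92:78:e5:8e:0c:82:7e:e1

Search text

Edit 2: From an external (volatile) source: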

ci-info: ssh-rsa c1:e8:5c:66:c2:b0:6d:68:a7:94:fd:05:4a:26:79:b2 - ec2_key_us-west-1

ec2:
ec2: #############################################################
ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
ec2: 1024 7a:e8:09:5e:0b:f4:cc:d5:75:38:60:bf:29:11:81:04 root@ip-10-0-0-55 (DSA)
ec2: 256 e6:25:23:8f:75:b4:c9:50:99:71:b7:11:4f:c6:40:52 root@ip-10-0-0-55 (ECDSA)
ec2: 2048 f2:df:0b:0d:f2:62:ab:c0:65:cf:65:04:1f:7d:9b:8a (RSA1)
ec2: 2048 75:d9:e3:5b:c8:17:ef:72:92:78:e5:8e:0c:82:7e:e1 root@ip-10-0-0-55 (RSA)
ec2: -----END SSH HOST KEY FINGERPRINTS-----
ec2: #############################################################
-----BEGIN SSH HOST KEY KEYS----- 

Line 522 is the second line starting with "ec2: 2048"; the posted excerpt covers lines 514..525.

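3 answers:

Answer 0 (score: 2)

Inserting new lines is not an option, because you would first need to search to make sure you don't break apart the string you are looking for: a catch-22.

As long as the file is smaller than 2 gigabytes, you can use my JREPL.BAT utility, a hybrid JScript/batch script that performs regular expression search and replace on text files. JREPL.BAT is pure script that runs natively on any Windows machine from XP onward.

You haven't stated your exact search term, so I'll just search for ssh-rsa xx:nn:nn:nn. I use the /JMATCH option to put each match on a new line and discard everything that doesn't match.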

jrepl "ssh-rsa [a-z][a-z]:\d\d:\d\d:\d\d" "$0" /jmatch /f test.txt /o result.txt

If you want to overwrite the original file, use:

jrepl "ssh-rsa [a-z][a-z]:\d\d:\d\d:\d\d" "$0" /jmatch /f test.txt /o -
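
As a point of comparison, a plain findstr sketch can also replace the original file, assuming whole matching lines are enough rather than individual matches. cmd opens and truncates a redirection target before the command starts, so redirecting straight back into test.txt cannot work; the sketch writes to a temporary file first. The name test.tmp is a placeholder, not something from the question.

```batch
@echo off
rem Write the matching lines to a temporary file first. Redirecting straight
rem back into test.txt would truncate the input before findstr reads it.
findstr /c:"ssh-rsa" "test.txt" > "test.tmp"
rem findstr sets errorlevel 0 when at least one line matched and 1 when none did.
if not errorlevel 1 (
  move /y "test.tmp" "test.txt" >nul
) else (
  del "test.tmp"
)
```

Keeping the original when nothing matched is a design choice of this sketch; drop the errorlevel check if an empty result file is acceptable.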

If used within a batch script, remember to use call jrepl ..., because JREPL is itself a batch script.
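
A minimal sketch of that point, assuming jrepl.bat is on the PATH or in the current directory and reusing the first command from above:

```batch
@echo off
rem Without CALL, control would transfer to jrepl.bat and never return,
rem so the echo below would not run.
call jrepl "ssh-rsa [a-z][a-z]:\d\d:\d\d:\d\d" "$0" /jmatch /f test.txt /o result.txt
echo Matches written to result.txt
```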

Answer 1 (score: 0)

An intriguing question: how to parse a well-structured text file. Despite Magoo's cogent warning, here is my pure batch solution:

@ECHO OFF >NUL
SETLOCAL enableextensions enabledelayedexpansion
:: define a keyword
set "keyword=ssh-rsa"
:: create output file
type nul>testtest.txt
:: process input file line by line  
for /F "tokens=* delims=" %%F in ('findstr /I "%keyword%" test.txt') do ( 
  call :fooProc "%keyword%" "%%~F"
)
:: show result
type testtest.txt
ENDLOCAL
goto :eof

:: Procedure searches the line %~2 for the keyword %~1 and,
:: for each occurrence, writes the pair 'keyword value'
:: to the output file, where 'value' is the word following 'keyword'.
:: The structure of the supplied line is not assumed to be well defined:
:: 0..n irrelevant words may appear between adjacent
:: 'keyword value' pairs, before the first pair,
:: and/or after the last one.
:: Assumes the line contains no non-printable characters and no characters
:: with special meaning in batch processing (echo, call, if, ...),
:: such as !, % or the |, >, >>, < redirectors.
:fooProc
  SETLOCAL enableextensions enabledelayedexpansion
  set "foo=%~2"
:fooLoop
  for /F "tokens=1,2* delims= " %%G in ("!foo!") do ( 
    if /I "%%~G"=="%~1" (
      @echo %%G %%H>>testtest.txt
      set "foo= %%I"
    ) else (
      @rem echo         forgotten %%G
      set "foo=%%H %%I"
    )
  )
  if not "!foo!"==" " goto :fooLoop
  ENDLOCAL
goto :eof
:: end fooProc

Sample test.txt file used for testing (with or without a trailing CRLF):

ssh-rsa ab:23:45:01 abc ssh-rsa ab:23:45:02 qwe rtz ssh-rsa ab:23:45:03 sedr fdert ssh-rsa ab:23:45:04 ssh-rsa ab:23:45:05 end01 end02 end03
2ndline ssh-rsa cd:23:45:21 abc 222 SSH-RSA cd:23:45:22 qwe rtz ssh-rsa cd:23:45:23 sedr fdert SSH-RSA cd:23:45:24 ssh-rsa cd:23:45:25 end21 end22 end23
3rdline ssh-rsa ef:23:45:31 333 cde ssh-rsa ef:23:45:32 qwe rtz ssh-rsa ef:23:45:33 sedr fdert ssh-rsa ef:23:45:34 ssh-rsa ef:23:45:35

Answer 2 (score: 0)

@ECHO OFF
SETLOCAL
SET "found_ssh_rsa="
FOR /f "tokens=1-5delims= " %%a IN (q27546207.txt) DO (
 IF DEFINED found_ssh_rsa (
  IF "%%a"=="ec2:" IF "%%e"=="(RSA)" SET "data=%%c"
 ) ELSE (
  IF "%%b"=="ssh-rsa" SET "found_ssh_rsa=Y"
 )
)
ECHO data found was "%data%"
GOTO :EOF

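I used a file named q27546207.txt containing my test data.

You say you want 75:d9:e3:5b:c8:17:ef:72:92:78:e5:8e:0c:82:7e:e1 from line 522, but the signal you specify is ssh-rsa. For lack of specific information, I have assumed that you want the data from the line after ssh-rsa that starts with ec2: and ends with (RSA).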
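
Under the same assumption, the lookup can also be sketched with findstr doing the filtering. This is a variation, not the code above: it drops the check that an ssh-rsa line came first, and it relies on the literal (RSA) appearing only on the wanted line, as in the posted excerpt. The variable name fingerprint is made up for the illustration.

```batch
@echo off
setlocal
set "fingerprint="
rem findstr /c: does a literal search, so the RSA1 line is not matched.
rem With the default delimiters, token 1 is the prefix and token 3 the fingerprint.
for /f "tokens=1,3" %%a in ('findstr /c:"(RSA)" "q27546207.txt"') do (
  if "%%a"=="ec2:" set "fingerprint=%%b"
)
echo fingerprint found was "%fingerprint%"
```

If several lines match, the last one wins, which mirrors the behaviour of the FOR loop in the original script.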