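We receive a flat text file every day. Some days, the file contains lines that need to be deleted before it can be processed. These lines can appear in different places, but they always start with the characters 6999 or 7999. We would like to run a script that deletes these particular lines. However, and this is beyond me, wherever there is a line that starts with 6999, there is a line immediately before it that starts with 5442, and that line also needs to be deleted, but only when it appears immediately before a 6999 line.
We are a Windows shop and would run this script as part of a simple batch file in Windows. We do not use Unix or Linux, nor do we wish to.
The file name extension reflects the date. Today's file is file.100621, and tomorrow's will be file.100622. I am having trouble with this aspect, because it seems VBScript does not like file.*
Here is a sample of the text file: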
4006006602 03334060000100580
40060066039 0334070000100580
700600000011571006210060001255863
544264287250111000025000000000040008000801
6999001000000000000000000000000000000000000000000000000000
6999001000000000000000000000000000000000000000000000000000
6999001000000000000000000000000000000000000000000000000000
799900000011571006210030000000000
8007000000115710062102530054008920
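We want to delete 5 lines in this file (the 5442 line, the three 6999 lines, and the 7999 line).
Here is a sample script that I found on this site and modified with some success, but I don't know how to delete lines (I only know how to replace data within a line). I realize this will need major modification or may have to be thrown out altogether, but I am posting it to give an idea of what I think we are looking for. I put it in a directory with cscript.exe and call it from a simple batch file: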
Set objFS = CreateObject("Scripting.FileSystemObject")
strFile = "c:\temp\file.100621"
Set objFile = objFS.OpenTextFile(strFile)
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    If InStr(strLine,"6999") > 0 Then
        strLine = Replace(strLine,"6999","delete line")
    End If
    WScript.Echo strLine
Loop
This gets me:
40060066039 0334070000100580
700600000011571006210060001255863
544264287250111000025000000000040008000801
delete line001000000000000000000000000000000000000000000000000000
delete line001000000000000000000000000000000000000000000000000000
delete line001000000000000000000000000000000000000000000000000000
799900000011571006210030000000000
8007000000115710062102530054008920
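Close! I just need to delete the lines instead of writing "delete line". So here are my specific needs based on what I know:
Answer 0 (score: 7)
I made some changes to try to eliminate the blank lines, and I also added a function that loops through the output file and removes any blank lines. Hopefully this one works.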
Select Case WScript.Arguments.Count
    Case 1:
        strInput = GetFile(WScript.Arguments(0))
        RemoveUnwantedLines strInput, strInput
        RemoveBlankLines strInput
    Case 2:
        strInput = GetFile(WScript.Arguments(0))
        strOutput = WScript.Arguments(1)
        RemoveUnwantedLines strInput, strOutput
        RemoveBlankLines strOutput
End Select

'Return the path of the most recently modified file in the directory.
Function GetFile(strDirectory)
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder(strDirectory)
    dateLastModified = Null
    strFile = ""
    For Each objFile in objFolder.Files
        If IsNull(dateLastModified) Then
            dateLastModified = objFile.DateLastModified
            strFile = objFile.Path
        ElseIf dateLastModified < objFile.DateLastModified Then
            dateLastModified = objFile.DateLastModified
            strFile = objFile.Path
        End If
    Next
    GetFile = strFile
End Function

Sub RemoveUnwantedLines(strInputFile, strOutputFile)
    'Open the file for reading.
    Set objFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(strInputFile,1)
    'Read the entire file into memory.
    strFileText = objFile.ReadAll
    'Close the file.
    objFile.Close
    'Split the file at the Line Feed character (Chr(10)).
    'Carriage returns are dropped first so that CRLF files leave no stray Chr(13) on each line.
    arrFileText = Split(Replace(strFileText, vbCr, ""), vbLf)
    'Open the file for writing.
    Set objFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(strOutputFile,2,true)
    'Loop through the array of lines looking for lines to keep.
    For i = LBound(arrFileText) to UBound(arrFileText)
        'If the line is not blank process it.
        If arrFileText(i) <> "" Then
            'If the line starts "5442", see if the next line is "6999".
            If Left(arrFileText(i),4) = "5442" Then
                'Make sure the next line exists (Don't want an out of bounds exception).
                If i + 1 <= UBound(arrFileText) Then
                    'If the next line is not "6999"
                    If Left(arrFileText(i + 1), 4) <> "6999" Then
                        'Write the "5442" line to the file.
                        objFile.WriteLine(arrFileText(i))
                    End If
                Else
                    'If the next line does not exist, write the "5442" line to the file.
                    objFile.WriteLine(arrFileText(i))
                End If
            'If the line does not start with "6999" and the line does not start with "7999".
            Elseif Left(arrFileText(i),4) <> "6999" AND Left(arrFileText(i),4) <> "7999" Then
                'Write the line to the file.
                objFile.WriteLine(arrFileText(i))
            End If
        End If
    Next
    'Close the file.
    objFile.Close
    Set objFile = Nothing
End Sub

Sub RemoveBlankLines(strInputFile)
    Set objFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(strInputFile,1)
    'Read the entire file into memory.
    strFileText = objFile.ReadAll
    'Close the file.
    objFile.Close
    'Split the file at the new line character.
    arrFileText = Split(strFileText,VbNewLine)
    Set objFile = CreateObject("Scripting.FileSystemObject").OpenTextFile(strInputFile,2,true)
    'Loop through the array of lines looking for lines to keep.
    For i = LBound(arrFileText) to UBound(arrFileText)
        'If the line is not blank.
        If arrFileText(i) <> "" Then
            'If there is another element.
            If i + 1 <= UBound(arrFileText) Then
                'If the next element is not blank.
                If arrFileText(i + 1) <> "" Then
                    'Write the line to the file.
                    objFile.WriteLine(arrFileText(i))
                Else
                    'Write the line to the file (Without a blank line).
                    objFile.Write(arrFileText(i))
                End If
            Else
                'Write the line to the file (Without a blank line).
                objFile.Write(arrFileText(i))
            End If
        End If
    Next
    'Close the file.
    objFile.Close
    Set objFile = Nothing
End Sub
To use it, call it from the command line in one of the following two ways.
RemoveUnwantedLines "C:\TestDirectory\" "C:\Output.txt"
or
RemoveUnwantedLines "C:\TestDirectory\"
Answer 1 (score: 1)
I think this will work (but I'm not very good with VBS, so no promises):
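Set objFS = CreateObject("Scripting.FileSystemObject")
strFile = "c:\temp\file.100621"
Set objFile = objFS.OpenTextFile(strFile)
Dim cachedLine
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    ' A held-back 5442 line is kept only if the line after it does NOT start with 6999.
    If Len(cachedLine) > 0 And InStr(strLine,"6999") <> 1 Then
        WScript.Echo cachedLine
    End If
    cachedLine = ""
    If InStr(strLine,"5442") = 1 Then
        ' Hold the 5442 line back until the next line has been read.
        cachedLine = strLine
    Else
        If InStr(strLine,"6999") = 1 Or InStr(strLine,"7999") = 1 Then
            ' do nothing
        Else
            WScript.Echo strLine
        End If
    End If
Loop
' A 5442 line at the very end of the file has no 6999 line after it, so keep it.
If Len(cachedLine) > 0 Then WScript.Echo cachedLine
objFile.Close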
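Note that I think you were checking whether the lines contain the numbers anywhere, but you said the rule is that they start with the numbers. That is why I test whether InStr returns 1 rather than whether it is > 0.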
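The script above hard-codes the file name. Since the question says the extension reflects the date, the name could probably be built from the current date instead of relying on a wildcard. Here is a minimal sketch, assuming the extension is the date in YYMMDD form (so file.100621 would be 21 June 2010). The helper names TwoDigits and DatedFileName are made up for illustration, and the c:\temp folder is just the one used above.

```vbscript
' Sketch only: assumes the extension is the date as YYMMDD.
' TwoDigits and DatedFileName are hypothetical helper names.

' Left-pad a number with a zero so that 6 becomes "06".
Function TwoDigits(n)
    TwoDigits = Right("0" & n, 2)
End Function

' Build "<base>.<YYMMDD>" for the given date.
Function DatedFileName(strBase, dtmDate)
    DatedFileName = strBase & "." & _
                    TwoDigits(Year(dtmDate) Mod 100) & _
                    TwoDigits(Month(dtmDate)) & _
                    TwoDigits(Day(dtmDate))
End Function

' Today's file, in the same folder the script above uses.
strFile = "c:\temp\" & DatedFileName("file", Date)
WScript.Echo strFile
```

If the file sometimes arrives a day late, passing Date - 1 instead of Date would give yesterday's name.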
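Answer 2 (score: 0)
This would be my pseudo-algorithm for solving the problem:
(I would rather teach you how to solve it than provide the code itself.)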
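1. Either take the file as an argument (so it stays flexible) or make a "spooler" folder that the program checks for new content when it runs, like an "inbox" for mail. You then also need an "outbox". That way you can process files without knowing their names and move them to the "outbox" once they are processed.
2. Make a simple "config" file for this program. Each line represents a "filter", and you can also add an action to the line if you need to:
7999 delete
6999 delete
5442 delete
as in [pattern] [action]
3. Once the config has been read into an array of "keys", check the "inbox" for files. Process each file with the key set.
4. Process the file "XXXXXXXXX.log" (or whatever it is called). Load all the lines if there are not too many, or use readline to fetch one line at a time (depending on performance and memory usage).
5. For each line, take the first 4 characters of the string. Now we need a line to parse:
sLine = left(readline(input filestream), 4)
since we only need the first 4 characters to decide whether to keep the line.
6. If this "sLine" (string) is in our array of filters/patterns, we have a match. Perform the action we configured (in the current setup, delete = ignore the line).
6a. If the line is ignored, move on to the next line of the text file; go to #7.
6b. If there is no match in the pattern array, we have a line to keep. Write it to the OUTPUT stream.
7. If there are more lines, NEXT (go to #5).
8. Close the input and output files.
9. Delete or move the input file from the inbox (maybe keep a backup?).
10. If there are more files in the [inbox] directory, parse the next one; go to #4.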
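This is not pure VBScript so much as an algorithm idea that works in any language.
I hope you can see my idea; if not, just leave a comment and I will try to elaborate. I hope I have given you a good answer.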
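For what it's worth, the core of the steps above (read the config, then filter one file) might look roughly like this in VBScript. This is only a sketch under stated assumptions: the paths c:\temp\filters.cfg, c:\temp\inbox and c:\temp\outbox are made up, both folders are assumed to exist already, and each config line is assumed to be a pattern and an action separated by a single space. It handles one file only, and, like the algorithm above, it deletes every 5442 line, not just those directly before a 6999 line.

```vbscript
' Sketch of the config-driven filter. All paths are hypothetical.
Const ForReading = 1, ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")

' Read "[pattern] [action]" lines from the config file into a dictionary.
Set dictFilters = CreateObject("Scripting.Dictionary")
Set objConfig = objFSO.OpenTextFile("c:\temp\filters.cfg", ForReading)
Do Until objConfig.AtEndOfStream
    arrParts = Split(Trim(objConfig.ReadLine), " ")
    ' Skip blank or incomplete config lines.
    If UBound(arrParts) >= 1 Then
        dictFilters(arrParts(0)) = LCase(arrParts(1))
    End If
Loop
objConfig.Close

' Filter one input file from the inbox into the outbox.
Set objIn = objFSO.OpenTextFile("c:\temp\inbox\file.100621", ForReading)
Set objOut = objFSO.OpenTextFile("c:\temp\outbox\file.100621", ForWriting, True)
Do Until objIn.AtEndOfStream
    strLine = objIn.ReadLine
    ' Only the first 4 characters decide whether the line is kept.
    sLine = Left(strLine, 4)
    blnDelete = False
    If dictFilters.Exists(sLine) Then
        blnDelete = (dictFilters(sLine) = "delete")
    End If
    ' No matching "delete" filter: the line is kept.
    If Not blnDelete Then objOut.WriteLine strLine
Loop
objIn.Close
objOut.Close
```

Keeping the patterns in a Dictionary means adding a new prefix later is just one more line in the config file, with no change to the script.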
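Answer 3 (score: 0)
OK, here is the final script, as carefully assembled by Tester101 (it is the script shown in Answer 0). It removes the unwanted lines as outlined above. It also deals with the line feed at the end of each line (which I was unaware of).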