Remove whitespace from a target CSV under a specific column

Time: 2013-06-09 16:23:53

Tags: csv vbscript wsh

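I want to remove whitespace from a CSV file, ideally from a specific column, leaving only single spaces and no other unwanted whitespace. I have the following script that does this for a string, but I need help adapting it so that it checks the target CSV under a specific column and removes the whitespace there.

Here is the script: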

'Start by trimming leading/trailing spaces
str = Trim(str)

'Now, while we have 2 consecutive spaces, replace them
'with a single space...
Do While InStr(1, str, "  ")
str = Replace(str, "  ", " ")
Loop

Ideally, I would like to call the script like this:

Cscript whitespaceremover.vbs target.csv 'column_name'

3 Answers:

Answer 0: (score: 3)

I think the example below could be refined, but I hope it is enough to get you started.

My demo CSV file "target.csv":

column_name1,column_name2,column_name3
abc 123, dfr 1145  wse, ht6
axv 358, dgt 2245  ekl, x7r
amn 772, fxw 7633  foo, pmn

The example "whitespaceremover.vbs":

Const ForReading = 1, ForWriting = 2
Dim fso, file, column
Set fso = CreateObject("Scripting.FileSystemObject")
With WScript.Arguments
    If .Count <> 2 Then
        WScript.Echo "Error: Needs two arguments."
        WScript.Quit(-1)
    End If
    file   = .Item(0)
    column = .Item(1)
End With
If Not fso.FileExists(file) Then
    WScript.Echo "Error: File " & UCase(file) & " not found."
    WScript.Quit(-2)
End If
Dim csvFile, csvHeader, iColumn, idx
Set csvFile = fso.OpenTextFile(file, ForReading)
If Not csvFile.AtEndOfStream Then
    csvHeader = Split(csvFile.ReadLine, ",", -1, 1)
Else
    WScript.Echo "Error: File " & UCase(file) & " is empty."
    csvFile.Close
    WScript.Quit(-3)
End If
iColumn = -1
For idx = 0 To UBound(csvHeader)
    If csvHeader(idx) = column Then
        iColumn = idx
        Exit For
    End If
Next
If iColumn < 0 Then
    WScript.Echo "Error: column " & UCase(column) & " not found."
    csvFile.Close
    WScript.Quit(-4)
End If
Dim csvFile2, arLine, strLine
Set csvFile2 = fso.OpenTextFile(file & ".csv", ForWriting, True)
csvFile2.WriteLine Join(csvHeader, ",")
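Do Until csvFile.AtEndOfStream
    strLine = Trim(csvFile.ReadLine)
    arLine  = Split(strLine, ",", -1, 1)
    ' Skip blank or short rows that have no field at the target index
    If UBound(arLine) >= iColumn Then
        ' Collapse runs of spaces in the target column to a single space
        Do While InStr(1, arLine(iColumn), "  ")
            arLine(iColumn) = Replace(arLine(iColumn), "  ", " ")
        Loop
    End If
    strLine = Join(arLine, ",")
    csvFile2.WriteLine strLine
Loop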
csvFile.Close
csvFile2.Close
Set csvFile  = Nothing
Set csvFile2 = Nothing
Set fso = Nothing

Result (new file "target.csv.csv"):

column_name1,column_name2,column_name3
abc 123, dfr 1145 wse, ht6
axv 358, dgt 2245 ekl, x7r
amn 772, fxw 7633 foo, pmn

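P.S. A few words I forgot to post. To simplify testing, I put double spaces in the second column. In short, to see the script in action, use column_name2 as the command-line argument, i.e.: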

Cscript whitespaceremover.vbs target.csv column_name2

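Edit

After reading Ansgar Wiechers's comment about the Replace function, I finally decided to run some tests. The code above may be slow compared to a regular expression, but it works correctly. Here is my example as proof: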

str1 = "1" & Space(2) & "2" & Space(4) & "3" _
    & Space(1) & "4" & Space(6) & "5"
WScript.Echo "Original string: ", str1
Do While InStr(1, str1, "  ")
    str1 = Replace(str1, "  ", " ")
Loop
WScript.Echo "New string: ", str1
'Result>>
'Original string:  1  2    3 4      5
'New string:  1 2 3 4 5
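
To make it easier to see why the loop gets there even though one Replace call is not enough, here is a small sketch that prints the string after every pass. The variable names and the underscore display are my own choices for illustration, not part of the script above.

```vbscript
Option Explicit

Dim s, passNo
s = "foo" & Space(3) & "bar" & Space(6) & "baz"
passNo = 0

' Repeat until no pair of consecutive spaces is left
Do While InStr(1, s, Space(2)) > 0
    s = Replace(s, Space(2), " ")
    passNo = passNo + 1
    ' Show spaces as underscores so each run is easy to count
    WScript.Echo "Pass " & passNo & ": " & Replace(s, " ", "_")
Loop

' Pass 1: foo__bar___baz
' Pass 2: foo_bar__baz
' Pass 3: foo_bar_baz
```

Each pass roughly halves every run of spaces, so the number of passes grows only with the logarithm of the longest run.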

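Answer 1: (score: 1)

As an addendum to the answer provided by Panayot Karabakalov: simply replacing two spaces with one in a single Replace call may produce undesired results when there are sequences of three or more consecutive spaces. A line like this: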

foo   bar      baz

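would be turned into this:

foo  bar   baz

rather than this:

foo bar baz

The reason is that Replace resumes scanning after the text it has just substituted. For example, running Replace("aaaa", "aa", "a") first replaces the first two a characters:

aaaa -> aaa

then replaces the next two a characters after the replacement string:

aaa -> aa

and then terminates.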

A more robust way to collapse spaces (or character sequences in general) is to use a regular-expression replace:

Set re = New RegExp
re.Pattern = " +"  '<-- means "a sequence of one or more spaces"
re.Global = True

text = "foo   bar      baz"

WScript.Echo re.Replace(text, " ")

Output:

foo bar baz
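
The same regular expression could be applied to a single CSV field rather than a whole line. The following is only a minimal sketch: CollapseSpaces is a hypothetical helper name, the pattern " {2,}" is a variation that matches only runs of two or more spaces, and the naive Split on "," assumes no field contains a quoted comma.

```vbscript
Option Explicit

' Hypothetical helper (not part of the answer above): collapses runs of
' two or more spaces into one and trims both ends of a single field.
Function CollapseSpaces(text)
    Dim re
    Set re = New RegExp
    re.Pattern = " {2,}"   ' two or more consecutive spaces
    re.Global = True       ' replace every run, not only the first
    CollapseSpaces = Trim(re.Replace(text, " "))
End Function

Dim fields
fields = Split("abc 123, dfr 1145  wse, ht6", ",")
fields(1) = CollapseSpaces(fields(1))   ' clean only the second column
WScript.Echo Join(fields, ",")
' prints: abc 123,dfr 1145 wse, ht6
```

Note that this sketch also trims the leading space of the field, which the loop-based script in the other answer leaves in place; drop the Trim call if that padding should be kept.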

Answer 2: (score: 0)

It does not meet all of the requirements, but maybe it will help someone.

' USAGE: CScript WhiteSpaceRemover.vbs Target_File.csv Column_Number

Set oArgs = WScript.Arguments
If oArgs.Count = 2 Then
    strInputFileName = oArgs(0)
    intColumn = oArgs(1) - 1
    strOutputFileName = PrepareOutputPath(strInputFileName, "_new")

    WriteTextFile strOutputFileName, TrimCsv(ReadTextFile(strInputFileName), intColumn)

End If
Set oArgs = Nothing

Function TrimCsv (strFileContent, intColumn)
    ' removes unnecessary spaces in the fields of a CSV table
    strFileContent = Replace(strFileContent, vbCrLf, vbLf)
    arrFileContent = Split(strFileContent, vbLf)
    strFileContent = ""
    For Each strLine in arrFileContent
        If Not Len(strLine) = 0 Then
            arrRecord = Split(strLine, ";")

'           for specified column number
            arrRecord(intColumn) = Trim(arrRecord(intColumn))

'           for all columns
'           For iCount = LBound(arrRecord) To UBound(arrRecord)
'               arrRecord(iCount) = Trim(arrRecord(iCount))
'           Next

            AddToList strFileContent, Join(arrRecord, ";"), vbCrLf
            Erase arrRecord
        End If
    Next
    TrimCsv = strFileContent
    Erase arrFileContent
End Function


Function PrepareOutputPath(strFileName, strSuffix)
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    With objFSO
        strPath = .GetParentFolderName(strFileName)
        strName = .GetBaseName(strFileName)
        strExt = .GetExtensionName(strFileName)
    End With
    PrepareOutputPath = AddToList(strPath, strName & strSuffix, "\")
    PrepareOutputPath = AddToList(PrepareOutputPath, strExt, ".")
    Set objFSO = Nothing
End Function


Function AddToList(strList, strValue, strDelim)
    ' add delimiter between values
    If strList = "" Then
        AddToList = strValue
    Else
        AddToList = strList & strDelim & strValue
    End If
    strList = AddToList
End Function 


Function ReadTextFile(strFileName)
    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"

    objStream.Open
    objStream.LoadFromFile(strFileName)
    ReadTextFile = objStream.ReadText()
    objStream.Close

    Set objStream = Nothing
End Function


Sub WriteTextFile (strFileName, strFileContent)
    adSaveCreateNotExist = 1
    adSaveCreateOverWrite = 2
    adWriteChar = 0
    adWriteLine = 1

    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"

    objStream.Open
    objStream.WriteText strFileContent, adWriteChar
    objStream.SaveToFile strFileName, adSaveCreateOverwrite
    objStream.Close

    Set objStream = Nothing
End Sub

Best regards,

Pawel L.
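
Since the script above takes a column number while the question asks for a column name, a lookup against the header line could bridge the two. This is only a sketch under assumptions: FindColumnNumber is a hypothetical helper, the first line of the file is assumed to be the header, and the delimiter is passed in explicitly.

```vbscript
Option Explicit

' Hypothetical helper: returns the 1-based position of columnName in a
' delimited header line, or 0 when no header field matches.
Function FindColumnNumber(headerLine, columnName, delim)
    Dim names, i
    FindColumnNumber = 0
    names = Split(headerLine, delim)
    For i = 0 To UBound(names)
        ' Compare case-insensitively and ignore padding around the name
        If StrComp(Trim(names(i)), Trim(columnName), vbTextCompare) = 0 Then
            FindColumnNumber = i + 1
            Exit Function
        End If
    Next
End Function

WScript.Echo FindColumnNumber("column_name1;column_name2;column_name3", "column_name2", ";")
' prints 2
```

The returned number could then be passed where the script expects Column_Number; a result of 0 should be treated as "column not found" before any row is processed.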