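Parsing a comma-separated file with PowerShell

Asked: 2010-09-17 06:49:49

Tags: .net regex powershell csv

I have a text file containing multiple lines, each of which is a comma-separated string. The format of each line is: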

<Name, Value, Bitness, OSType>

Bitness and OSType are optional.

For example, the file might look like this:

Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
....
....

I want to parse each line into 4 variables and perform some operations on them. This is the PowerShell script I am using:

Get-Content $inputFile | ForEach-Object {
    $Line = $_;

    $_var = "";
    $_val = "";
    $_bitness = "";
    $_ostype = "";

    $envVarArr = $Line.Split(",");
    For($i=0; $i -lt $envVarArr.Length; $i++) {
        Switch ($i) {
            0 {$_var = $envVarArr[$i].Trim();}
            1 {$_val = $envVarArr[$i].Trim();}
            2 {$_bitness = $envVarArr[$i].Trim();}
            3 {$_ostype = $envVarArr[$i].Trim();}
        }
    }
    # perform some operation using the 4 temporary variables
}

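However, I would like to know whether this can be done with a regular expression in PowerShell. Could you provide sample code? Note that the 3rd and 4th values in each line may be empty or missing.

4 Answers:

Answer 0: (score: 6)

You can use the -Header parameter of the Import-Csv cmdlet to specify an alternate column header row for the imported file: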

Import-Csv .\test.txt -Header Col1,Col2,Bitness,OSType
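
As a rough sketch of how the imported rows could then be processed one at a time: the temporary file path, the column names, and the defensive Trim() calls below are assumptions made for illustration, not part of the original answer.

```powershell
# Write a few sample lines from the question to a temporary file (illustrative path).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'test-parse.txt'
@'
Name1, Value1, X64, Windows7
Name4, Value3, , Windows7
Name4, Value3, X64
'@ | Set-Content -Path $path

$rows = Import-Csv -Path $path -Header Name, Value, Bitness, OSType | ForEach-Object {
    # A column missing from a short line comes back as $null, so each value is
    # converted to a string first and then trimmed defensively.
    [pscustomobject]@{
        Name    = "$($_.Name)".Trim()
        Value   = "$($_.Value)".Trim()
        Bitness = "$($_.Bitness)".Trim()
        OSType  = "$($_.OSType)".Trim()
    }
}

# Perform the per-row operation here; this sketch just displays the rows.
$rows | Format-Table -AutoSize

Remove-Item -Path $path
```

Each row is an object with four string properties, so the optional values can be tested with a simple check such as `if ($row.OSType) { ... }`.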

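Answer 1: (score: 3)

Wouldn't it be better to use Import-Csv, which does all of this for you (and more reliably)?

Answer 2: (score: 3)

As Tim suggests, you can use Import-Csv. The difference is that Import-Csv reads from a file, whereas ConvertFrom-Csv converts CSV text passed to it, such as the here-string piped in below.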

@"
Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
"@ | ConvertFrom-Csv -header var, val, bitness, ostype

# Result

var   val    bitness                                 ostype  
---   ---    -------                                 ------  
Name1 Value1 X64                                     Windows7
Name2 Value2 X86                                     XP      
Name3 Value3 X64                                     XP      
Name4 Value3                                         Windows7
Name4 Value3 X64 /*Note that no comma follows X64 */         
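
If a regular expression is still preferred, as the question asks, something along these lines might work. This is only a sketch: the pattern and the variable names are my own assumptions, it expects at most four fields, and it does not handle quoted fields or embedded commas, which is where the CSV cmdlets are more reliable.

```powershell
# Illustrative pattern: two required fields followed by up to two optional ones.
# The lazy [^,]*? groups plus the surrounding \s* strip whitespace around each field.
$pattern = '^\s*(?<Name>[^,]*?)\s*,\s*(?<Value>[^,]*?)\s*(?:,\s*(?<Bitness>[^,]*?)\s*(?:,\s*(?<OSType>[^,]*?)\s*)?)?$'

$lines = 'Name1, Value1, X64, Windows7',
         'Name4, Value3, , Windows7',
         'Name4, Value3, X64'

$parsed = foreach ($line in $lines) {
    if ($line -match $pattern) {
        # Groups that did not take part in the match are absent from $Matches,
        # so the optional values are interpolated into strings to get '' instead of $null.
        [pscustomobject]@{
            Name    = $Matches['Name']
            Value   = $Matches['Value']
            Bitness = "$($Matches['Bitness'])"
            OSType  = "$($Matches['OSType'])"
        }
    }
}

$parsed | Format-Table -AutoSize
```

Lines that do not fit the pattern (for example, ones with five or more fields) are silently skipped here; a real script would probably want to report them.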

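Answer 3: (score: 0)

Slower than molasses, but after spending 20 years cobbling together a dozen or more partial solutions, I decided to solve this problem definitively. Of course, various parser libraries are available these days.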


function SplitDelim($Line, $Delim=",", $Default=$Null, $Size=$Null) {

    # 4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # "4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # ,4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a

    $Field = ""
    $Fields = @()
    $Quotes = 0
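    # States: INF (in field), INQ (in quoted field), NOF (no field).
    # Start in NOF so that a line beginning with a quote opens a quoted field
    # instead of hitting the "quote inside an unquoted field" error below.
    $State = 'NOF'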
    $NextState = $Null

    for ($i=0; $i -lt $Line.length; $i++) {
        $Char = $Line.substring($i,1)

        if($State -eq 'NOF') {

            # NOF and Char is Quote
            # NextState becomes INQ
            if ($Char -eq '"') {
                $NextState = 'INQ'
            }

            # NOF and Char is Delim
            # NextState becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }

            # NOF and Char is not Delim, Quote or space
            # NextState becomes INF
            elseif ($Char -ne " ") {
                $NextState = 'INF'
            }

        } elseif ($State -eq 'INF') {

            # INF and Char is Quote
            # Error
            if ($Char -eq '"') {
                return $Null
            }

            # INF and Char is Delim
            # NextState Becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }

        } elseif ($State -eq 'INQ') {

            # INQ and Char is Delim and consecutive Quotes mod 2 is 0
            # NextState is NOF
            if ($Char -eq $Delim -and $Quotes % 2 -eq 0) {
                $NextState = 'NOF'
                $Char = $Null
            }
        }

        # Track consecutive quote for purposes of mod 2 logic
        if ($Char -eq '"') {
            $Quotes++
        } elseif ($NextState -eq 'INQ') {
            $Quotes = 0
        }

        # Normal duty
        if ($State -ne 'NOF' -or $NextState -ne 'NOF') {
            $Field += $Char
        }

        # Push to $Fields and clear
        # (IfBlank, and IfNull further down, are helper functions not defined in this answer)
        if ($NextState -eq 'NOF') {
            $Fields += (IfBlank $Field $Default)
            $Field = ''
        }

        if ($NextState) {
            $State = $NextState
            $NextState = $Null
        }
    }

    $Fields += (IfNull $Field $Default)

    while ($Size -and $Fields.count -lt $Size) {
        $Fields += $Default
    }

    return $Fields
}
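
The function calls two helpers, IfBlank and IfNull, that are not shown in the answer. A minimal guess at what they might look like follows; both definitions are hypothetical and inferred only from their names and from how they are called.

```powershell
# Hypothetical helpers: the answer calls them but does not include their definitions.

function IfNull($Value, $Default) {
    # Return $Default only when $Value is $null; empty strings pass through unchanged.
    if ($null -eq $Value) { return $Default }
    return $Value
}

function IfBlank($Value, $Default) {
    # Return $Default when $Value is $null, empty, or whitespace only.
    if ([string]::IsNullOrWhiteSpace($Value)) { return $Default }
    return $Value
}
```

With these in place, a call such as `SplitDelim 'Name4, Value3, , Windows7' -Default '' -Size 4` should yield four fields. Note that the function does not appear to trim whitespace around unquoted values, so the results may still need a `.Trim()`.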