Converting a text file to a CSV file

Posted: 2016-08-18 10:06:59

Tags: csv powershell export-to-csv

I wrote some PowerShell code that reads a very large .txt file, selects certain lines, and puts them into a CSV. The problem is that the file is formatted like this:

header1: Data1
header2: Data1
header3: Data1
header4: Data1
header1: Data2
header2: Data2
header3: Data2
header4: Data2

I need to convert it to:

Header1,Header2,Header3,Header4
data1,data1,data1,data1
data2,data2,data2,data2

This is not what I end up with. The code looks like this:

   $path = get-location
    $textfile = Get-FileName $env:USERPROFILE\Downloads\


    $writefile = "$path\data2.csv"
    $reader = [System.IO.File]::OpenText($textfile)
    $writer = New-Object System.IO.StreamWriter $writefile
    $writer.WriteLine('{0},{1},{2},{3}', "Policy","Schedule Type","Retention Level","Host")

        for(;;) {

                $line = $reader.ReadLine() #
                if ($null -eq $line) {
                break
                }

                $data = $line.Split(":")

                if ($null -ne $data[0]) {
                $newdata0 = $data[0].trimstart(" ")
                }
                if ($null -ne $data[1]) {
                $newdata1 = $data[1].trimstart(" ")
                }

                if ($newdata0 -eq "Policy")  {$writer.WriteLine('{0},{1},{2},{3}', $newdata1,$null,$null,$null)}

                if ($newdata0 -eq "Schedule Type") {$writer.WriteLine('{0},{1},{2},{3}', $null,$newdata1,$null,$null)}

                if ($newdata0 -eq "Retention Level") {$writer.WriteLine('{0},{1},{2},{3}', $null,$null,$newdata1,$null)}

                if ($newdata0 -eq "Host") {$writer.WriteLine('{0},{1},{2},{3}', $null,$null,$null,$newdata1)}    

            }



    $reader.Close()
    $writer.Close()

Is my code wrong, or do I just need to find a way to reformat the CSV?

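2 Answers:

Answer 0 (score: 0)

I would be inclined to have something look for a repeated header and treat that as the record separator (in place of the line ending).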

$header = New-Object System.Collections.Generic.List[String]
Get-Content test.txt | Where-Object { $_ -match '(?<Header>[^:]+): *(?<Value>.+)$' } | ForEach-Object {
    if ($header.Contains($matches.Header)) {
        # End of record start again.
        $header.Clear()
        # Output
        $psObject
    }
    if ($header.Count -eq 0) {
        # Start of the record. Create an object to hold it.
        $psObject = New-Object PSObject
    }

    # Add the current header and value to the object.
    $psObject | Add-Member $matches.Header $matches.Value
    # Add the header name to the record controller
    $header.Add($matches.Header)    
}
# Output the last entry from the file (no end detection)
$psObject
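
The objects emitted above still have to reach Export-Csv before a file is written. The following is a minimal sketch of the same repeated-header idea packaged so that it can be piped into Export-Csv. The function name Convert-KeyValueToObject is hypothetical and introduced only for illustration; test.txt and data2.csv are the file names already used in this thread.

```powershell
# Hypothetical helper: groups "Header: Value" lines into one object per record.
# A record is considered finished when a header name repeats.
function Convert-KeyValueToObject {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$Line
    )
    begin {
        # Ordered, so the columns keep the order in which headers first appear.
        $record = [ordered]@{}
    }
    process {
        # Header is everything before the first colon; surrounding spaces are dropped.
        if ($Line -match '^\s*(?<Header>[^:]+?)\s*:\s*(?<Value>.*?)\s*$') {
            $name = $Matches.Header
            if ($record.Contains($name)) {
                # Repeated header: emit the finished record and start a new one.
                [pscustomobject]$record
                $record = [ordered]@{}
            }
            $record[$name] = $Matches.Value
        }
    }
    end {
        # Emit the last record; nothing follows it to trigger the flush.
        if ($record.Count -gt 0) { [pscustomobject]$record }
    }
}

# Usage:
# Get-Content test.txt | Convert-KeyValueToObject | Export-Csv data2.csv -NoTypeInformation
```

Export-Csv takes its columns from the first object it receives, so this assumes every record carries the same set of headers.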

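Answer 1 (score: 0)

Your problem is that every call to $writer.WriteLine advances to a new line in the target file. You need to collect the values as the loop runs and write only once per record. Something like this might work: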

  $loopCounter = 0

  for(;;) {

            $line = $reader.ReadLine() #
            if ($null -eq $line) {
            break
            }

            $data = $line.Split(":")

            if ($null -ne $data[0]) {
            $newdata0 = $data[0].trimstart(" ")
            }
            if ($null -ne $data[1]) {
            $newdata1 = $data[1].trimstart(" ")
            }

            # Remember the value of each field until the whole record has been read.
            if ($newdata0 -eq "Policy")  {$data1=$newdata1}

            if ($newdata0 -eq "Schedule Type") {$data2=$newdata1}

            if ($newdata0 -eq "Retention Level") {$data3=$newdata1}

            if ($newdata0 -eq "Host") {$data4=$newdata1}    


            if (($loopCounter % 4) -eq 3) {$writer.WriteLine('{0},{1},{2},{3}', $data1, $data2, $data3, $data4)}        

            $loopCounter++

        }
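
The line counter only stays in step if every line in the file is one of the four fields. Since the question mentions selecting certain lines from a larger file, writing the row when the last field arrives may be more robust. This is a sketch under two assumptions: Host is always the last field of a record, and values contain no commas or quotes (those would need CSV quoting). The function name Convert-PolicyLinesToCsvRows is hypothetical.

```powershell
# Hypothetical helper: turns "Name: Value" lines into CSV rows, one per record.
# Assumes "Host" is always the last field of a record.
function Convert-PolicyLinesToCsvRows {
    param([string[]]$Lines)

    $fields = 'Policy', 'Schedule Type', 'Retention Level', 'Host'
    $row = @{}
    foreach ($line in $Lines) {
        # Split on the first colon only, so values containing ':' stay intact.
        $parts = $line.Split([char[]]':', 2)
        if ($parts.Count -lt 2) { continue }

        $name = $parts[0].Trim()
        if ($fields -notcontains $name) { continue }   # skip unrelated lines

        $row[$name] = $parts[1].Trim()

        if ($name -eq 'Host') {
            # Last field seen: emit one complete row, then reset for the next record.
            $values = foreach ($f in $fields) { [string]$row[$f] }
            $values -join ','
            $row = @{}
        }
    }
}
```

In the question's own loop the same idea amounts to calling $writer.WriteLine once, inside the Host branch, with the four collected values.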