List the column headers and get the maximum string length of each column

Time: 2017-05-25 19:46:49

Tags: excel-vba powershell vbscript vba excel

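I'm looking for a script in PowerShell, VBScript, or Excel VBA that does what my Excel formula does. I'm trying to get a list of the column headers along with the maximum length of the strings under each header.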

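Normally I open the .txt file manually in Excel, where I can read off the header names. Next, I create an array formula, for example =MAX(LEN(A1:A100000)). This returns the maximum length of the strings in the column. I then use the same formula for the other columns.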

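I can no longer do this because the files have grown to 1 GB: I can't open them, and my desktop crashes. Maybe that's because they have more than 1 million rows, which Excel cannot handle. A friend suggested PowerShell, but my knowledge of it is limited, and I don't know whether this can be done in VBScript or Excel VBA.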

Thanks in advance for your help.

The code below works for .csv files, but not for delimited .txt files:

$fileName = "C:\Desktop\EFile.csv"
<#
Sample format of c:\temp\data.csv
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
#>
$colCount = (Import-Csv  $fileName | Get-Member | Where-Object {$_.MemberType -eq 'NoteProperty'} | Measure-Object).Count
$csv = Import-Csv $fileName 
$csvHeaders = ($csv | Get-Member -MemberType NoteProperty).name

$dict = @{}
foreach($header in $csvHeaders) {
    $dict.Add($header,0)
    }

foreach($row in $csv)
{
    foreach($header in $csvHeaders) 
    {
        if($dict[$header] -le ($row.$header).Length) 
        {
            $dict[$header] =($row.$header).Length
        }
    }
}
$dict.Keys | % { "key = $_ , Column Length = " + $dict.Item($_) }

1 answer:

Answer 0 (score: 0)

This is how I set up the data:

$data = @"
"id","name","grade","address"
"1","John","Grade-9","test1"
"2","Ben","Grade-9","test12222"
"3","Cathy","Grade-9","test134343"
"@
$csv = ConvertFrom-Csv -Delimiter ',' $data

But you should load your data like this:

$fileName = "C:\Desktop\EFile.csv"
$csv = Import-Csv -Path $fileName
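
The question mentions delimited .txt files. Import-Csv accepts a -Delimiter parameter, so the same loading step can be sketched for such a file. The tab delimiter, the sample file, and the $tabData variable below are assumptions for illustration, since the question does not say which delimiter the files use.

```powershell
# Hypothetical sample file; the tab delimiter is an assumption.
$txtFile = Join-Path ([System.IO.Path]::GetTempPath()) 'EFile.txt'
$lines = @("id`tname`tgrade", "1`tJohn`tGrade-9", "2`tCathy`tGrade-9")
Set-Content -Path $txtFile -Value $lines

# -Delimiter tells Import-Csv which single character separates the fields.
$tabData = Import-Csv -Path $txtFile -Delimiter "`t"

$tabData | Format-Table
```

Assigning the result to $csv instead of $tabData lets the remaining steps run unchanged.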

Then:

# Extract the header names
$headers = $csv | Get-Member -MemberType NoteProperty | Select-Object -ExpandProperty Name

# Capture output in $result variable
$result = foreach($header in $headers) {

    # Walk every value in the $header column and keep the longest one.
    # (Measure-Object -Maximum would compare the strings alphabetically, not by length.)
    $maximum = ''
    foreach($value in $csv.$header) {
        if($value.Length -gt $maximum.Length) { $maximum = $value }
    }

    # Generate new object holding the information. 
    # This will end up in $result
    [pscustomobject]@{
        Header = $header
        Max = $maximum.Length
        String = $maximum
    }
}


# Simple output
$result | Format-Table

This is what I get:

Header  Max String    
------  --- ------    
address  10 test134343
grade     7 Grade-9   
id        1 1         
name      5 Cathy     
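
If only the maximum length is needed and not the string itself, Measure-Object can be pointed at the Length property of the strings. This is a small sketch of that idea, not part of the answer above; the $names and $maxLength variables are made up for illustration.

```powershell
# Sample values standing in for one column of data.
$names = 'John', 'Ben', 'Cathy'

# -Property Length makes Measure-Object compare the string lengths
# rather than the strings themselves.
$maxLength = ($names | Measure-Object -Property Length -Maximum).Maximum

$maxLength
```

For this sample the result is 5, the length of "Cathy", which mirrors what =MAX(LEN(...)) does in Excel.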

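Alternatively, if you run into memory problems with large files, you may have to get your hands a little dirtier with the .NET Framework. This snippet processes the CSV one line at a time instead of reading the whole file into memory.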

$fileName = "$env:TEMP\test.csv"
$delimiter = ','

# Open a StreamReader
$reader = [System.IO.File]::OpenText($fileName)

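# Read the header line, split it on the delimiter, and trim away any quotes.
# The delimiter is escaped because -split treats its argument as a regular expression.
$headers = $reader.ReadLine() -split [regex]::Escape($delimiter) | % { $_.Trim('"''') }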

# Prepare a hashtable for the results
$result = @{}

# So long as there's more data, keep running
while(-not $reader.EndOfStream) {

    # Read a single line and process it as csv
    $csv = $reader.ReadLine() | ConvertFrom-Csv -Header $headers -Delimiter $delimiter

    # Determine if the item in the result hashtable is smaller than the current, using the header as a key
    foreach($header in $headers) {
        $item = $csv | Select-Object -ExpandProperty $header

        if($result[$header].Maximum -lt $item.Length) {
            $result[$header] = [pscustomobject]@{
                Header = $header
                Maximum = $item.Length
                String = $item
            }
        }
    }
}

# Clean up our spent resource
$reader.Close()

# Simple output
$result.Values | Format-Table
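
Calling ConvertFrom-Csv once per line may be slow on a file of around 1 GB. If the data never contains the delimiter or a line break inside a quoted field (an assumption that has to be checked against the real files), a plain string split that tracks only the lengths might be enough. The sketch below is a variation on the snippet above, not the original answer's code; Get-MaxColumnLength and the sample file are hypothetical names.

```powershell
function Get-MaxColumnLength {
    param(
        # Full path to the file; .NET does not follow the PowerShell location.
        [Parameter(Mandatory)] [string] $Path,
        [char] $Delimiter = ','
    )

    $quotes = [char[]] @('"', "'")
    $reader = [System.IO.File]::OpenText($Path)
    try {
        # The first line holds the headers.
        $headers = @($reader.ReadLine().Split($Delimiter) | ForEach-Object { $_.Trim($quotes) })
        $max = New-Object int[] $headers.Count

        while (-not $reader.EndOfStream) {
            # Assumption: no delimiter inside quoted fields, so Split is safe.
            $fields = $reader.ReadLine().Split($Delimiter)
            $count = [Math]::Min($fields.Length, $headers.Count)
            for ($i = 0; $i -lt $count; $i++) {
                $length = $fields[$i].Trim($quotes).Length
                if ($length -gt $max[$i]) { $max[$i] = $length }
            }
        }
    }
    finally {
        # Release the file handle even if a line fails to parse.
        $reader.Dispose()
    }

    # Emit one object per column, in file order.
    for ($i = 0; $i -lt $headers.Count; $i++) {
        [pscustomobject]@{ Header = $headers[$i]; MaxLength = $max[$i] }
    }
}

# Hypothetical sample file for demonstration.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'maxlen-sample.csv'
Set-Content -Path $sample -Value '"id","name"', '"1","John"', '"2","Cathy"'
Get-MaxColumnLength -Path $sample | Format-Table
```

For a tab-delimited file the call would pass -Delimiter "`t". Only one integer per column is kept in memory, so the file size should not matter much beyond the time needed to read it.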