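Transposing an inconsistent dataset from columns to rows (CSV or Excel)

Asked: 2016-01-21 14:23:41

Tags: excel vba excel-vba csv transpose

Hello, I have a CSV file with about 380k rows and three columns. Column A holds the keys, which repeat for every trim version; each trim version starts with the key Brand. Column B holds the value (the specification). Column C holds the trim ID, which is the same for every row of a given trim version.

My problem is that the data ranges are inconsistent: some trim versions have as many as 55 rows of data, others only 5-6.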

(A)KEY                              (B)VALUE            (C)TRIM ID
======                              ========            ==========
1. Brand                            Mitsubishi          20001
2. Model                            ASX                 20001
3. Trim                             ASX (facelift 2012) 20001
4. Engine                           1.8 DI-D (114 Hp)   20001
5. Doors                            5                   20001
6. Power                            114 hp              20001
7. Maximum speed                    189 km/h            20001
8. From 0 to 100 km/h               10.2 sec            20001
9. Fuel tank volume                 63 l                20001
10. Year into production            2012                20001
11. Seats                           5                   20001                       
1. Brand                            BMW                 20015
2. Model                            M4                  20015
3. Trim                             M4 (F83)            20015
4. Engine                           3.0 (431 Hp) DCT    20015
5. Power                            431 hp              20015
1. Brand                            AUDI                25003
2. Model                            A4                  25003
3. Trim                             1.9TDI AVANT SLINE  25003
4. Power                            131 hp              25003

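I want to transpose the data so that each trim gets exactly one row, with its values matched to the right columns. For example, every time Brand is found, a new row starts, and the remaining values are placed under the matching column names (Brand, Model ... Seats, and so on).

Like this: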

Brand   Model   Generation  Engine  Doors   Power   Maximum speed   Seats   Length
=====   =====    =========   =====  =====   =====   =============   =====   ======
AUDI    A4      2.0T SLINE  2.0T    5       210     220             4         4520
BMW     M3                  330                     280             4
HONDA   CIVIC               1.6i    4       160                     4

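I tried to solve this with worksheet functions, but I think I need a VBA script and I am not good at that. Please help me.

3 Answers:

Answer 0: (score: 0)

I think this can be done easily with a pivot table. Just import the CSV data into Excel and turn it into a pivot table.

Answer 1: (score: 0)

Well, I did it this way (using the data sample you provided):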

Sub createDataTable()
    Dim r
    Dim c
    Dim i
    Dim rng As Range
    Dim newSht As Worksheet
    Dim dataSht As Worksheet
    Dim j 'the counter for the rows of the table
    Dim colName As Range
    Dim theAddress

    Set dataSht = Sheets("Data")
    dataSht.Activate
    r = Range("A1").End(xlDown).Row 'take the last row of the data
    c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
    Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data

    Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
    Set newSht = ActiveSheet 'Store the new sheet in the variable
    newSht.Name = myTime 'Name the new sheet using the myTime function (see Edit #2)
    j = 1

    dataSht.Activate
    rng.Copy
    Range("H1").PasteSpecial xlPasteAll
    Application.CutCopyMode = False

    ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo

    Range("H1", Range("H1").End(xlDown)).Copy 'Copy the unique keys without relying on the current selection
    newSht.Activate
    Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
    Rows("1:1").Font.Bold = True

    dataSht.Activate 'Go back to the data sheet
    Range("H1", Range("H1").End(xlDown)).ClearContents 'Remove the temporary list of unique keys

    For Each i In rng 'Walk down the keys in column A
        If i.Value = "Brand" Then 'Each "Brand" key starts a new output row
            j = j + 1
        End If
        'Find the header that matches this key exactly (LookAt:=xlWhole avoids partial matches)
        Set colName = newSht.Range("A1:BZ1").Find(i.Value, LookIn:=xlValues, LookAt:=xlWhole)
        If Not colName Is Nothing Then
            'Write the value from column B under that header, in output row j
            newSht.Cells(j, colName.Column).Value = i.Offset(0, 1).Value
        End If
    Next i
End Sub

Edit #1

This version of the code also creates the header row:

Sub createDataTable()
    Dim r
    Dim c
    Dim i
    Dim rng As Range
    Dim newSht As Worksheet
    Dim dataSht As Worksheet
    Dim j 'the counter for the rows of the table
    Dim colName As Range
    Dim theAddress

    Set dataSht = Sheets("Data")
    dataSht.Activate
    'to create headers
    Rows("1:1").Insert Shift:=xlDown
    Range("A1").FormulaR1C1 = "Key"
    Range("B1").FormulaR1C1 = "Value"
    Range("C1").FormulaR1C1 = "Trim"
    Rows("1:1").Font.Bold = True

    r = Range("A1").End(xlDown).Row 'take the last row of the data
    c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
    Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data

    Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
    Set newSht = ActiveSheet 'Store the new sheet in the variable
    newSht.Name = myTime 'Name the new sheet using the myTime function (see Edit #2)
    j = 1

    dataSht.Activate
    rng.Copy
    Range("H1").PasteSpecial xlPasteAll
    Application.CutCopyMode = False

    ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo

    Range("H1", Range("H1").End(xlDown)).Copy 'Copy the unique keys without relying on the current selection
    newSht.Activate
    Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
    Rows("1:1").Font.Bold = True

    dataSht.Activate 'Go back to the data sheet
    Range("H1", Range("H1").End(xlDown)).ClearContents 'Remove the temporary list of unique keys

    For Each i In rng 'Walk down the keys in column A
        If i.Value = "Brand" Then 'Each "Brand" key starts a new output row
            j = j + 1
        End If
        'Find the header that matches this key exactly (LookAt:=xlWhole avoids partial matches)
        Set colName = newSht.Range("A1:BZ1").Find(i.Value, LookIn:=xlValues, LookAt:=xlWhole)
        If Not colName Is Nothing Then
            'Write the value from column B under that header, in output row j
            newSht.Cells(j, colName.Column).Value = i.Offset(0, 1).Value
        End If
    Next i
End Sub

As I can see in your screenshot, you put the code in a worksheet's code module, and that is what returns error 1004.


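This happens because, inside a worksheet's code module, unqualified Range and Cells calls refer to that worksheet rather than to the active sheet, so you cannot "manipulate" other sheets this way. If you need or want to do that, put the code in a standard module and call the procedure from there.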
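To illustrate the point about sheet references, here is a minimal sketch in which every Range and Cells call names its worksheet explicitly. The procedure name and the sheet name "Result" are assumptions made for this example; only "Data" comes from the code above.

```vba
Sub writeToOtherSheet()
    'Hypothetical example: the sheet name "Result" is an assumption
    Dim src As Worksheet
    Dim dst As Worksheet

    Set src = ThisWorkbook.Worksheets("Data")
    Set dst = ThisWorkbook.Worksheets("Result")

    'Every Range and Cells call names its sheet explicitly, so it does not
    'matter which sheet is active or which module the code lives in
    dst.Range(dst.Cells(1, 1), dst.Cells(1, 3)).Value = _
        src.Range(src.Cells(1, 1), src.Cells(1, 3)).Value
End Sub
```

Written this way, the code no longer depends on Activate, which also tends to make it faster on large sheets.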

In that case you need to add a new module. In the VBA editor, select the workbook, then choose

Insert >>> Module

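This adds a new module to your project. Put the procedure from Edit #2 in that module, then run it with F5.

Let me know if you need any improvements.

Edit #2

I'm glad to help... and the error you got is because I did not send you the custom function... sorry... here it is!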

Function myTime() As String
    'Return the current time as a 6-digit hhmmss string, used as a unique sheet name
    '("nn" is the minutes placeholder in VBA's Format function)
    myTime = Format(Now, "hhnnss")
End Function

Put this function in the same module as the rest of the code.
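With about 380k rows, looping cell by cell and calling Find for every key can be slow. The following is a minimal sketch of the same idea that reads the data into an array once and maps keys to columns with a Scripting.Dictionary. It is not the code from this answer: the procedure name is made up, and it assumes a "Data" sheet with the Key/Value/Trim headers in row 1, that every record starts with the key Brand, and that it runs on Windows (Scripting.Dictionary is not available on Mac).

```vba
Sub transposeWithArrays()
    'Sketch only: assumes sheet "Data" has headers in row 1, data from row 2,
    'and that every record starts with the key "Brand"
    Dim src As Worksheet, dst As Worksheet
    Dim data As Variant, out() As Variant
    Dim headers As Object 'Scripting.Dictionary (late bound): key -> output column
    Dim lastRow As Long, i As Long, recCount As Long, rowOut As Long
    Dim k As Variant

    Set src = ThisWorkbook.Worksheets("Data")
    lastRow = src.Cells(src.Rows.Count, 1).End(xlUp).Row
    data = src.Range("A2:B" & lastRow).Value 'read columns A and B in one go

    Set headers = CreateObject("Scripting.Dictionary")
    headers.CompareMode = vbTextCompare 'treat "Brand" and "brand" as the same key

    'First pass: collect the distinct keys and count the records
    For i = 1 To UBound(data, 1)
        k = Trim$(CStr(data(i, 1)))
        If Len(k) > 0 Then
            If Not headers.Exists(k) Then headers.Add k, headers.Count + 1
            If StrComp(k, "Brand", vbTextCompare) = 0 Then recCount = recCount + 1
        End If
    Next i
    If recCount = 0 Then Exit Sub

    'Output array: one header row plus one row per record
    ReDim out(1 To recCount + 1, 1 To headers.Count)
    For Each k In headers.Keys
        out(1, headers(k)) = k
    Next k

    'Second pass: each "Brand" starts a new output row; values go under their key
    rowOut = 1
    For i = 1 To UBound(data, 1)
        k = Trim$(CStr(data(i, 1)))
        If Len(k) > 0 Then
            If StrComp(k, "Brand", vbTextCompare) = 0 Then rowOut = rowOut + 1
            If rowOut > 1 Then out(rowOut, headers(k)) = data(i, 2)
        End If
    Next i

    'Write the whole result to a new sheet in a single assignment
    Set dst = ThisWorkbook.Worksheets.Add( _
        After:=ThisWorkbook.Worksheets(ThisWorkbook.Worksheets.Count))
    dst.Range("A1").Resize(recCount + 1, headers.Count).Value = out
    dst.Rows(1).Font.Bold = True
End Sub
```

Because the sheet is read once and written once, the run time depends mostly on the in-memory loops rather than on sheet access. Rows that appear before the first Brand key are skipped.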

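Answer 2: (score: 0)

I am answering my own question because I found a perfect and powerful solution called OpenRefine, a former Google project (Google Refine).

Since my dataset now has more than a million rows, this is the fastest and best solution (much better than Excel).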

http://openrefine.org/
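For readers who want to stay in VBA even though the data no longer fits on one worksheet, here is a rough sketch that streams the CSV file directly and writes a second CSV, without loading the rows into a sheet. It is an illustration, not part of this answer: the procedure name and both file paths are placeholders, and it assumes each line looks like key,value,trimId with no header row, no commas or quotes inside fields, and Windows (CRLF) line endings.

```vba
Sub transposeCsvStreaming()
    'Sketch only: both paths are placeholders, and each input line is assumed
    'to be "key,value,trimId" with no commas or quotes inside the fields
    Const IN_PATH As String = "C:\temp\input.csv"
    Const OUT_PATH As String = "C:\temp\output.csv"
    Dim headers As Object 'Scripting.Dictionary: key -> 0-based column index
    Dim fIn As Integer, fOut As Integer
    Dim textLine As String, parts() As String, rowVals() As String
    Dim hasRow As Boolean

    Set headers = CreateObject("Scripting.Dictionary")
    headers.CompareMode = vbTextCompare

    'Pass 1: collect the distinct keys in order of first appearance
    fIn = FreeFile
    Open IN_PATH For Input As #fIn
    Do While Not EOF(fIn)
        Line Input #fIn, textLine
        parts = Split(textLine, ",")
        If UBound(parts) >= 1 Then
            If Not headers.Exists(Trim$(parts(0))) Then
                headers.Add Trim$(parts(0)), headers.Count
            End If
        End If
    Loop
    Close #fIn
    If headers.Count = 0 Then Exit Sub

    'Pass 2: build one output line per record and write it when the next starts
    fIn = FreeFile
    Open IN_PATH For Input As #fIn
    fOut = FreeFile
    Open OUT_PATH For Output As #fOut
    Print #fOut, Join(headers.Keys, ",") 'header line

    Do While Not EOF(fIn)
        Line Input #fIn, textLine
        parts = Split(textLine, ",")
        If UBound(parts) >= 1 Then
            If StrComp(Trim$(parts(0)), "Brand", vbTextCompare) = 0 Then
                If hasRow Then Print #fOut, Join(rowVals, ",")
                ReDim rowVals(0 To headers.Count - 1) 'start a blank record
                hasRow = True
            End If
            If hasRow Then rowVals(headers(Trim$(parts(0)))) = Trim$(parts(1))
        End If
    Loop
    If hasRow Then Print #fOut, Join(rowVals, ",") 'flush the last record

    Close #fOut
    Close #fIn
End Sub
```

Only one record is held in memory at a time, so the worksheet row limit does not apply; the cost is reading the file twice.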