您好我有一个csv文件,大约有380k行和树列。 A列 - 我有钥匙,每个装饰版本重复一次,每个装饰版本每次都用Brand开头 B栏 - 规范 列C - 修剪ID,每个修剪版本的编号相同
我的问题是我有不一致的数据范围,一些修剪版本有多达55行数据,其他只有5-6
(A)KEY (B)VALUE (C)TRIM ID
====== ======== ==========
1. Brand Mitsubishi 20001
2. Model ASX 20001
3. Trim ASX (facelift 2012) 20001
4. Engine 1.8 DI-D (114 Hp) 20001
5. Doors 5 20001
6. Power 114 hp 20001
7. Maximum speed 189 km/h 20001
8. From 0 to 100 km/h 10.2 sec 20001
9. Fuel tank volume 63 l 20001
10. Year into production 2012 20001
11. Seats 5 20001
1. Brand BMW 20015
2. Model M4 20015
3. Trim M4 (F83) 20015
4. Engine 3.0 (431 Hp) DCT 20015
5. Power 431 hp 20015
1. Brand AUDI 25003
2. Model A4 25003
3. Trim 1.9TDI AVANT SLINE 25003
4. Power 131 hp 25003
我希望将数据转置为每行调整一次并匹配数据。例如,每次找到Brand时,最新的行包含数据,其余数据与列名Brand,model ... seat等匹配。
像这样:
Brand Model Generation Engine Doors Power Maximum speed Seats Length
===== ===== ========= ===== ===== ===== ============= ===== ======
AUDI A4 2.0T SLINE 2.0T 5 210 220 4 4520
BMW M3 330 280 4
HONDA CIVIC 1.6i 4 160 4
我试图用函数解决这个问题,但我认为我需要vba脚本而且我不擅长这个。请帮助我。
答案 0 :(得分:0)
我认为这可以通过数据透视表轻松完成。只需将csv数据导入excel并将其转换为数据透视表即可。
答案 1 :(得分:0)
好吧,我这样做了(以及你提供的数据样本)
Sub createDataTable()
Dim r
Dim c
Dim i
Dim rng As Range
Dim newSht As Worksheet
Dim dataSht As Worksheet
Dim j 'the counter for the rows of the table
Dim colName As Range
Dim theAddress
Set dataSht = Sheets("Data")
dataSht.Activate
r = Range("A1").End(xlDown).Row 'take the last row of the data
c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data
Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
Set newSht = ActiveSheet 'Store the new sheet int the var
newSht.Name = myTime 'Rename the new sheet with the function
j = 1
dataSht.Activate
rng.Copy
Range("H1").PasteSpecial xlPasteAll
Application.CutCopyMode = False
ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo
Range("H1", Selection.End(xlDown)).Copy
newSht.Activate
Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
Rows("1:1").Font.Bold = True
dataSht.Activate 'Go to the new sheet (just in case)
Range("H1", Selection.End(xlDown)).ClearContents
For Each i In rng 'Here comes the magic
If i.Value = "Brand" Then 'If is a Brand set a new row
j = j + 1
End If
newSht.Activate 'Lets go to the new sheet
With newSht.Range("A1:BZ1") 'With the headers...
Set colName = .Find(i.Value, LookIn:=xlValues) '...Find the header of the column in that range
If Not colName Is Nothing Then 'If colName has something then
theAddress = colName.Address 'Put the address of the address just for reference
Do 'and inner loop
Range(Cells(j, colName.Column), Cells(j, colName.Column)).Value = i.Offset(0, 1).Value
'put the value of the field inside the cell below the right header in the right row
'col header = colName.column
'right row = j
'Set colName = .FindNext(colName) 'this is not necesary, because the header are unique
Loop While Not colName Is Nothing And colName.Address <> theAddress
End If
End With
dataSht.Activate
Next i
End Sub
编辑#1
此代码创建标题
Sub createDataTable()
Dim r
Dim c
Dim i
Dim rng As Range
Dim newSht As Worksheet
Dim dataSht As Worksheet
Dim j 'the counter for the rows of the table
Dim colName As Range
Dim theAddress
Set dataSht = Sheets("Data")
dataSht.Activate
'to create headers
Rows("1:1").Insert Shift:=xlDown
Range("A1").FormulaR1C1 = "Key"
Range("B1").FormulaR1C1 = "Value"
Range("C1").FormulaR1C1 = "Trim"
Rows("1:1").Font.Bold = True
r = Range("A1").End(xlDown).Row 'take the last row of the data
c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data
Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
Set newSht = ActiveSheet 'Store the new sheet int the var
newSht.Name = myTime 'Rename the new sheet with the function
j = 1
dataSht.Activate
rng.Copy
Range("H1").PasteSpecial xlPasteAll
Application.CutCopyMode = False
ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo
Range("H1", Selection.End(xlDown)).Copy
newSht.Activate
Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
Rows("1:1").Font.Bold = True
dataSht.Activate 'Go to the new sheet (just in case)
Range("H1", Selection.End(xlDown)).ClearContents
For Each i In rng 'Here comes the magic
If i.Value = "Brand" Then 'If is a Brand set a new row
j = j + 1
End If
newSht.Activate 'Lets go to the new sheet
With newSht.Range("A1:BZ1") 'With the headers...
Set colName = .Find(i.Value, LookIn:=xlValues) '...Find the header of the column in that range
If Not colName Is Nothing Then 'If colName has something then
theAddress = colName.Address 'Put the address of the address just for reference
Do 'and inner loop
Range(Cells(j, colName.Column), Cells(j, colName.Column)).Value = i.Offset(0, 1).Value
'put the value of the field inside the cell below the right header in the right row
'col header = colName.column
'right row = j
'Set colName = .FindNext(colName) 'this is not necesary, because the header are unique
Loop While Not colName Is Nothing And colName.Address <> theAddress
End If
End With
dataSht.Activate
Next i
End Sub
正如我在截图中看到的那样,您将代码放在工作表中,这将返回错误1004
因为您无法从工作表中“操纵”其他工作表。如果您需要/想要这样做,您需要在模块内部执行此操作,然后从该模块调用该过程。
在这种情况下,您需要添加一个新模块 在VBA中选择工作簿
Insert >>> Module
并在您的项目中添加一个新模块,并在该模块中添加编辑#2 的过程,然后使用F5
运行它。
如果您需要改进,请告诉我。
编辑#2
我很高兴帮助你......而你所犯的错误是因为我确实向你发送了自定义功能......对不起......好了!
Function myTime() As String
Dim HH
Dim MM
Dim SS
Dim TT
HH = Hour(Now)
MM = Minute(Now)
SS = Second(Now)
myTime = Format(HH, "00") & Format(MM, "00") & Format(SS, "00")
End Function
将此功能放在放置所有代码的同一模块中。
答案 2 :(得分:0)
我将回答我的问题,因为我找到了一个完美而强大的解决方案,称为OpenRefine,一个前谷歌项目(Google Refine)。
由于我的数据集现在超过毫安行,这是最快和最好的解决方案(比excel好得多)。