I have a very large dataset with hundreds of thousands of rows. I managed to split it into two columns, like this:
Name: | John
Birth year: | 1982
Favorite sport: | Rugby
Favorite color: | Blue
|
Name: | Mike
Birth year: | 1977
|
Name: | Shayla
Favorite sport: | Soccer
|
Name: | Kimberly
Birth year: | 1983
Favorite sport: | Baseball
Favorite color: | Yellow
Favorite food: | Pizza
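However, I want to eliminate the repetition of the category labels. How can I put each data "entry" on its own row, with the categories used as column headers so that they appear only once, like this: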
Name | Birth year | Favorite sport | Favorite color | Favorite food
John | 1982 | Rugby | Blue |
Mike | 1977 | | |
Shayla | | Soccer | |
Kimberly | 1983 | Baseball | Yellow | Pizza
Note that every entry in the existing dataset consists of a name plus one or more categories.
Answer 0 (score: 3)
Try something like this:
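' Map each category label to its destination column.
Set fields = CreateObject("Scripting.Dictionary")
fields.Add "Name:", 1
fields.Add "Birth year:", 2
fields.Add "Favorite sport:", 3
fields.Add "Favorite color:", 4
fields.Add "Favorite food:", 5

Set xl = CreateObject("Excel.Application")
xl.Visible = True
Set wb = xl.Workbooks.Open("C:\path\to\your.xls")
Set src = wb.Sheets(1)   ' two-column source data
Set dst = wb.Sheets(2)   ' receives one row per entry

' Write the header row, dropping the trailing colon from each label.
For Each key In fields.Keys
  dst.Cells(1, fields(key)).Value = Replace(key, ":", "")
Next

i = 0   ' number of entries seen so far
For Each row In src.UsedRange.Rows
  key = row.Range("A1").Value
  val = row.Range("B1").Value
  ' Each "Name:" line starts a new entry, i.e. a new destination row.
  If key = "Name:" Then i = i + 1
  If key <> "" Then
    If fields.Exists(key) Then
      ' Row 1 holds the headers, so entry i goes to row i + 1.
      dst.Cells(i + 1, fields(key)).Value = val
    Else
      WScript.Echo "Unknown key " & key
    End If
  End If
Next

wb.Save
wb.Close
xl.Quit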
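The reshaping logic in the script above does not depend on Excel, so it may help to see it on its own. The following is a minimal Python sketch of the same idea, not part of the original answer: it assumes the two columns have already been read into a list of `(key, value)` pairs, and the names `FIELDS`, `pivot`, and `skipped` are made up for illustration.

```python
# Hypothetical field order; mirrors the dictionary in the script above.
FIELDS = ["Name:", "Birth year:", "Favorite sport:",
          "Favorite color:", "Favorite food:"]


def pivot(pairs, fields=FIELDS):
    """Turn (key, value) pairs into one row per entry.

    The first field ("Name:") starts a new row. Returns the rows plus a
    list of keys that could not be placed (unknown keys, or known keys
    that appear before the first name).
    """
    column = {key: idx for idx, key in enumerate(fields)}
    rows = []
    skipped = []
    for key, value in pairs:
        if key == "":
            continue  # blank separator line between entries
        if key not in column:
            skipped.append(key)  # same case as "Unknown key" in the script
            continue
        if key == fields[0]:
            rows.append([""] * len(fields))  # start a new entry
        if not rows:
            skipped.append(key)  # data before the first name has no row
            continue
        rows[-1][column[key]] = value
    return rows, skipped


sample = [
    ("Name:", "John"), ("Birth year:", "1982"),
    ("Favorite sport:", "Rugby"), ("Favorite color:", "Blue"),
    ("", ""),
    ("Name:", "Mike"), ("Birth year:", "1977"),
    ("", ""),
    ("Name:", "Shayla"), ("Favorite sport:", "Soccer"),
]

rows, skipped = pivot(sample)
for row in rows:
    print(" | ".join(row))
```

Missing categories simply stay as empty strings, which is what produces the blank cells in the desired output.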