Hope you are doing well. I need some help. For example, I have 3 CSV files:
1) File1.csv, containing 2 records/rows
firstname | lastname | city    | country | emailaddress
--------------------------------------------------------
alexf     | sdfsd    | mumbai  | india   | sdf@sdf.com
asfd      | sdfsdf   | toronto | canada  | dfsd@sdf.com
2) secondfile.csv, containing 2 records/rows
first-name | last-name | currentcity | currentcountry | email-address
----------------------------------------------------------------------
asdf       | sdfkjwl   | sydney      | australia      | sdf@dsffwe.com
lskjdf     | sdlfkjlkj | delhi       | india          | sdflkj@sdf.com
3) userfile.csv, containing 2 records/rows
fname  | lname | usercity | usercountry | email
-------------------------------------------------------
sdf    | sdflj | auckland | new zealand | sdf@sdf.com
sdfsdf | sdf   | venice   | italy       | sdf@dsf.com
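Now I want to create a single CSV file, Excel sheet, MySQL table, or any other database table that holds all of these records. They come from different CSV files that use different column/header names but contain the same type of data. Like this: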
singlecsvfile.csv
first_name | last_name | city     | country     | email_address
----------------------------------------------------------------
alexf      | sdfsd     | mumbai   | india       | sdf@sdf.com
asfd       | sdfsdf    | toronto  | canada      | dfsd@sdf.com
asdf       | sdfkjwl   | sydney   | australia   | sdf@dsffwe.com
lskjdf     | sdlfkjlkj | delhi    | india       | sdflkj@sdf.com
sdf        | sdflj     | auckland | new zealand | sdf@sdf.com
sdfsdf     | sdf       | venice   | italy       | sdf@dsf.com
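In reality I have more than 50 files with different column names but the same type of data, because they come from different kinds of data sources. What strategy or approach would you recommend, and how should I implement it? If possible, please suggest an easy way (Excel / Power Query / Power BI) or code (PHP / SQL). I need a quick or automated solution, something like data mapping. I have searched a lot but could not find any solution. Any suggestions would be appreciated. Thanks.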
Answer 0 (score: 2)
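I would use Power Query. Each input file layout needs its own query. These queries only rename the columns to match the column names of your singlecsvfile.csv. I would set each of them to Load To / Only Create Connection.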
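A final singlecsvfile query would then use Append Queries to combine the data from all the input queries. Power Query matches data by column name in an Append, so the left-to-right order of the columns does not matter.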
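The rename-then-append idea can also be sketched outside Power Query. The Python below is only an illustration, not the Power Query solution itself: `COLUMN_MAP`, `rename_columns`, and `append_sources` are hypothetical names, the mapping is typed in by hand from the headers in the question, and the files are assumed to be comma-delimited. The second sample deliberately lists its columns in a different order to show that matching is done by name.

```python
import csv
import io

# Hypothetical hand-written mapping: source header -> target header.
COLUMN_MAP = {
    "firstname": "first_name", "first-name": "first_name", "fname": "first_name",
    "lastname": "last_name", "last-name": "last_name", "lname": "last_name",
    "city": "city", "currentcity": "city", "usercity": "city",
    "country": "country", "currentcountry": "country", "usercountry": "country",
    "emailaddress": "email_address", "email-address": "email_address",
    "email": "email_address",
}


def rename_columns(row):
    """Return a copy of the row keyed by target column names.

    An unknown header raises KeyError, so unmapped layouts are noticed early.
    """
    return {COLUMN_MAP[key.strip().lower()]: (value or "").strip()
            for key, value in row.items()}


def append_sources(sources):
    """Read each CSV text, rename its columns, and append the rows by name."""
    combined = []
    for text in sources:
        for row in csv.DictReader(io.StringIO(text)):
            combined.append(rename_columns(row))
    return combined


# Two small samples based on the question; assumed comma-delimited.
file1 = ("firstname,lastname,city,country,emailaddress\n"
         "alexf,sdfsd,mumbai,india,sdf@sdf.com\n")
userfile = ("email,usercountry,usercity,lname,fname\n"
            "sdf@sdf.com,new zealand,auckland,sdflj,sdf\n")

rows = append_sources([file1, userfile])
```

With 50+ files, most of the work is in filling `COLUMN_MAP` once per distinct header spelling; the append step stays the same.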
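If any of your 50+ files share a common layout, I would separate them into subfolders. You can then use a single input query to rip through all the files in a subfolder, starting with From File / From Folder.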
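As a rough Python analogue of the one-query-per-subfolder idea (again an illustration, not Power Query), the sketch below keeps one rename mapping per subfolder and reads every CSV file found in it. The folder names `layout_a` and `layout_b`, the file `other.csv`, and the helper names are hypothetical, and only two columns are mapped to keep the example short. The demo builds its input in a temporary directory so that it runs on its own.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical: one rename mapping per subfolder, keyed by folder name.
LAYOUTS = {
    "layout_a": {"firstname": "first_name", "emailaddress": "email_address"},
    "layout_b": {"fname": "first_name", "email": "email_address"},
}


def read_folder(folder, mapping):
    """Read every *.csv file in one subfolder using that folder's mapping."""
    rows = []
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                rows.append({mapping[key]: value for key, value in row.items()})
    return rows


def read_all(root):
    """Append the rows of all configured subfolders under root."""
    combined = []
    for name, mapping in LAYOUTS.items():
        combined.extend(read_folder(root / name, mapping))
    return combined


# Demo with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "layout_a").mkdir()
    (root / "layout_b").mkdir()
    (root / "layout_a" / "File1.csv").write_text(
        "firstname,emailaddress\nalexf,sdf@sdf.com\n", encoding="utf-8")
    (root / "layout_a" / "other.csv").write_text(
        "firstname,emailaddress\nasfd,dfsd@sdf.com\n", encoding="utf-8")
    (root / "layout_b" / "userfile.csv").write_text(
        "fname,email\nsdf,sdf@sdf.com\n", encoding="utf-8")
    demo_rows = read_all(root)
```

Adding a new file with a known layout then only means dropping it into the right subfolder; a new layout needs one new entry in `LAYOUTS`.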
Power Query delivers its output to an Excel table. If you really need CSV output, just record a macro that refreshes the Power Query and then does a Save As CSV.
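If the combined rows are produced by a script rather than by Power Query, the CSV output step might look like the Python sketch below. It replaces the recorded macro with the standard `csv` module; `TARGET_COLUMNS` and `write_csv` are hypothetical names, and the output goes to an in-memory buffer here, where a real run would pass an open file instead.

```python
import csv
import io

# Target header order taken from singlecsvfile.csv in the question.
TARGET_COLUMNS = ["first_name", "last_name", "city", "country", "email_address"]


def write_csv(rows, handle):
    """Write rows (dicts keyed by target column name) as CSV with a header."""
    writer = csv.DictWriter(handle, fieldnames=TARGET_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# The key order inside each dict does not matter; fieldnames fixes the layout.
rows = [
    {"email_address": "sdf@sdf.com", "first_name": "alexf", "last_name": "sdfsd",
     "city": "mumbai", "country": "india"},
    {"first_name": "sdf", "last_name": "sdflj", "city": "auckland",
     "country": "new zealand", "email_address": "sdf@sdf.com"},
]

buffer = io.StringIO()
write_csv(rows, buffer)
output = buffer.getvalue()
```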
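Answer 1 (score: -1)
SuperUser isn't really a code-writing service. That said, I already have a piece of VBA code that should basically do what you want. It has some comments, so it should be manageable. It may need some tweaking depending on your files.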
Option Explicit

Global first_sheet As Boolean
Global append As Boolean

Sub process_folder()
    Dim book_counter As Integer
    Dim folder_path As String
    Dim pWB As Workbook, sWB As Workbook, sWB_name As String
    Dim pWS As Worksheet

    book_counter = 0
    first_sheet = True
    'Flag between appending in one sheet and copying into individual sheets
    append = True

    Set pWB = ActiveWorkbook
    Set pWS = pWB.ActiveSheet

    folder_path = "O:\Active\_2010\1193\10-1193-0015 Kennecott eagle\Phase 8500 - DFN Modelling\4. Analysis & Modelling\Phase 2 - DFN building\Export\fracture_properties\20140205"
    folder_path = verify_folder(folder_path)
    If folder_path = "NULL" Then
        Exit Sub
    End If

    'Get first file to open
    sWB_name = Dir(folder_path, vbNormal)

    'Loop through files
    Do While sWB_name <> ""
        'Open each file
        Workbooks.Open Filename:=folder_path & sWB_name
        Set sWB = Workbooks(sWB_name)
        Call process_workbook(pWB, sWB)

        'Get the next file and increment the counter
        sWB_name = Dir()
        book_counter = book_counter + 1
    Loop

    'Number of files processed
    MsgBox ("Number of Fragment Files processed: " & book_counter)
End Sub

Sub process_workbook(pWB As Workbook, sWB As Workbook)
    If append Then
        Call append_all_sheets(pWB, sWB, 1)
    Else
        Call copy_all_sheets(pWB, sWB)
    End If
End Sub

Sub copy_all_sheets(pWB As Workbook, sWB As Workbook)
    Dim ws As Worksheet

    For Each ws In sWB.Worksheets
        ws.Move After:=pWB.Sheets(pWB.Sheets.count)
    Next ws
End Sub

Sub append_all_sheets(pWB As Workbook, sWB As Workbook, headerRows As Long)
    Dim lastCol As Long, lastRow As Long, pasteRow As Long
    Dim ws As Worksheet

    For Each ws In sWB.Worksheets
        lastCol = find_last_col(ws)
        lastRow = find_last_row(ws)
        pasteRow = find_last_row(pWB.Sheets(1))

        'Copy the entire data range for the first sheet; otherwise leave off the header rows
        If first_sheet Then
            pWB.Sheets(1).Range("A" & pasteRow).Resize(lastRow, lastCol).Formula = ws.Range("A1").Resize(lastRow, lastCol).Formula
        ElseIf lastRow > headerRows Then
            'Paste below the last used row so the previous last row is not overwritten
            pWB.Sheets(1).Range("A" & pasteRow + 1).Resize(lastRow - headerRows, lastCol).Formula = ws.Range("A1").Offset(headerRows, 0).Resize(lastRow - headerRows, lastCol).Formula
        End If
        first_sheet = False
    Next ws

    sWB.Close (False)
End Sub

Function find_last_row(ws As Worksheet) As Long
    With ws
        If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
            find_last_row = .Cells.Find(What:="*", _
                                        After:=.Range("A1"), _
                                        Lookat:=xlPart, _
                                        LookIn:=xlFormulas, _
                                        SearchOrder:=xlByRows, _
                                        SearchDirection:=xlPrevious, _
                                        MatchCase:=False).Row
        Else
            find_last_row = 1
        End If
    End With
End Function

Function find_last_col(ws As Worksheet) As Long
    With ws
        If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
            find_last_col = .Cells.Find(What:="*", _
                                        After:=.Range("A1"), _
                                        Lookat:=xlPart, _
                                        LookIn:=xlFormulas, _
                                        SearchOrder:=xlByColumns, _
                                        SearchDirection:=xlPrevious, _
                                        MatchCase:=False).Column
        Else
            find_last_col = 1
        End If
    End With
End Function

Function verify_folder(path As String) As String
    If path = "" Then
        MsgBox ("Enter the Directory of the Fragment simulation files to process")
        verify_folder = "NULL"
        Exit Function
    End If

    If Not PathExists(path) Then
        MsgBox ("Directory does not exist")
        verify_folder = "NULL"
        Exit Function
    End If

    'Always return the path with a trailing backslash
    If Right(path, 1) <> "\" Then
        verify_folder = path & "\"
    Else
        verify_folder = path
    End If
End Function

Function PathExists(pName) As Boolean
    'If GetAttr fails, the error is skipped and the function returns False
    On Error Resume Next
    PathExists = (GetAttr(pName) And vbDirectory) = vbDirectory
End Function