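I have multiple delimited text files (e.g. .csv files), each containing columns, rows and a header.
I would like to import all of these input files into SQL Server as easily as possible. Specifically, I want the output tables into which the files are imported to be created on the fly.
Some of these input files need to be imported into the same output table, while others need to go into different tables. You can assume that all files imported into the same table have the same header.
SQL Server Management Studio has an Import Wizard that lets you import delimited text files (and other formats) and automatically creates the output table. However, it does not let you import multiple files at once. It also requires a lot of manual work and is not reproducible.
Many scripts can be found online that import multiple text files into a table. However, most of them require the output table to be created first, which again means extra work for each table.
Is there a way to list all relevant input files together with their corresponding output tables, so that the tables are created automatically and the data is then imported?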
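Answer 0: (score: 4)
This script lets you import multiple delimited text files into a SQL database. The tables into which the data is imported, including all required columns, are created automatically. The script contains some documentation.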
/*
** This file was created by Laurens Bogaardt, Advisor Data Analytics at EY Amsterdam on 2016-11-03.
** This script allows you to import multiple delimited text files into a SQL database. The tables
** into which the data is imported, including all required columns, are created automatically. This
** script uses tab-delimited (tsv) files and SQL Server Management Studio. The script may need some
** minor adjustments for other formats and tools. The script makes several assumptions which need
** to be valid before it can run properly. First of all, it assumes none of the output tables exist
** in the SQL tool before starting. Therefore, it may be necessary to clean the database and delete
** all the existing tables. Secondly, the script assumes that, if multiple text files are imported
** into the same output table, the number and order of the columns of these files is identical. If
** this is not the case, some manual work may need to be done to the text files before importing.
** Finally, please note that this script only imports data as strings (to be precise, as NVARCHAR's
** of length 255). It does not allow you to specify the datatype per column. This would need to be
** done using another script after importing the data as strings.
*/
-- 1. Import Multiple Delimited Text Files into a SQL Database
-- 1.1 Define the path to the input and define the terminators
/*
** In this section, some initial parameters are set. Obviously, the 'DatabaseName' refers to the
** database in which you want to create new tables. The '@Path' parameter sets the folder in
** which the text files are located which you want to import. Delimited files are defined by
** two characters: one which separates columns and one which separates rows. Usually, the
** row-terminator is the newline character CHAR(10), also given by '\n'. When files are created
** in Windows, the row-terminator often includes a carriage return CHAR(13), also given by '\r\n'.
** Often, a tab is used to separate each column. This is given by CHAR(9) or by the character '\t'.
** Other useful characters include the comma CHAR(44), the semi-colon CHAR(59) and the pipe
** CHAR(124).
*/
USE [DatabaseName]
DECLARE @Path NVARCHAR(255) = 'C:\PathToFiles\'
DECLARE @RowTerminator NVARCHAR(5) = CHAR(13) + CHAR(10)
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9)
-- 1.2 Define the list of input and output in a temporary table
/*
** In this section, a temporary table is created which lists all the filenames of the delimited
** files which need to be imported, as well as the names of the tables which are created and into
** which the data is imported. Multiple files may be imported into the same output table. Each row
** is prepended with an integer which increments up starting from 1. It is essential that this
** number follows this logic. The temporary table is deleted at the end of this script.
*/
IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
CREATE TABLE [dbo].[Files_Temporary]
(
[ID] INT
, [FileName] NVARCHAR(255)
, [TableName] NVARCHAR(255)
);
INSERT INTO [dbo].[Files_Temporary] SELECT 1, 'MyFileA.txt', 'NewTable1'
INSERT INTO [dbo].[Files_Temporary] SELECT 2, 'MyFileB.txt', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 3, 'MyFileC.tsv', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 4, 'MyFileD.csv', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 5, 'MyFileE.dat', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 6, 'MyFileF', 'NewTable3'
INSERT INTO [dbo].[Files_Temporary] SELECT 7, 'MyFileG.text', 'NewTable4'
INSERT INTO [dbo].[Files_Temporary] SELECT 8, 'MyFileH.txt', 'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 9, 'MyFileI.txt', 'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 10, 'MyFileJ.txt', 'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 11, 'MyFileK.txt', 'NewTable6'
-- 1.3 Loop over the list of input and output and import each file to the correct table
/*
** In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined
** which starts at '1' and increments with each iteration. The filename and tablename are retrieved
** from the previously defined temporary table. The next step of the script is to check whether the
** output table already exists or not.
*/
DECLARE @Counter INT = 1
WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary])
BEGIN
PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.'
DECLARE @FileName NVARCHAR(255)
DECLARE @TableName NVARCHAR(255)
DECLARE @Header NVARCHAR(MAX)
DECLARE @SQL_Header NVARCHAR(MAX)
DECLARE @CreateHeader NVARCHAR(MAX) = ''
DECLARE @SQL_CreateHeader NVARCHAR(MAX)
SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter
IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL
BEGIN
/*
** If the output table does not yet exist, it needs to be created. This requires the list of all
** columnnames for that table to be retrieved from the first line of the text file, which includes
** the header. A piece of SQL code is generated and executed which imports the header of the text
** file. A second temporary table is created which stores this header as a single string.
*/
PRINT 'Creating new table with name ''' + @TableName + '''.'
IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
CREATE TABLE [dbo].[Header_Temporary]
(
[Header] NVARCHAR(MAX)
);
SET @SQL_Header = '
BULK INSERT [dbo].[Header_Temporary]
FROM ''' + @Path + @FileName + '''
WITH
(
FIRSTROW = 1,
LASTROW = 1,
MAXERRORS = 0,
FIELDTERMINATOR = ''' + @RowTerminator + ''',
ROWTERMINATOR = ''' + @RowTerminator + '''
)'
EXEC(@SQL_Header)
SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary])
PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.'
/*
** The columnnames in the header are separated using the column-terminator. This can be used to loop
** over each columnname. A new piece of SQL code is generated which will create the output table
** with the correctly named columns.
*/
WHILE CHARINDEX(@ColumnTerminator, @Header) > 0
BEGIN
SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))) + '] NVARCHAR(255), '
SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + 1, LEN(@Header))
END
SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(@Header)) + '] NVARCHAR(255)'
SET @SQL_CreateHeader = 'CREATE TABLE [dbo].[' + @TableName + '] (' + @CreateHeader + ')'
EXEC(@SQL_CreateHeader)
END
/*
** Finally, the data from the text file is imported into the newly created table. The first line,
** including the header information, is skipped. If multiple text files are imported into the same
** output table, it is essential that the number and the order of the columns is identical, as the
** table will only be created once, using the header information of the first text file.
*/
PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.'
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = '
BULK INSERT [dbo].[' + @TableName + ']
FROM ''' + @Path + @FileName + '''
WITH
(
FIRSTROW = 2,
MAXERRORS = 0,
FIELDTERMINATOR = ''' + @ColumnTerminator + ''',
ROWTERMINATOR = ''' + @RowTerminator + '''
)'
EXEC(@SQL)
SET @Counter = @Counter + 1
END;
-- 1.4 Cleanup temporary tables
/*
** In this section, the temporary tables which were created and used by this script are deleted.
** Alternatively, the script could have used a 'real' temporary table (identified by the '#' character
** in front of the name) or a table variable. These would have deleted themselves once they were no
** longer in use. However, the end result is the same.
*/
IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
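The header comment of the script notes that every column is imported as NVARCHAR(255) and that data types have to be set by a separate script afterwards. A minimal sketch of such a follow-up step could look like the following. It assumes a hypothetical column [Amount] in [dbo].[NewTable1] that should become DECIMAL(18, 2); neither the column name nor the target type comes from the answer above, and TRY_CONVERT requires SQL Server 2012 or later.

```sql
-- Hypothetical example: [Amount] was imported as NVARCHAR(255) into [dbo].[NewTable1].
-- Step 1: list the values that cannot be converted to the target type.
SELECT [Amount]
FROM [dbo].[NewTable1]
WHERE [Amount] IS NOT NULL
  AND TRY_CONVERT(DECIMAL(18, 2), [Amount]) IS NULL;

-- Step 2: once the query above returns no rows, change the column type in place.
ALTER TABLE [dbo].[NewTable1]
ALTER COLUMN [Amount] DECIMAL(18, 2) NULL;
```

Checking first matters because ALTER COLUMN fails for the whole table if even one value cannot be converted, for example an empty string in a numeric column.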
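Answer 1: (score: 1)
Note: don't be put off by the long script shown here. Only 3 variables need to be changed, and the whole script should then work as is.
This solution is an upgrade of the accepted answer (@LBogaardt), and it also implements @Chendur Mar's suggestion to pick up all files from a folder.
My additions: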
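- NVARCHAR(MAX) instead of NVARCHAR(255)
- If needed, you can change this; for the specific steps, see here.
Note: the import folder is a folder on the remote server, so you need to create the folder on the server and upload the files there. After this, set permissions on that folder.
You only need to change the first 4 lines:
Line 1 - instead of yourDatabase, enter your database name
Line 2 - define the location of the import folder that holds the .txt/.csv files
Line 3 - define the row terminator, which is most likely \n (new line), so leave it as is
Line 4 - define the column separator for the files; for a comma, use ',' or CHAR(44). CHAR(9) is TAB.
Script:
USE yourDatabase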
DECLARE @Location NVARCHAR(MAX) = 'C:\Users\username\Desktop\Import\';
DECLARE @RowTerminator NVARCHAR(5) = '\n';
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9);
DECLARE @SQLINSERT NVARCHAR(MAX);
-- 1.2 Define the list of input and output in a temporary table
/*
** In this section, a temporary table is created which lists all the filenames of the delimited
** files which need to be imported, as well as the names of the tables which are created and into
** which the data is imported. Multiple files may be imported into the same output table. Each row
** is prepended with an integer which increments up starting from 1. It is essential that this
** number follows this logic. The temporary table is deleted at the end of this script.
*/
IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
CREATE TABLE [dbo].[Files_Temporary]
(
[ID] INT identity (1,1) primary key
, [FileName] NVARCHAR(max)
, [TableName] NVARCHAR(max)
);
--insert names into [dbo].[Files_Temporary]
SET @SQLINSERT = 'INSERT INTO [dbo].[Files_Temporary] (filename) exec master.dbo.xp_cmdshell' + char(39) + ' dir ' + @Location + ' /b /a-d' + char(39)
EXEC(@SQLINSERT)
------Update table names eliminating the file extension-------
update [dbo].[Files_Temporary] set [TableName]= SUBSTRING(filename,0, CHARINDEX('.',filename))
-- 1.3 Loop over the list of input and output and import each file to the correct table
/*
** In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined
** which starts at '1' and increments with each iteration. The filename and tablename are retrieved
** from the previously defined temporary table. The next step of the script is to check whether the
** output table already exists or not.
*/
DECLARE @Counter INT = 1
WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary])
BEGIN
PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.'
DECLARE @FileName NVARCHAR(MAX)
DECLARE @TableName NVARCHAR(MAX)
DECLARE @Header NVARCHAR(MAX)
DECLARE @SQL_Header NVARCHAR(MAX)
DECLARE @CreateHeader NVARCHAR(MAX) = ''
DECLARE @SQL_CreateHeader NVARCHAR(MAX)
SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter
IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL
BEGIN
/*
** If the output table does not yet exist, it needs to be created. This requires the list of all
** columnnames for that table to be retrieved from the first line of the text file, which includes
** the header. A piece of SQL code is generated and executed which imports the header of the text
** file. A second temporary table is created which stores this header as a single string.
*/
PRINT 'Creating new table with name ''' + @TableName + '''.'
IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
CREATE TABLE [dbo].[Header_Temporary]
(
[Header] NVARCHAR(MAX)
);
SET @SQL_Header = '
BULK INSERT [dbo].[Header_Temporary]
FROM ''' + @Location + @FileName + '''
WITH
(
FIRSTROW = 1,
LASTROW = 1,
MAXERRORS = 0,
FIELDTERMINATOR = ''' + @RowTerminator + ''',
ROWTERMINATOR = ''' + @RowTerminator + '''
)'
EXEC(@SQL_Header)
SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary])
PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.'
/*
** The columnnames in the header are separated using the column-terminator. This can be used to loop
** over each columnname. A new piece of SQL code is generated which will create the output table
** with the correctly named columns.
*/
WHILE CHARINDEX(@ColumnTerminator, @Header) > 0
BEGIN
SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))) + '] NVARCHAR(MAX), '
SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + 1, LEN(@Header))
END
SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(@Header)) + '] NVARCHAR(MAX)'
SET @SQL_CreateHeader = 'CREATE TABLE [dbo].[' + @TableName + '] (' + @CreateHeader + ')'
EXEC(@SQL_CreateHeader)
END
/*
** Finally, the data from the text file is imported into the newly created table. The first line,
** including the header information, is skipped. If multiple text files are imported into the same
** output table, it is essential that the number and the order of the columns is identical, as the
** table will only be created once, using the header information of the first text file.
*/
--bulk insert
PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.'
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = '
BULK INSERT [dbo].[' + @TableName + ']
FROM ''' + @Location + @FileName + '''
WITH
(
FIRSTROW = 2,
MAXERRORS = 0,
FIELDTERMINATOR = ''' + @ColumnTerminator + ''',
ROWTERMINATOR = ''' + @RowTerminator + ''',
CODEPAGE = ''65001'',
DATAFILETYPE = ''Char'',
ERRORFILE = ''' + @Location + 'ImportLog.log''
)'
EXEC(@SQL)
SET @Counter = @Counter + 1
END;
-- 1.4 Cleanup temporary tables
/*
** In this section, the temporary tables which were created and used by this script are deleted.
** Alternatively, the script could have used a 'real' temporary table (identified by the '#' character
** in front of the name) or a table variable. These would have deleted themselves once they were no
** longer in use. However, the end result is the same.
*/
IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
Finally, disable xp_cmdshell and delete the Import folder.
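The script calls xp_cmdshell to list the files in the import folder, and that procedure is disabled by default in SQL Server. As a rough sketch, and assuming your login is allowed to change server configuration (typically a sysadmin), it could be switched on before the import and off again afterwards like this:

```sql
-- Before the import: expose advanced options and enable xp_cmdshell.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

-- ... run the import script here ...

-- After the import: disable xp_cmdshell again.
EXEC sp_configure 'xp_cmdshell', 0;
RECONFIGURE;
```

Leaving xp_cmdshell enabled lets anyone with the right permissions run operating system commands through SQL Server, which is why turning it off again after the import is a sensible default.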
Answer 2: (score: 0)
If I were you, I would write a small VBA script that converts all TXT files in a folder to XLS files, and then load each file into a SQL Server table like this:
select *
into SQLServerTable FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\your_path_here\test.xls;HDR=YES',
'SELECT * FROM [Sheet1$]')
See here for details.
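Two practical points apply to the OPENROWSET query: ad hoc OPENROWSET calls only work when the 'Ad Hoc Distributed Queries' server option is enabled, and the Jet 4.0 provider exists only in a 32-bit version. The sketch below shows one possible variant under the assumption that the Microsoft ACE OLE DB provider is installed on the server and that the workbook is saved as .xlsx; the provider choice, the file name and the target table name are illustrative and not part of the answer.

```sql
-- Allow ad hoc OPENROWSET queries (requires permission to change server configuration).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;

-- Assumption: the ACE provider is installed and test.xlsx contains a sheet named Sheet1.
SELECT *
INTO [dbo].[SQLServerTable]
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                'Excel 12.0;Database=C:\your_path_here\test.xlsx;HDR=YES',
                'SELECT * FROM [Sheet1$]');
```

SELECT ... INTO creates the target table, so it fails if [dbo].[SQLServerTable] already exists; for additional files going into the same table, INSERT INTO ... SELECT would be the usual alternative.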
As for converting the TXT files to XLS files, try this.
Private Declare Function SetCurrentDirectoryA Lib _
"kernel32" (ByVal lpPathName As String) As Long
Public Function ChDirNet(szPath As String) As Boolean
'based on Rob Bovey's code
Dim lReturn As Long
lReturn = SetCurrentDirectoryA(szPath)
ChDirNet = CBool(lReturn <> 0)
End Function
Sub Get_TXT_Files()
'For Excel 2000 and higher
Dim Fnum As Long
Dim mysheet As Worksheet
Dim basebook As Workbook
Dim TxtFileNames As Variant
Dim QTable As QueryTable
Dim SaveDriveDir As String
Dim ExistFolder As Boolean
'Save the current dir
SaveDriveDir = CurDir
'You can change the start folder if you want for
'GetOpenFilename,you can use a network or local folder.
'For example ChDirNet("C:\Users\Ron\test")
'It now use Excel's Default File Path
ExistFolder = ChDirNet("C:\your_path_here\Text\")
If ExistFolder = False Then
MsgBox "Error changing folder"
Exit Sub
End If
TxtFileNames = Application.GetOpenFilename _
(filefilter:="TXT Files (*.txt), *.txt", MultiSelect:=True)
If IsArray(TxtFileNames) Then
On Error GoTo CleanUp
With Application
.ScreenUpdating = False
.EnableEvents = False
End With
'Add workbook with one sheet
Set basebook = Workbooks.Add(xlWBATWorksheet)
'Loop through the array with txt files
For Fnum = LBound(TxtFileNames) To UBound(TxtFileNames)
'Add a new worksheet for the name of the txt file
Set mysheet = Worksheets.Add(After:=basebook. _
Sheets(basebook.Sheets.Count))
On Error Resume Next
mysheet.Name = Right(TxtFileNames(Fnum), Len(TxtFileNames(Fnum)) - _
InStrRev(TxtFileNames(Fnum), "\", , 1))
On Error GoTo 0
With ActiveSheet.QueryTables.Add(Connection:= _
"TEXT;" & TxtFileNames(Fnum), Destination:=Range("A1"))
.TextFilePlatform = xlWindows
.TextFileStartRow = 1
'This example use xlDelimited
'See a example for xlFixedWidth below the macro
.TextFileParseType = xlDelimited
'Set your Delimiter to true
.TextFileTabDelimiter = True
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = False
.TextFileSpaceDelimiter = False
'Set the format for each column if you want (Default = General)
'For example Array(1, 9, 1) to skip the second column
.TextFileColumnDataTypes = Array(1, 9, 1)
'xlGeneralFormat General 1
'xlTextFormat Text 2
'xlMDYFormat Month-Day-Year 3
'xlDMYFormat Day-Month-Year 4
'xlYMDFormat Year-Month-Day 5
'xlMYDFormat Month-Year-Day 6
'xlDYMFormat Day-Year-Month 7
'xlYDMFormat Year-Day-Month 8
'xlSkipColumn Skip 9
' Get the data from the txt file
.Refresh BackgroundQuery:=False
End With
ActiveSheet.QueryTables(1).Delete
Next Fnum
'Delete the first sheet of basebook
On Error Resume Next
Application.DisplayAlerts = False
basebook.Worksheets(1).Delete
Application.DisplayAlerts = True
On Error GoTo 0
CleanUp:
ChDirNet SaveDriveDir
With Application
.ScreenUpdating = True
.EnableEvents = True
End With
End If
End Sub
You can set up Windows Task Scheduler to run the process automatically for you as needed.