Fixed-width file to SQL Server wide table

Date: 2017-10-05 17:06:49

Tags: sql-server fixed-width

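I have a text file of fixed-width data. The file contains ID, Data, and State. ID is just an INT. State is a 2-character (CHAR) state code. Data holds the information about the companies registered in that state.

To get this into a SQL Server table, I first dump the text file into a SQL Server table named dbo.rawcompanyinfo_delimited.

Note: to keep the explanation simple, I show only 4 columns inside the Data column. The real data has 500 columns.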

CREATE TABLE dbo.rawcompanyinfo_delimited(ID smallint NOT NULL, Data VARCHAR(MAX) NULL, State CHAR(2));

INSERT INTO dbo.rawcompanyinfo_delimited (ID, Data, State) VALUES (100,'ABCINC    111  333.5 USD','PA');
INSERT INTO dbo.rawcompanyinfo_delimited (ID, Data, State) VALUES (200,'APPLE     213  333.5 USD','PA');
INSERT INTO dbo.rawcompanyinfo_delimited (ID, Data, State) VALUES (300,'BTEC      100  123.5 USD','PA');
INSERT INTO dbo.rawcompanyinfo_delimited (ID, Data, State) VALUES (400,'S INC     123  333.0 USD','PA');
INSERT INTO dbo.rawcompanyinfo_delimited (ID, Data, State) VALUES (500,'B INC     145  123.2 USD','PA');

I also have a mapping table, map.CompaniesLenInfo, that stores the starting position, length, and column name of each field:

CREATE TABLE map.CompaniesLenInfo(Startingposition int not null, Length int not null, columnnames varchar(100) not null)
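INSERT INTO map.CompaniesLenInfo (Startingposition, Length, columnnames) VALUES (1,10,'CompanyName');
INSERT INTO map.CompaniesLenInfo (Startingposition, Length, columnnames) VALUES (11,3,'CompanyID');
INSERT INTO map.CompaniesLenInfo (Startingposition, Length, columnnames) VALUES (15,5,'TotalIncome');
INSERT INTO map.CompaniesLenInfo (Startingposition, Length, columnnames) VALUES (21,3,'Currency');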

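I wrote a nested cursor that loops over map.CompaniesLenInfo and dbo.rawcompanyinfo_delimited and is meant to store the result in a table. The desired output table and the cursor are shown below.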

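CREATE TABLE dbo.output(ID INT, CompanyName VARCHAR(10), CompanyID VARCHAR(3), TotalIncome VARCHAR(5), Currency VARCHAR(3), State CHAR(2));
-- Desired contents of dbo.output:
INSERT INTO dbo.output (ID, CompanyName, CompanyID, TotalIncome, Currency, State) VALUES (100,'ABCINC','111','333.5','USD','PA');
INSERT INTO dbo.output (ID, CompanyName, CompanyID, TotalIncome, Currency, State) VALUES (200,'APPLE','213','333.5','USD','PA');
INSERT INTO dbo.output (ID, CompanyName, CompanyID, TotalIncome, Currency, State) VALUES (300,'BTEC','100','123.5','USD','PA');
INSERT INTO dbo.output (ID, CompanyName, CompanyID, TotalIncome, Currency, State) VALUES (400,'S INC','123','333.0','USD','PA');
INSERT INTO dbo.output (ID, CompanyName, CompanyID, TotalIncome, Currency, State) VALUES (500,'B INC','145','123.2','USD','PA');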
DECLARE @ID INT,@StartPosition INT,@Len INT;
DECLARE @Data NVARCHAR(MAX), @ColumnName VARCHAR(100),@Val VARCHAR(MAX),@CompanyID CHAR(9),@State_Code  VARCHAR(2);
DECLARE @Currency VARCHAR(10);
DECLARE @FinCursor AS CURSOR;

DECLARE @ParsingCursor AS CURSOR;
-- Outer cursor: one iteration per raw fixed-width row
SET @FinCursor=CURSOR FAST_FORWARD FOR SELECT ID,Data 
FROM dbo.rawcompanyinfo_delimited WHERE State='PA';
OPEN @FinCursor;
FETCH NEXT FROM @FinCursor INTO @ID,@Data;
WHILE @@FETCH_STATUS = 0 
BEGIN

  -- Inner cursor: one iteration per mapped field (start position, length, name)
  SET @ParsingCursor = CURSOR FAST_FORWARD FOR SELECT Startingposition,Length,columnnames
  FROM map.CompaniesLenInfo;
  OPEN @ParsingCursor;
  FETCH NEXT FROM @ParsingCursor INTO @StartPosition, @Len, @ColumnName;
  WHILE @@FETCH_STATUS = 0 
      BEGIN
           SET @Val = SUBSTRING(@Data,@StartPosition, @Len);
           /* Not sure how to insert into dbo.output*/

          FETCH NEXT FROM @ParsingCursor INTO @StartPosition, @Len, @ColumnName;
      END

  CLOSE @ParsingCursor;
  DEALLOCATE @ParsingCursor;


FETCH NEXT FROM @FinCursor INTO @ID,@Data;

END
CLOSE @FinCursor;
DEALLOCATE @FinCursor;

How do I insert the values as a wide table? Can anyone suggest another approach that might be faster?

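3 answers:

Answer 0 (score: 1)

Use the command-line tool bcp, the bulk copy program. bcp will parse and load your data. You need to define a "format file" that specifies where each fixed-width column starts and stops. If you can get the data delimited by another character (for example ~), it is easier to define a column delimiter and a row terminator instead.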

Here is the documentation reference: https://docs.microsoft.com/en-us/sql/relational-databases/import-export/use-a-format-file-to-bulk-import-data-sql-server

Answer 1 (score: 1)

Here is an example using PIVOT:

CREATE TABLE dbo.rawcompanyinfo_delimited(ID smallint NOT NULL,Data VARCHAR(MAX)NULL,State CHAR(2));

INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(100,'ABCINC    111  333.5 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(200,'APPLE     213  333.5 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(300,'BTEC      100  123.5 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(400,'S INC     123  333.0 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(500,'B INC     145  123.2 USD','PA')

CREATE TABLE CompaniesLenInfo(Startingposition int not null, Length int not null, columnnames varchar(100) not null)
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(1,10,'CompanyName')
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(11,3,'CompanyID')
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(15,5,'TotalIncome')
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(21,3,'Currency')

SELECT *
FROM (
    SELECT r.ID,r.State,SUBSTRING(r.Data,ci.Startingposition,ci.Length) AS val,ci.columnnames 
    FROM rawcompanyinfo_delimited AS r,CompaniesLenInfo AS ci
) AS t PIVOT(MAX(val) FOR columnnames IN (CompanyName,CompanyID,TotalIncome,Currency) ) p
+-----+-------+-------------+-----------+-------------+----------+
| ID  | State | CompanyName | CompanyID | TotalIncome | Currency |
+-----+-------+-------------+-----------+-------------+----------+
| 100 | PA    | ABCINC      | 111       |  333.       |  US      |
| 200 | PA    | APPLE       | 213       |  333.       |  US      |
| 300 | PA    | BTEC        | 100       |  123.       |  US      |
| 400 | PA    | S INC       | 123       |  333.       |  US      |
| 500 | PA    | B INC       | 145       |  123.       |  US      |
+-----+-------+-------------+-----------+-------------+----------+
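
With 500 mapped columns, typing the IN (...) list by hand is impractical. One possible approach is to build that list from the mapping table and run the PIVOT through dynamic SQL. The following is only a sketch under assumptions: it assumes SQL Server 2017 or later (for STRING_AGG), it reuses the rawcompanyinfo_delimited and CompaniesLenInfo tables created in this answer, and the variable names @cols and @sql are made up for illustration. Separately, the leading blanks in TotalIncome and Currency in the result above appear to come from the mapping itself: in the sample rows the amount starts at character 16 and the currency at character 22, so the mapping rows (15,5) and (21,3) look off by one.

```sql
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build "[CompanyName],[CompanyID],..." from the mapping table.
-- The CAST to nvarchar(max) keeps STRING_AGG from hitting its
-- 8000-byte limit when there are hundreds of column names.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(columnnames) AS nvarchar(max)), N',')
               WITHIN GROUP (ORDER BY Startingposition)
FROM CompaniesLenInfo;

-- Same PIVOT as above, with the generated column list spliced in.
SET @sql = N'
SELECT *
FROM (
    SELECT r.ID, r.State,
           SUBSTRING(r.Data, ci.Startingposition, ci.Length) AS val,
           ci.columnnames
    FROM rawcompanyinfo_delimited AS r
    CROSS JOIN CompaniesLenInfo AS ci
) AS t
PIVOT (MAX(val) FOR columnnames IN (' + @cols + N')) AS p;';

EXEC sys.sp_executesql @sql;
```

QUOTENAME is used so that column names containing spaces or reserved words still produce a valid statement.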

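Answer 2 (score: 0)

There is no need for a cursor. To load the data from the raw table into the output table, you can use a single INSERT ... SELECT statement (admittedly a bit verbose if there are 500 columns):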

INSERT INTO dbo.output(ID,CompanyName,CompanyID,TotalIncome,Currency)
SELECT r.ID, 
      (SELECT Substring(r.Data, m.startingposition, m.length)
         FROM map.CompaniesLenInfo AS m
        WHERE m.columnnames = 'CompanyName'),
      ...
      (SELECT Substring(r.Data, m.startingposition, m.length)
         FROM map.CompaniesLenInfo AS m
        WHERE m.columnnames = 'Currency')
  FROM dbo.rawcompanyinfo_delimited AS r
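
Since each column's start position and length are fixed, the correlated subqueries could also be replaced by literal SUBSTRING calls generated from the mapping table, which avoids writing 500 expressions by hand. The following is a sketch under assumptions rather than a tested implementation: it assumes SQL Server 2017 or later (for STRING_AGG), that dbo.output has a column for every columnnames value in map.CompaniesLenInfo, and that trimming the padding blanks is wanted. The variable names @cols, @exprs and @sql are illustrative only.

```sql
DECLARE @cols nvarchar(max), @exprs nvarchar(max), @sql nvarchar(max);

-- Target column list, ordered by position in the fixed-width record.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(columnnames) AS nvarchar(max)), N', ')
               WITHIN GROUP (ORDER BY Startingposition)
FROM map.CompaniesLenInfo;

-- One SUBSTRING expression per mapped column, in the same order.
SELECT @exprs = STRING_AGG(
                    CAST(N'LTRIM(RTRIM(SUBSTRING(r.Data, '
                         + CAST(Startingposition AS nvarchar(10)) + N', '
                         + CAST(Length AS nvarchar(10)) + N')))' AS nvarchar(max)),
                    N', ')
                WITHIN GROUP (ORDER BY Startingposition)
FROM map.CompaniesLenInfo;

SET @sql = N'INSERT INTO dbo.output (ID, ' + @cols + N')
SELECT r.ID, ' + @exprs + N'
FROM dbo.rawcompanyinfo_delimited AS r;';

-- Inspect the generated statement before running it
-- (PRINT shows only the first 4000 characters of an nvarchar value).
PRINT @sql;
EXEC sys.sp_executesql @sql;
```

The generated statement is a single set-based INSERT ... SELECT, so the raw table is scanned once and the mapping table is read only while the statement is being built.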