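Read data from a CSV file and insert it into a database

Time: 2018-12-14 10:13:44

Tags: sql-server command-line cmd bcp sqlbulkcopy

I need to read data from a CSV file and load it into a database. I am using the bcp command-line utility for this. My CSV file looks like this: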

First_name,Last_name,EmpID,company,languages
"Jack","Thomas","57616","IBM","C
C++
JAVA
COBOL
PERL
SQL
 "
"Tim","Cook","10001","Apple","Python
C++
Java
XML
 "

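As you can see, the last column (languages) has each of its values on a new line. The bcp command is scripted to look for the row delimiter, so it ends the row as soon as it has read the first value of the last column. Could you suggest how to parse this with bcp?

2 answers:

Answer 0: (score: 3)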

I don't see much progress in your attempts to find a solution yourself or do research, which is expected on [SO].

Here is a possible PowerShell solution that imports the CSV,
converts the multi-line column into a semicolon-separated one, and exports the result as CSV.

Import-Csv .\old.csv| ForEach-Object {
    $_.Languages=$_.Languages -split "`r?`n" -ne ' ' -join ';'
    $_
} | Export-Csv .\New.csv -NoTypeInformation

This results in all columns being double-quoted:

> Get-Content .\new.csv
"First_name","Last_name","EmpID","company","languages"
"Jack","Thomas","57616","IBM","C;C++;JAVA;COBOL;PERL;SQL"
"Tim","Cook","10001","Apple","Python;C++;Java;XML"

Another PowerShell one-liner remedies this:

(Get-Content .\new.csv).trim('"') -replace '","',',' | Set-Content .\new.csv

First_name,Last_name,EmpID,company,languages
Jack,Thomas,57616,IBM,C;C++;JAVA;COBOL;PERL;SQL
Tim,Cook,10001,Apple,Python;C++;Java;XML

Edit: both steps can also be combined into a single .ps1 file.
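
A combined script might look like the following. This is a minimal sketch rather than the answerer's original file: the variable names `$in` and `$out`, and the server, database, and table names in the commented bcp line, are assumptions made for illustration. It pipes through `ConvertTo-Csv` so the quote stripping happens before anything is written, which avoids reading and rewriting the same file.

```powershell
# Assumed file names; adjust to your environment.
$in  = '.\old.csv'   # multi-line CSV from the question
$out = '.\new.csv'   # flattened, unquoted CSV for bcp

Import-Csv $in | ForEach-Object {
    # Join the multi-line languages value into one semicolon-separated string,
    # dropping the lone ' ' element left over before the closing quote.
    $_.languages = $_.languages -split "`r?`n" -ne ' ' -join ';'
    $_
} | ConvertTo-Csv -NoTypeInformation |
    # Strip the quotes that ConvertTo-Csv puts around every field.
    ForEach-Object { $_.Trim('"') -replace '","', ',' } |
    Set-Content $out

# Hypothetical bcp call for the flattened file (names are placeholders):
# bcp dbo.Employees in $out -S localhost -d TestDb -T -c -t ',' -F 2
```

Stripping the quotes this way is only safe while no field contains a comma or a double quote; otherwise keep the quoted output and handle the quoting on the import side.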

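Answer 1: (score: 1)

Here is a SQL solution: it walks through the imported file and parses the data into two tables. There are two loops: one for the "master" table and one for the "detail" table.

Setup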

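-- Drop the temp table if a previous run left it behind.
-- (SELECT * FROM #tempTable raises an error when the table does not exist.)
IF OBJECT_ID('tempdb..#tempTable') IS NOT NULL
  DROP TABLE #tempTable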

/*
Create Table emps
(
First_Name Varchar(25),
Last_Name VarChar(25),
EmpID VarChar(10),
Company VarChar(30)
)

Create Table langs
(
EmpID VarChar(10),
Lang VarChar(15)
)
*/

Delete From langs
Delete From emps

CREATE TABLE #tempTable
(
  RowVal VarChar(Max)
)

Query

BULK INSERT #tempTable
FROM 'c:\Downloads\EmpLangs.txt' 
WITH 
(
    FIRSTROW = 2,
    ROWTERMINATOR = '\n'
)

Declare @RowV VarChar(Max)  --Match the RowVal column so long rows are not truncated
--Use the following to get the location of each delimiter
Declare @f1q1 Int
Declare @f1q2 Int
Declare @f2q1 Int
Declare @f2q2 Int
Declare @f3q1 Int
Declare @f3q2 Int
Declare @f4q1 Int
Declare @f4q2 Int
Declare @f5q1 Int

Declare @empid VarChar(10)

Declare @vHeader Int = 1  --Is header row?

Declare vCursor CURSOR For Select RowVal  From #tempTable

  Open vCursor;
  Fetch Next From vCursor Into @RowV

  While @@FETCH_STATUS = 0  --Walk through rows to parse
  Begin

   If @vHeader = 1
      Begin     
        Set @f1q1 = CHARINDEX('"',@RowV,1)
        Set @f1q2 = CHARINDEX('"',@RowV,@f1q1+1)

        Set @f2q1 = CHARINDEX('"',@RowV,@f1q2+1)
        Set @f2q2 = CHARINDEX('"',@RowV,@f2q1+1)

        Set @f3q1 = CHARINDEX('"',@RowV,@f2q2+1)
        Set @f3q2 = CHARINDEX('"',@RowV,@f3q1+1)

        Set @f4q1 = CHARINDEX('"',@RowV,@f3q2+1)
        Set @f4q2 = CHARINDEX('"',@RowV,@f4q1+1)

        Set @f5q1 = CHARINDEX('"',@RowV,@f4q2+1)

        Insert Into emps Values
        (SUBSTRING(@RowV,@f1q1+1,@f1q2-@f1q1-1),
         SUBSTRING(@RowV,@f2q1+1,@f2q2-@f2q1-1),
         SUBSTRING(@RowV,@f3q1+1,@f3q2-@f3q1-1),
         SUBSTRING(@RowV,@f4q1+1,@f4q2-@f4q1-1) 
        )

        Set @vHeader = 0
        Set @empid = SUBSTRING(@RowV,@f3q1+1,@f3q2-@f3q1-1)
        Insert Into langs Values (@empid,SUBSTRING(@RowV,@f5q1+1,Len(@RowV)- @f5q1 + 1))  -- ADDED to get the trailing language from the header row
      End

     Fetch Next From vCursor Into @RowV
       While @@FETCH_STATUS = 0  And @vHeader = 0 And @RowV <> ' "'
         Begin
            Insert Into langs Values (@empid,@RowV)
            Fetch Next From vCursor Into @RowV
            If @RowV = ' "' 
             Begin
                If @@FETCH_STATUS = 0 
                  Begin
                     Fetch Next From vCursor Into @RowV
                     Set @vHeader = 1
                  End
             End
         End
  End;

  Close vCursor
  Deallocate vCursor

Select e.*,l.lang From emps e
INNER JOIN
langs l ON e.EmpID = l.EmpID

Results

First_Name  Last_Name   EmpID   Company Lang
Jack        Thomas      57616   IBM     C
Jack        Thomas      57616   IBM     C++
Jack        Thomas      57616   IBM     JAVA
Jack        Thomas      57616   IBM     COBOL
Jack        Thomas      57616   IBM     PERL
Jack        Thomas      57616   IBM     SQL
Tim         Cook        10001   Apple   Python
Tim         Cook        10001   Apple   C++
Tim         Cook        10001   Apple   Java
Tim         Cook        10001   Apple   XML
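
For comparison, the same master/detail split can be prepared before the data reaches the database. The sketch below is an assumption-laden illustration, not part of this answer's SQL: it reuses `Import-Csv` from the other answer, and the output file names `emps.csv` and `langs.csv` are hypothetical.

```powershell
# old.csv is the multi-line CSV from the question (assumed file name).
$rows = Import-Csv .\old.csv

# "Master" rows: one per employee, without the languages column.
$rows | Select-Object First_name, Last_name, EmpID, company |
    Export-Csv .\emps.csv -NoTypeInformation

# "Detail" rows: one per (EmpID, language) pair.
$rows | ForEach-Object {
    $emp = $_
    $emp.languages -split "`r?`n" |
        Where-Object { $_.Trim() } |      # skip the blank trailing entry
        ForEach-Object {
            [pscustomobject]@{ EmpID = $emp.EmpID; Lang = $_.Trim() }
        }
} | Export-Csv .\langs.csv -NoTypeInformation
```

Each resulting file then has one record per line, so a line-oriented loader no longer needs the cursor-based parsing.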