I am trying to insert some data into a table from a CSV document in which every field is enclosed in double quotes (""), i.e.:
APPLICANTID,NAME,CONTACT,PHONENO,MOBILENO,FAXNO,EMAIL,ADDR1,ADDR2,ADDR3,STATE,POSTCODE
"3","Snoop Dogg","Snoop Dogg","411","","","","411 High Street","USA ","","USA", "1111"
"4","LL Cool J","LL Cool J","","","","","5 King Street","","","USA","1111"
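I am using an XML format file to try to get around the "" delimiters, because I believe that otherwise I would have to update the data again after importing to remove the leading ".
My format file looks like this: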
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="NCharTerm" TERMINATOR='",' MAX_LENGTH="12"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="10" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="Latin1_General_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="APPLICANTID" xsi:type="SQLINT"/>
<COLUMN SOURCE="2" NAME="NAME" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="CONTACT" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="PHONENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="6" NAME="FAXNO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="7" NAME="EMAIL" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="8" NAME="ADDR1" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="9" NAME="ADDR2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="10" NAME="ADDR3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="11" NAME="STATE" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="12" NAME="POSTCODE" xsi:type="SQLCHAR"/>
</ROW>
</BCPFORMAT>
I am running the import with the following:
BULK INSERT [PracticalDB].dbo.applicant
FROM 'C:\temp.csv'
WITH (KEEPIDENTITY, FORMATFILE='C:\temp.xml', FIRSTROW = 2)
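I am receiving the error:
Msg 4864, Level 16, State 1, Line 1 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (APPLICANTID).
The same error is reported for all the rows.
I have tried various combinations of terminators, including: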
TERMINATOR="","
TERMINATOR="\","
TERMINATOR='","
TERMINATOR='\","
and none of them seem to work.
Is there a correct way to escape the " so that it is parsed correctly, assuming that this is my problem?
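Answer 0 (score: 19)
OK, I figured it out!
When you define an XML attribute (i.e. TERMINATOR=''), you can delimit the value with ' instead of ", and then you can use " inside it without worrying.
I also needed to consume the first " with a dummy field so that the other columns parse correctly. This results in the following format file: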
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR='"' />
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="10" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="13" xsi:type="CharTerm" TERMINATOR='"\r\n' />
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="APPLICANTID" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="NAME" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="CONTACT" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="PHONENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="6" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="7" NAME="FAXNO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="8" NAME="EMAIL" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="9" NAME="ADDR1" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="10" NAME="ADDR2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="11" NAME="ADDR3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="12" NAME="STATE" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="13" NAME="POSTCODE" xsi:type="SQLCHAR"/>
</ROW>
</BCPFORMAT>
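The first field is just a throwaway that consumes the leading "; because no COLUMN maps SOURCE="1", its content is never loaded. The other fields are all separated by ",", and the last one is terminated by " followed by a line break (\r\n).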
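One way to sanity-check the result after running the import with this format file is to look for quotes that survived in the first and last loaded columns, since those are the two places where the terminators differ from the rest. This is only a sketch; it assumes the table and column names from the question, and the CAST is there in case APPLICANTID is a numeric column.

```sql
-- Rows in which a double quote survived the import (ideally none).
SELECT APPLICANTID, NAME, POSTCODE
FROM [PracticalDB].dbo.applicant
WHERE CAST(APPLICANTID AS nvarchar(50)) LIKE '%"%'
   OR NAME LIKE '%"%'
   OR POSTCODE LIKE '%"%';

-- Number of loaded rows, to compare by hand with the number of data lines in C:\temp.csv.
SELECT COUNT(*) AS loaded_rows
FROM [PracticalDB].dbo.applicant;
```

If the row count is lower than expected, the header line is worth a look: it is not quoted like the data rows, so it may not split on the same terminators.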
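Answer 1 (score: 2)
Tip: if only some of the fields are double-quoted, use the OPENROWSET version of bulk insert. That way, you can manipulate the field contents coming from the input file before they are inserted into the target table.
In that manipulation you can do anything you like to the field contents, e.g. remove the double quotes. The performance impact is not covered here; I have no measurements for it.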
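A rough sketch of that approach, using the table and columns from the question. The format file C:\temp_plain.xml is hypothetical: it is assumed to split each record on plain commas (with \r\n after the last field) and to name its columns as in the question, so every value arrives with its quotes still attached. Splitting on plain commas also assumes that no value contains a comma. The KEEPIDENTITY hint mirrors the option used in the question and can be dropped if APPLICANTID is not an identity column.

```sql
-- Sketch only: C:\temp_plain.xml is an assumed comma-delimited format file,
-- not one shown in this thread.
INSERT INTO [PracticalDB].dbo.applicant WITH (KEEPIDENTITY)
    (APPLICANTID, NAME, CONTACT, PHONENO, MOBILENO, FAXNO,
     EMAIL, ADDR1, ADDR2, ADDR3, STATE, POSTCODE)
SELECT
    -- REPLACE removes every double quote, including any inside a value.
    REPLACE(src.APPLICANTID, '"', ''),
    REPLACE(src.NAME,        '"', ''),
    REPLACE(src.CONTACT,     '"', ''),
    REPLACE(src.PHONENO,     '"', ''),
    REPLACE(src.MOBILENO,    '"', ''),
    REPLACE(src.FAXNO,       '"', ''),
    REPLACE(src.EMAIL,       '"', ''),
    REPLACE(src.ADDR1,       '"', ''),
    REPLACE(src.ADDR2,       '"', ''),
    REPLACE(src.ADDR3,       '"', ''),
    REPLACE(src.STATE,       '"', ''),
    REPLACE(src.POSTCODE,    '"', '')
FROM OPENROWSET(
        BULK 'C:\temp.csv',
        FORMATFILE = 'C:\temp_plain.xml',
        FIRSTROW = 2          -- skip the header line
     ) AS src;
```

Because the values pass through an ordinary SELECT, other clean-up such as LTRIM/RTRIM can be applied in the same place.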
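Answer 2 (score: 1)
Tip: if your CSV file has an inconsistent format, e.g. some values in the same column are double-quoted and some are not, this blog post will help you handle it easily (it continues Estevez's tip; using OPENROWSET is just the last step): http://ariely.info/Blog/tabid/83/EntryId/122/Using-Bulk-Insert-to-import-inconsistent-data-format-using-pure-T-SQL.aspx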