这是我写过的最大的帮助请求。我目前的流程是尝试缩小范围,直到获得可能产生错误的最小代码。
我已经从https://www.ffiec.gov/npw/FinancialReport/DataDownload下载了attribute_active csv表。
我正在尝试将其加载到SQL Server中的表中。它不会直接导入(最终需要自动完成,因此无论如何我都需要弄清楚python)。但它不会直接导入。
有趣的注释:我可以将此文件直接导入到MS Access数据库中。它产生一个错误表。如果我告诉我双引号字符用于字符串,那么它将正确导入。
所以我用它来生成错误表,例如,它告诉我第53行是一个问题(如果包含标题,则实际上是54行)
访问导入错误的前几行如下:
Error Field Row
Type Conversion Failure ORG_TYPE_CD 53
Type Conversion Failure STATE_CD 53
Type Conversion Failure ID_FDIC_CERT 53
Type Conversion Failure SLHC_TYPE_IND 53
Type Conversion Failure CNTRY_INC_CD 53
Type Conversion Failure ORG_TYPE_CD 56
Type Conversion Failure STATE_CD 56
Type Conversion Failure ID_FDIC_CERT 56
Type Conversion Failure SLHC_TYPE_IND 56
Type Conversion Failure CNTRY_INC_CD 56
Type Conversion Failure ORG_TYPE_CD 523
Type Conversion Failure STATE_CD 523
Type Conversion Failure ID_FDIC_CERT 523
Type Conversion Failure SLHC_TYPE_IND 523
Type Conversion Failure CNTRY_INC_CD 523
Type Conversion Failure ID_FDIC_CERT 610
Type Conversion Failure SLHC_TYPE_IND 610
Type Conversion Failure CNTRY_INC_CD 610
Type Conversion Failure ORG_TYPE_CD 714
Type Conversion Failure STATE_CD 714
Type Conversion Failure ID_FDIC_CERT 714
Type Conversion Failure SLHC_TYPE_IND 714
Type Conversion Failure CNTRY_INC_CD 714
Type Conversion Failure ORG_TYPE_CD 759
Type Conversion Failure STATE_CD 759
Type Conversion Failure ID_FDIC_CERT 759
Type Conversion Failure SLHC_TYPE_IND 759
Type Conversion Failure CNTRY_INC_CD 759
Type Conversion Failure ORG_TYPE_CD 796
Type Conversion Failure STATE_CD 796
Type Conversion Failure ID_FDIC_CERT 796
Type Conversion Failure SLHC_TYPE_IND 796
Type Conversion Failure CNTRY_INC_CD 796
问题是我查看这些行和这些字段的原始文本,将它们与相邻行中的相同字段区分开来并不奇怪。但是,我确实认为,当我明确告知访问权限对字符串使用双引号时,不会出现导入错误这一事实很重要-但我对熊猫也做同样的事情!
我正在使用以下代码尝试将其读入SQL Server:
def ingest_npw_attributes_active():
filename = 'CSV_ATTRIBUTES_ACTIVE.CSV'
full_filename = os.path.join(raw_data_dir, filename)
print(full_filename)
dt = {'AUTH_REG_DIST_FRS': int,
'STREET_LINE2': "string",
'ID_THRIFT': "string",
'ID_TAX': int
}
init_connection()
df = pd.read_csv(full_filename, dtype=dt, header=0, quotechar='"')
place_holder = "?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,"
place_holder += "?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?"
sql_str = "exec dbo.save_attributes_active " + place_holder
# build assignment string for sql
values = "("
for col_name in df.columns:
values += "row['" + col_name + "'], "
values = values[:-1] + ")"
for index, row in df.iterrows():
if index < 2:
print(row)
values = (row['#ID_RSSD'], row['D_DT_START'], row['D_DT_END'], row['BHC_IND'], row['BROAD_REG_CD'],
row['CHTR_AUTH_CD'], row['CHTR_TYPE_CD'], row['FBO_4C9_IND'], row['FHC_IND'], row['FUNC_REG'],
row['INSUR_PRI_CD'], row['MBR_FHLBS_IND'], row['MBR_FRS_IND'], row['SEC_RPTG_STATUS'],
row['EST_TYPE_CD'], row['BANK_CNT'], row['BNK_TYPE_ANALYS_CD'], row['D_DT_EXIST_CMNC'],
row['D_DT_EXIST_TERM'], row['FISC_YREND_MMDD'], row['D_DT_INSUR'], row['D_DT_OPEN'],
row['FNCL_SUB_HOLDER'], row['FNCL_SUB_IND'], row['IBA_GRNDFTHR_IND'], row['IBF_IND'],
row['ID_RSSD_HD_OFF'], row['MJR_OWN_MNRTY'], row['NM_LGL'], row['NM_SHORT'], row['NM_SRCH_CD'],
row['ORG_TYPE_CD'], row['REASON_TERM_CD'], row['CNSRVTR_CD'], row['ENTITY_TYPE'],
row['AUTH_REG_DIST_FRS'], row['ACT_PRIM_CD'], row['CITY'], row['CNTRY_NM'], row['ID_CUSIP'],
row['STATE_ABBR_NM'], row['PLACE_CD'], row['STATE_CD'], row['STATE_HOME_CD'], row['STREET_LINE1'],
row['STREET_LINE2'], row['ZIP_CD'], row['ID_THRIFT'], row['ID_THRIFT_HC'], row['DOMESTIC_IND'],
row['ID_ABA_PRIM'], row['ID_FDIC_CERT'], row['ID_NCUA'], row['COUNTY_CD'], row['DIST_FRS'],
row['ID_OCC'], row['CNTRY_CD'], row['DT_END'], row['DT_EXIST_CMNC'], row['DT_EXIST_TERM'],
row['DT_INSUR'], row['DT_OPEN'], row['DT_START'], row['ID_TAX'], row['PROV_REGION'], row['URL'],
row['SLHC_IND'], row['SLHC_TYPE_IND'], row['PRIM_FED_REG'], row['STATE_INC_CD'], row['CNTRY_INC_CD'],
row['STATE_INC_ABBR_NM'], row['CNTRY_INC_NM'], row['ID_LEI'], row['IHC_IND'])
# print(values)
return_key = cursor.execute(sql_str, values).fetchval()
print('return_key =', return_key)
conn.close()
return
我的SQL Server表定义如下:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[npwAttributesActive]
(
[id] [bigint] IDENTITY(1,1) NOT NULL,
[ID_RSSD] [int] NOT NULL,
[D_DT_START] [nvarchar](50) NULL,
[D_DT_END] [nvarchar](50) NULL,
[BHC_IND] [int] NULL,
[BROAD_REG_CD] [int] NULL,
[CHTR_AUTH_CD] [int] NULL,
[CHTR_TYPE_CD] [int] NULL,
[FBO_4C9_IND] [int] NULL,
[FHC_IND] [int] NULL,
[FUNC_REG] [int] NULL,
[INSUR_PRI_CD] [int] NULL,
[MBR_FHLBS_IND] [int] NULL,
[MBR_FRS_IND] [int] NULL,
[SEC_RPTG_STATUS] [int] NULL,
[EST_TYPE_CD] [int] NULL,
[BANK_CNT] [nvarchar](1) NULL,
[BNK_TYPE_ANALYS_CD] [int] NULL,
[D_DT_EXIST_CMNC] [nvarchar](50) NULL,
[D_DT_EXIST_TERM] [nvarchar](50) NULL,
[FISC_YREND_MMDD] [int] NULL,
[D_DT_INSUR] [nvarchar](50) NULL,
[D_DT_OPEN] [nvarchar](50) NULL,
[FNCL_SUB_HOLDER] [int] NULL,
[FNCL_SUB_IND] [int] NULL,
[IBA_GRNDFTHR_IND] [int] NULL,
[IBF_IND] [int] NULL,
[ID_RSSD_HD_OFF] [int] NULL,
[MJR_OWN_MNRTY] [int] NULL,
[NM_LGL] [nvarchar](150) NULL,
[NM_SHORT] [nvarchar](50) NULL,
[NM_SRCH_CD] [int] NULL,
[ORG_TYPE_CD] [int] NULL,
[REASON_TERM_CD] [int] NULL,
[CNSRVTR_CD] [int] NULL,
[ENTITY_TYPE] [nvarchar](50) NULL,
[AUTH_REG_DIST_FRS] [int] NULL,
[ACT_PRIM_CD] [nvarchar](50) NULL,
[CITY] [nvarchar](50) NULL,
[CNTRY_NM] [nvarchar](50) NULL,
[ID_CUSIP] [nvarchar](50) NULL,
[STATE_ABBR_NM] [nvarchar](50) NULL,
[PLACE_CD] [int] NULL,
[STATE_CD] [int] NULL,
[STATE_HOME_CD] [int] NULL,
[STREET_LINE1] [nvarchar](50) NULL,
[STREET_LINE2] [nvarchar](50) NULL,
[ZIP_CD] [nvarchar](50) NULL,
[ID_THRIFT] [int] NULL,
[ID_THRIFT_HC] [nvarchar](50) NULL,
[DOMESTIC_IND] [nvarchar](50) NULL,
[ID_ABA_PRIM] [int] NULL,
[ID_FDIC_CERT] [int] NULL,
[ID_NCUA] [int] NULL,
[COUNTY_CD] [int] NULL,
[DIST_FRS] [int] NULL,
[ID_OCC] [int] NULL,
[CNTRY_CD] [int] NULL,
[DT_END] [int] NULL,
[DT_EXIST_CMNC] [int] NULL,
[DT_EXIST_TERM] [int] NULL,
[DT_INSUR] [int] NULL,
[DT_OPEN] [int] NULL,
[DT_START] [int] NULL,
[ID_TAX] [int] NULL,
[PROV_REGION] [nvarchar](50) NULL,
[URL] [nvarchar](50) NULL,
[SLHC_IND] [int] NULL,
[SLHC_TYPE_IND] [int] NULL,
[PRIM_FED_REG] [nvarchar](50) NULL,
[STATE_INC_CD] [int] NULL,
[CNTRY_INC_CD] [int] NULL,
[STATE_INC_ABBR_NM] [nvarchar](50) NULL,
[CNTRY_INC_NM] [nvarchar](50) NULL,
[ID_LEI] [nvarchar](50) NULL,
[IHC_IND] [int] NULL
) ON [PRIMARY]
存储过程是
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[save_attributes_active]
@ID_RSSD int,
@D_DT_START nvarchar(50),
@D_DT_END nvarchar(50),
@BHC_IND int,
@BROAD_REG_CD int,
@CHTR_AUTH_CD int,
@CHTR_TYPE_CD int,
@FBO_4C9_IND int,
@FHC_IND int,
@FUNC_REG int,
@INSUR_PRI_CD int,
@MBR_FHLBS_IND int,
@MBR_FRS_IND int,
@SEC_RPTG_STATUS int,
@EST_TYPE_CD int,
@BANK_CNT nvarchar(1),
@BNK_TYPE_ANALYS_CD int,
@D_DT_EXIST_CMNC nvarchar(50),
@D_DT_EXIST_TERM nvarchar(50),
@FISC_YREND_MMDD int,
@D_DT_INSUR nvarchar(50),
@D_DT_OPEN nvarchar(50),
@FNCL_SUB_HOLDER int,
@FNCL_SUB_IND int,
@IBA_GRNDFTHR_IND int,
@IBF_IND int,
@ID_RSSD_HD_OFF int,
@MJR_OWN_MNRTY int,
@NM_LGL nvarchar(150),
@NM_SHORT nvarchar(50),
@NM_SRCH_CD int,
@ORG_TYPE_CD int,
@REASON_TERM_CD int,
@CNSRVTR_CD int,
@ENTITY_TYPE nvarchar(50),
@AUTH_REG_DIST_FRS int,
@ACT_PRIM_CD nvarchar(50),
@CITY nvarchar(50),
@CNTRY_NM nvarchar(50),
@ID_CUSIP nvarchar(50),
@STATE_ABBR_NM nvarchar(50),
@PLACE_CD int,
@STATE_CD int,
@STATE_HOME_CD int,
@STREET_LINE1 nvarchar(50),
@STREET_LINE2 nvarchar(50),
@ZIP_CD nvarchar(50),
@ID_THRIFT int,
@ID_THRIFT_HC nvarchar(50),
@DOMESTIC_IND nvarchar(50),
@ID_ABA_PRIM int,
@ID_FDIC_CERT int,
@ID_NCUA int,
@COUNTY_CD int,
@DIST_FRS int,
@ID_OCC int,
@CNTRY_CD int,
@DT_END int,
@DT_EXIST_CMNC int,
@DT_EXIST_TERM int,
@DT_INSUR int,
@DT_OPEN int,
@DT_START int,
@ID_TAX int,
@PROV_REGION nvarchar(50),
@URL nvarchar(50),
@SLHC_IND int,
@SLHC_TYPE_IND int,
@PRIM_FED_REG nvarchar(50),
@STATE_INC_CD int,
@CNTRY_INC_CD int,
@STATE_INC_ABBR_NM nvarchar(50),
@CNTRY_INC_NM nvarchar(50),
@ID_LEI nvarchar(50),
@IHC_IND int
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.npwAttributesActive ([ID_RSSD], [D_DT_START], [D_DT_END], [BHC_IND], [BROAD_REG_CD], [CHTR_AUTH_CD], [CHTR_TYPE_CD], [FBO_4C9_IND], [FHC_IND], [FUNC_REG], [INSUR_PRI_CD],
[MBR_FHLBS_IND], [MBR_FRS_IND], [SEC_RPTG_STATUS], [EST_TYPE_CD], [BANK_CNT], [BNK_TYPE_ANALYS_CD], [D_DT_EXIST_CMNC], [D_DT_EXIST_TERM], [FISC_YREND_MMDD], [D_DT_INSUR], [D_DT_OPEN],
[FNCL_SUB_HOLDER], [FNCL_SUB_IND], [IBA_GRNDFTHR_IND], [IBF_IND], [ID_RSSD_HD_OFF], [MJR_OWN_MNRTY], [NM_LGL], [NM_SHORT], [NM_SRCH_CD], [ORG_TYPE_CD], [REASON_TERM_CD], [CNSRVTR_CD],
[ENTITY_TYPE], [AUTH_REG_DIST_FRS], [ACT_PRIM_CD], [CITY], [CNTRY_NM], [ID_CUSIP], [STATE_ABBR_NM], [PLACE_CD], [STATE_CD], [STATE_HOME_CD], [STREET_LINE1], [STREET_LINE2], [ZIP_CD],
[ID_THRIFT], [ID_THRIFT_HC], [DOMESTIC_IND], [ID_ABA_PRIM], [ID_FDIC_CERT], [ID_NCUA], [COUNTY_CD], [DIST_FRS], [ID_OCC], [CNTRY_CD], [DT_END], [DT_EXIST_CMNC], [DT_EXIST_TERM], [DT_INSUR],
[DT_OPEN], [DT_START], [ID_TAX], [PROV_REGION], [URL], [SLHC_IND], [SLHC_TYPE_IND], [PRIM_FED_REG], [STATE_INC_CD], [CNTRY_INC_CD],
[STATE_INC_ABBR_NM], [CNTRY_INC_NM], [ID_LEI], [IHC_IND] )
VALUES (@ID_RSSD, @D_DT_START, @D_DT_END, @BHC_IND, @BROAD_REG_CD, @CHTR_AUTH_CD, @CHTR_TYPE_CD, @FBO_4C9_IND, @FHC_IND, @FUNC_REG, @INSUR_PRI_CD,
@MBR_FHLBS_IND, @MBR_FRS_IND, @SEC_RPTG_STATUS, @EST_TYPE_CD, @BANK_CNT, @BNK_TYPE_ANALYS_CD, @D_DT_EXIST_CMNC, @D_DT_EXIST_TERM, @FISC_YREND_MMDD,
@D_DT_INSUR, @D_DT_OPEN, @FNCL_SUB_HOLDER, @FNCL_SUB_IND, @IBA_GRNDFTHR_IND, @IBF_IND, @ID_RSSD_HD_OFF, @MJR_OWN_MNRTY, @NM_LGL, @NM_SHORT, @NM_SRCH_CD,
@ORG_TYPE_CD, @REASON_TERM_CD, @CNSRVTR_CD, @ENTITY_TYPE, @AUTH_REG_DIST_FRS, @ACT_PRIM_CD, @CITY, @CNTRY_NM, @ID_CUSIP, @STATE_ABBR_NM, @PLACE_CD, @STATE_CD,
@STATE_HOME_CD, @STREET_LINE1, @STREET_LINE2, @ZIP_CD, @ID_THRIFT, @ID_THRIFT_HC, @DOMESTIC_IND, @ID_ABA_PRIM, @ID_FDIC_CERT, @ID_NCUA, @COUNTY_CD, @DIST_FRS,
@ID_OCC, @CNTRY_CD, @DT_END, @DT_EXIST_CMNC, @DT_EXIST_TERM, @DT_INSUR, @DT_OPEN, @DT_START, @ID_TAX, @PROV_REGION, @URL, @SLHC_IND, @SLHC_TYPE_IND,
@PRIM_FED_REG, @STATE_INC_CD, @CNTRY_INC_CD, @STATE_INC_ABBR_NM, @CNTRY_INC_NM, @ID_LEI, @IHC_IND
)
SELECT 1
END
但是,出现以下错误:
Traceback (most recent call last):
File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 142, in <module>
main()
File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 136, in main
ingest_npw_data()
File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 130, in ingest_npw_data
ingest_npw_attributes_active()
File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 122, in ingest_npw_attributes_active
return_key = cursor.execute(sql_str, values).fetchval()
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 19 (""): The supplied value is not a valid instance of data type float. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision. (8023) (SQLExecDirectW)')
附录 CSV的前几行
#ID_RSSD,D_DT_START,D_DT_END,BHC_IND,BROAD_REG_CD,CHTR_AUTH_CD,CHTR_TYPE_CD,FBO_4C9_IND,FHC_IND,FUNC_REG,INSUR_PRI_CD,MBR_FHLBS_IND,MBR_FRS_IND,SEC_RPTG_STATUS,EST_TYPE_CD,BANK_CNT,BNK_TYPE_ANALYS_CD,D_DT_EXIST_CMNC,D_DT_EXIST_TERM,FISC_YREND_MMDD,D_DT_INSUR,D_DT_OPEN,FNCL_SUB_HOLDER,FNCL_SUB_IND,IBA_GRNDFTHR_IND,IBF_IND,ID_RSSD_HD_OFF,MJR_OWN_MNRTY,NM_LGL,NM_SHORT,NM_SRCH_CD,ORG_TYPE_CD,REASON_TERM_CD,CNSRVTR_CD,ENTITY_TYPE,AUTH_REG_DIST_FRS,ACT_PRIM_CD,CITY,CNTRY_NM,ID_CUSIP,STATE_ABBR_NM,PLACE_CD,STATE_CD,STATE_HOME_CD,STREET_LINE1,STREET_LINE2,ZIP_CD,ID_THRIFT,ID_THRIFT_HC,DOMESTIC_IND,ID_ABA_PRIM,ID_FDIC_CERT,ID_NCUA,COUNTY_CD,DIST_FRS,ID_OCC,CNTRY_CD,DT_END,DT_EXIST_CMNC,DT_EXIST_TERM,DT_INSUR,DT_OPEN,DT_START,ID_TAX,PROV_REGION,URL,SLHC_IND,SLHC_TYPE_IND,PRIM_FED_REG,STATE_INC_CD,CNTRY_INC_CD,STATE_INC_ABBR_NM,CNTRY_INC_NM,ID_LEI,IHC_IND
37,"04/15/2009 00:00:00","12/31/9999 00:00:00",0,1,2,200,0,0,0,7,1,0,0,1,,0,,"12/31/9999 00:00:00",0,"01/01/1934 00:00:00","09/01/1904 00:00:00",0,0,0,0,0,0,"BANK OF HANCOCK COUNTY ","BANK OF HANCOCK CTY ",1072861144,1,0,0,"NMB",6,"52211 ","SPARTA","UNITED STATES ","0","GA",72584,13,0,"12855 BROAD STREET","0","31087 ",16553,"0","Y",61107146,10057,0,141,6,0,1007,99991231,0,99991231,19340101,19040901,20090415,0,"0","0",0,0,"FDIC",0,0,"0 ","","0",0
73,"12/31/2008 00:00:00","12/31/9999 00:00:00",0,2,1,330,0,0,0,3,0,0,0,1,,0,,"12/31/9999 00:00:00",0,"01/04/1971 00:00:00","01/01/1936 00:00:00",0,0,0,0,0,0,"UTILITY EMPLOYEES FEDERAL CREDIT UNION ","UTILITY EMPL FCU ",788018087,6,0,0,"FCU",12,"52213 ","HOQUIAM","UNITED STATES ","0","WA",0,53,0,"220 MYRTLE STREET","0","98550 ",0,"0","Y",325179988,0,1851,27,12,0,1007,99991231,0,99991231,19710104,19360101,20081231,910591861,"0","0",0,0,"NCUA",0,0,"0 ","","0",0
242,"01/01/2012 00:00:00","12/31/9999 00:00:00",0,1,2,200,0,0,0,7,1,1,0,1,,0,,"12/31/9999 00:00:00",0,"01/01/1934 00:00:00","01/01/1922 00:00:00",0,0,0,0,0,0,"FIRST COMMUNITY BANK XENIA-FLORA ","FIRST CMNTY BK XENIA FLORA ",574907456,1,0,0,"SMB",8,"52211 ","XENIA","UNITED STATES ","0","IL",83739,17,0,"260 FRONT STREET","0","62899 ",0,"0","Y",81220537,3850,0,25,8,0,1007,99991231,0,99991231,19340101,19220101,20120101,370274860,"0","WWW.FCBXENIAFLORA.COM/INDEX.HTML",0,0,"FRS",17,0,"IL","","0",0
279,"01/01/2012 00:00:00","12/31/9999 00:00:00",0,1,2,300,0,0,0,7,1,0,0,1,,0,,"12/31/9999 00:00:00",0,"01/01/1997 00:00:00","01/01/1934 00:00:00",0,0,0,0,0,0,"MINEOLA COMMUNITY BANK, SSB ","MINEOLA CMNTY BK SSB ",98889854,6,0,0,"SSB",11,"52211 ","MINEOLA","UNITED STATES ","0","TX",48648,48,0,"215 W BROAD","0","75773 ",2523,"0","Y",311972526,28868,0,499,11,0,1007,99991231,0,99991231,19970101,19340101,20120101,750440734,"0","0",0,0,"FDIC",48,0,"TX","","0",0
354,"12/04/2019 00:00:00","12/31/9999 00:00:00",0,1,2,200,0,0,0,7,0,0,0,1,,0,"01/01/1901 00:00:00","12/31/9999 00:00:00",0,"03/21/1934 00:00:00","01/01/1901 00:00:00",0,0,0,0,0,0,"BISON STATE BANK ","BISON ST BK ",715161421,1,0,0,"NMB",10,"52211 ","BISON","UNITED STATES ","0","KS",6950,20,0,"223 MAIN STREET","0","67520 ",0,"0","Y",101107475,14083,0,165,10,0,1007,99991231,19010101,99991231,19340321,19010101,20191204,0,"0","0",0,0,"FDIC",0,0,"0 ","","0",0
答案 0 :(得分:1)
实际上没有看到完整的CSV数据集,您的问题实际上可能归结为在CSV单元中显示的空白。
CSV-> MS访问类型转换
没有指定的表,MS Access会根据读取的前几行来导入数据。在指定双引号之前,Access会将""
读取为零空字符串,如果Access最初将相应字段定义为int
,double
或date/time
,则会出现问题。
熊猫-> SQL Server类型转换
同样,如果未指定dtype
参数,则Pandas会根据前几行读取数据。将Pandas数据框迁移到SQL Server表时,可能出现空字符串问题。根据您的错误,您可能有CSV单元格带有""
,并尝试迁移到非字符串列:
参数19(“”):提供的值不是数据类型float的有效实例。
默认情况下,read_csv
将整数列中带有""
的值转换为NaN
,从而将整列呈现为float
,不再int
,这可以影响迁移。可能没有适合您的特定解决方案,但考虑了可能需要结合使用的一系列解决方案:
使用na_filter
中的read_csv
选项,该选项不会用NaN
替换空格,但会将带有空字符串的数字列转换为object
,这可能会影响SQL Server导入。
df = pd.read_csv(full_filename, dtype=dt, na_filter=False)
注意quote='"'
已经是默认设置,如果不替换header=0
,names
可能是多余的。
按常规读取,然后将所有列转换为对象,并用None
替换空格,这可以正确转换为SQL中的NULL
。 pyodbc
可能会尝试转换为最终类型。
df = pd.read_csv(full_filename, dtype=dt)
df = df.astype(object).where(pd.notnull(df), None)
或者,转储到所有varchar
的临时表中,并运行UPDATE
或MERGE
,并将其类型转换为最终表。这都可以在存储过程中处理。
分别调整每个数字列以预测空白。也许将占位符值9999
放在空白整数中,以便在迁移后清除。
df['D_DT_EXIST_TERM'] = df['D_DT_EXIST_TERM'].replace('', None)
df['FISC_YREND_MMDD'] = df['FISC_YREND_MMDD'].fillna(9999).astype('int')
最后,也许您可以完全避免使用最适合数据分析的笨重库pandas
,并使用csv
使用内置csv.reader
模块进行迁移。这样可以更好地处理投射为None
的空格(而不是Numpy的np.nan
float实体)。许多在线博客,教程和问题都可以帮助使用此方法。为了更好地控制CSV列与SQL列的对齐方式,请查看csv.DictReader
。