I am using a Mac with Python 2.7 and pyodbc to query data from Microsoft SQL Server. There is a timestamp column, which shows up as datetime64[ns] in my data frame.
Program structure:
SQLCommand = (" SELECT Col1, Col2, Col3 from xyztable ")
DF = pd.read_sql(SQLCommand,cnxn)
# extracting Day and month by converting to dt
DF['TS'] = DF['TS'].dt.strftime('%d%m')
# Create labels from Categories (string type data column in SQL table), replacing each category
DF['Flag']= DF['CODE']
DF.dtypes
TS datetime64[ns]
TIWOR object
CODES object
T-enc int8
TS object
TS_HHMM object
TS_DD int64
TS_DDMM int64
Flag object
dtype: object
# I am able to replace all categories, but it fails at this step because \u2013 (an en dash) appears in the middle of a string
DF['Flag'].unique()
array([0, 1, nan, u'Dev \u2013 Env'], dtype=object)
# None of my find-and-replace attempts work. Some records also have 'nan' values, and DF.dropna does not remove them.
Attempted fixes:
DF.to_csv('~/SQLoutput.csv', sep='\t', encoding='utf-8')
DF=pd.read_excel('/Users/User1/SQLoutput.xlsx',sheet_name=0,encoding='utf-8')
# -*- coding:utf-8 -*-
None of these helped.
tsql -S sqlservername -U Username -P Password
Answer 0 (score: 0)
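I haven't tried this in pandas, but you can use unidecode to deal with Unicode problems (this is an example for a single record; try applying the same thing to the whole column):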
import unidecode
record = unidecode.unidecode_expect_nonascii(record)
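To apply a per-record fix to the whole column, something like the following might work. This is only a sketch under a few assumptions: the toy data frame is built from the DF['Flag'].unique() output in the question, clean_flag is a hypothetical helper name, and it uses a plain string replace on \u2013 rather than unidecode so that it runs without extra packages (the unidecode call could be used inside the helper instead). It also assumes that one reason dropna appeared not to work is that its result was never assigned back, which the question does not confirm.

```python
import numpy as np
import pandas as pd

# unicode on Python 2, str on Python 3
text_type = type(u'')


def clean_flag(value):
    """Replace the en dash in text values; leave 0, 1 and NaN untouched."""
    if isinstance(value, text_type):
        return value.replace(u'\u2013', u'-')
    return value


# Toy column mirroring the values shown by DF['Flag'].unique() in the question.
DF = pd.DataFrame({'Flag': [0, 1, np.nan, u'Dev \u2013 Env']}, dtype=object)

# Apply the helper to every record in the column.
DF['Flag'] = DF['Flag'].map(clean_flag)

# dropna returns a new data frame, so the result has to be assigned.
DF_clean = DF.dropna(subset=['Flag'])
```

Checking the type first matters here because the column mixes integers, NaN and text, and calling a string method on the non-text values would raise an error.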