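I have been trying to load a CSV file into MySQL and keep getting a data truncation warning for the last field in the CSV.
The data is written with Python, and I make sure the last field has a string length of 13 (the field length declared in CREATE TABLE):
cleanField( row[ 17 ] )[0:12]
Any way I measure len(cleanField( row[ 17 ] )[0:12]), it is 13. When I print out the line named in the MySQL warning with
$ cat customer.csv | awk -F"," '(NR==3621789){ print $17 }'
I still see a 13-character string.
However, when I try the following, there seems to be a hint of a hidden character. Any suggestions? Thanks.
$ cat customer.csv | awk -F"," '(NR==3621789){ print "<" $17 ">" }'
>PRSP_CATS_CO
Here is cleanField: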
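import re
import unicodedata

def cleanField(x):
    # Collapse runs of spaces into a single space.
    x = re.sub( ' +' , ' ' , x )
    try:
        x.decode('ascii')
    except UnicodeDecodeError:
        # Not plain ASCII: decode as UTF-8, then drop anything that has no ASCII equivalent.
        x = unicode( x , "UTF-8")
        x = unicodedata.normalize('NFKD', x ).encode('ascii', 'ignore')
    else:
        pass
    # " ".join(x.split())
    # Note: '\s' below is a literal backslash followed by "s", not a whitespace class.
    return x.replace(',','').replace('"','').replace("'",'').replace('\t','').replace('\n','').replace('\\','').replace('\s','')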
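Answer 0 (score: 1)
string[0:12] should always be 12 characters long. Perhaps you would be best off stepping through your program with pudb or a similar debugger.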
dstromberg@zareason ~ $ /usr/local/pypy-1.9/bin/pypy
Python 2.7.2 (341e1e3821ff, Jun 07 2012, 15:40:31)
[PyPy 1.9.0 with GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``how to construct the blackhole
interpreter: we reuse the tracing one, add lots of ifs and pray''
>>>> print '01234567890123456789'[0:12]
012345678901
>>>> print(len('01234567890123456789'[0:12]))
12
>>>>
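As a follow-up check, the awk output in the question (`>PRSP_CATS_CO` with no leading `<`) is consistent with a carriage return sitting at the end of the last field, since a `\r` moves the cursor back to the start of the line before `>` is printed. The sketch below shows one way that can happen and how `repr` makes it visible. It assumes the file was written with Python's `csv.writer` using its default line terminator, which the question does not confirm; `write_row` and the sample values are hypothetical, and the snippet is written for Python 3 rather than the Python 2 used above.

```python
import csv
import io


def write_row(fields, lineterminator="\r\n"):
    """Write one CSV row to an in-memory buffer and return the raw text.

    Hypothetical helper for illustration; "\r\n" is csv.writer's default.
    """
    buf = io.StringIO()
    csv.writer(buf, lineterminator=lineterminator).writerow(fields)
    return buf.getvalue()


# Slicing with [0:12] keeps at most 12 characters.
value = "PRSP_CATS_CO_EXTRA"[0:12]

default_line = write_row(["a", value])
unix_line = write_row(["a", value], lineterminator="\n")

# A reader that splits lines on "\n" only leaves the "\r" attached
# to the last field of each row.
last_field = default_line.split("\n")[0].split(",")[-1]

print(len(value))        # length of the sliced string
print(repr(last_field))  # repr makes a trailing "\r" visible
print(repr(unix_line))   # no "\r" when the terminator is "\n"
```

If this turns out to be the cause, either writing the file with `lineterminator="\n"` or telling MySQL that lines end in `\r\n` should keep the stray character out of the last column.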