Question

I am trying to extract ZIP codes from an Excel spreadsheet and load them into a list as strings.

import xlrd

BIL = xlrd.open_workbook(r"C:\Temp\Stores.xls")
Worksheet = BIL.sheet_by_name("Open_Locations")
ZIPs = []
for record in Worksheet.col(17):
    # Skip the header cell; keep every other value exactly as read.
    if record.value == "Zip":
        pass
    else:
        ZIPs.append(record.value)

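Unfortunately, this Excel workbook is managed by someone else, so I can't simply convert the ZIP code field in the spreadsheet to text to solve my problem. Also, believe it or not, this spreadsheet is used by some business intelligence systems as well. Changing that field from number to text could therefore break other workflows that rely on this workbook and that I am not aware of.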

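What I've found is that when I print the numbers without first converting them to integers or strings, I of course get a bunch of floats. I expected that, since Excel stores numbers as floats.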

>>> ZIPs
[u'06405',
 04650.0,
 10017.0,
 71055.0,
 70801.0]

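What I didn't expect is that when I convert these floats to int to drop the decimal part, and then convert the result to a string, any leading or trailing zeros that are part of the ZIP code get truncated.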

import xlrd

BIL = xlrd.open_workbook(r"C:\Temp\Stores.xls")
Worksheet = BIL.sheet_by_name("Open_Locations")
ZIPs = []
for record in Worksheet.col(17):
    if record.value == "Zip":
        pass
    else:
        # Float -> int drops the ".0"; int -> str yields the digits only.
        ZIPs.append(str(int(record.value)))

>>> ZIPs
['6405',
 '465',
 '10017',
 '71055',
 '70801']
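
The loss of the leading zero can be reproduced without Excel. A minimal sketch, assuming the numeric cells arrive in Python as plain floats (the `raw_values` list is a hypothetical stand-in, not data read from the workbook):

```python
# Hypothetical stand-ins for values read from a numeric ZIP column.
# A float carries no leading zero, so 06405 arrives as 6405.0.
raw_values = [6405.0, 10017.0, 71055.0]

# Going through int removes the ".0" but cannot bring the zero back.
naive = [str(int(value)) for value in raw_values]
print(naive)  # ['6405', '10017', '71055']
```

This suggests the zero is presumably added by Excel's display formatting rather than stored in the cell value, so it has to be restored on the Python side.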

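How can I convert these ZIP codes to strings without dropping the leading or trailing zeros, or determine the number of leading and trailing zeros before truncation and add them back in the right places?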

Answer 1

All ZIP codes (excluding ZIP+4) are 5 characters long, so you can simply pad to 5 characters:

C#:

Use the String.PadLeft method: https://msdn.microsoft.com/en-us/library/system.string.padleft%28v=vs.110%29.aspx
ZIPs.Add(str.PadLeft(5, '0'));

Python:

Use rjust: http://www.tutorialspoint.com/python/string_rjust.htm
ZIPs.append(str(int(record.value)).rjust(5, '0'))
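
For comparison, a short sketch of three standard ways to left-pad with zeros in Python. The `value` below is a hypothetical float standing in for a numeric cell; `rjust`, `zfill`, and the `05d` format spec should all give the same result here:

```python
value = 6405.0  # hypothetical float read from a numeric cell
digits = str(int(value))  # '6405'

padded_rjust = digits.rjust(5, '0')          # pad on the left with '0'
padded_zfill = digits.zfill(5)               # built-in zero fill
padded_format = '{:05d}'.format(int(value))  # format spec on the integer

print(padded_rjust, padded_zfill, padded_format)  # 06405 06405 06405
```

Which one to use is mostly a matter of taste; `zfill` is the shortest when the input is already a string of digits.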

Answer 2

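So after some tinkering, the answer is to:

Not convert the ZIP code to int, since that also truncates any leading zeros of the ZIP code
Explicitly encode the string as utf-8

The presence of the unicode string indicator made me think this might be the answer, since it appeared on some values, but not all, when I printed the list.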

for record in Worksheet.col(17):
    if record.value == "Zip":
        pass
    else:
        # In this case, the value is still being returned as a float, because
        # it has 1 significant digit of 0 appended to the end. So we'll cast
        # as string and explicitly encode it as utf-8, which will retain the
        # leading and trailing zeros of the value, and also truncate the
        # significant digits via index.
        if len(str(record.value).encode('utf-8')) > 5:
            ZIPs.append(str(record.value).encode('utf-8')[:5])
        else:
            # In this case, the value is already being returned as a unicode
            # string for some reason, probably because of poor Excel worksheet
            # management, but in any case cast as string and explicitly encode
            # as utf-8 just for peace of mind.
            ZIPs.append(str(record.value).encode('utf-8'))

>>> ZIPs
   ['06405',
    '04650',
    '10017',
    '71055',
    '70801']

If anyone has a more elegant way of doing this, I'd love to see it.
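
One possible alternative, offered as a sketch rather than a definitive fix: branch on the Python type of each value instead of on the string length. The helper name `zip_to_text` and the `cells` list are hypothetical; in the real loop the values would come from `record.value`. With xlrd, the cell's `ctype` attribute could probably serve the same purpose as the `isinstance` check.

```python
def zip_to_text(value, width=5):
    """Return a ZIP code as text, restoring leading zeros on numbers."""
    if isinstance(value, float):
        # Numeric cell: drop the ".0", then zero-pad to the ZIP width.
        return str(int(value)).zfill(width)
    # Text cell: already carries its own leading zeros.
    return str(value).strip()

# Hypothetical mix of header, text, and numeric cell values.
cells = ['Zip', '06405', 4650.0, 10017.0]
zips = [zip_to_text(value) for value in cells if value != 'Zip']
print(zips)  # ['06405', '04650', '10017']
```

This avoids both the utf-8 encoding step and slicing by position, at the cost of assuming every numeric ZIP is at most five digits.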

Answer 3

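You can try doing this through string manipulation.

Our assumption is that the column will always hold ZIP codes, so the trailing ".0" will never be needed.

The following would go inside your else block: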

record_str = str(record.value)
formatted_record = record_str[:-2] if record_str.endswith('.0') else record_str
ZIPs.append(formatted_record)

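Or, if you want to be adventurous, you can slice unconditionally. Our assumption here is that reading this column will always yield a trailing '.0'; otherwise this leads to unexpected behavior.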

ZIPs.append(str(record.value)[:-2])
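
To illustrate the unexpected behavior, a small sketch comparing the two variants on hypothetical sample strings (one converted from a float, one that was already text). The helper name `strip_float_suffix` is made up for this example:

```python
def strip_float_suffix(text):
    """Remove a trailing '.0' only when it is actually present."""
    return text[:-2] if text.endswith('.0') else text

# Hypothetical samples: a stringified float and a cell that was text.
samples = ['10017.0', '06405']

conditional = [strip_float_suffix(sample) for sample in samples]
unconditional = [sample[:-2] for sample in samples]

print(conditional)    # ['10017', '06405']
print(unconditional)  # ['10017', '064']
```

The unconditional slice silently cuts two real digits off any value that was already stored as text. Note also that neither variant restores a leading zero on values that arrive as floats.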

Converting Float to String without Truncating Leading or Trailing Zeros
