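I'm trying to pad zip codes that were supplied in numeric format with leading zeros. I thought the following would work (a similar approach worked for me before with .str.startswith()). Any suggestions?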
data['loczipstr'] = data['loczip'].astype(str)
data['loczipstr'] = np.where(len(data['loczipstr']) == 3, "0000" +data['loczipstr'], data['loczipstr'])
data['loczipstr'] = np.where(len(data['loczipstr']) == 4, "000" + data['loczipstr'], data['loczipstr'])
data['loczipstr'] = np.where(len(data['loczipstr']) == 5, "00" + data['loczipstr'], data['loczipstr'])
data['loczipstr'] = np.where(len(data['loczipstr']) == 6, "0" + data['loczipstr'], data['loczipstr'])
These lines run without error, but they don't change data['loczipstr'] at all.
Note: the lengths range from 3 to 6 because a four-digit zip code looks like 1023.0, so its string length is 6.
Answer 0 (score: 3)
Cast the column to str, then use the vectorised str.zfill to pad to a width of 7:
In [76]:
df['loczipstr'] = df['loczip'].astype(str).str.zfill(7)
df
Out[76]:
     loczip  loczipstr
0       111    0000111
1     11111    0011111
2    111111    0111111
3   1111111    1111111
4  11111111   11111111
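The question mentions that the values actually look like 1023.0, so casting straight to str keeps the trailing ".0". A minimal sketch of one way to handle that, assuming the column is float, has no missing values, and the target is a five-digit code (the sample values here are made up for illustration):

```python
import pandas as pd

# Hypothetical sample: zip codes that were read in as floats
data = pd.DataFrame({"loczip": [123.0, 1023.0, 12345.0]})

# Drop the fractional part first, then pad to five digits.
# Assumption: no NaN in the column (astype(int) raises on missing values).
data["loczipstr"] = data["loczip"].astype(int).astype(str).str.zfill(5)

print(data["loczipstr"].tolist())  # ['00123', '01023', '12345']
```

If the column can contain missing values, the integer cast would need to be handled differently, so treat this as a starting point rather than a drop-in fix.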
Answer 1 (score: 1)
print(data)
     loczip
0       111
1     11111
2    111111
3   1111111
4  11111111
data['loczipstr'] = data['loczip'].astype(str)
data.loc[data['loczipstr'].str.len() == 3, 'loczipstr'] = "0000" + data['loczipstr']
data.loc[data['loczipstr'].str.len() == 4, 'loczipstr'] = "000" + data['loczipstr']
data.loc[data['loczipstr'].str.len() == 5, 'loczipstr'] = "00" + data['loczipstr']
data.loc[data['loczipstr'].str.len() == 6, 'loczipstr'] = "0" + data['loczipstr']
print(data)
     loczip  loczipstr
0       111    0000111
1     11111    0011111
2    111111    0111111
3   1111111    1111111
4  11111111   11111111
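As for why the np.where version in the question appears to do nothing: len() applied to a Series returns the number of rows (a single integer), not the length of each string, so the condition is one scalar True/False for the whole column. A rough sketch of the difference, using a small made-up frame:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({"loczip": [111, 11111, 111111, 1111111]})
data["loczipstr"] = data["loczip"].astype(str)

# len() on a Series is the row count, a single int (4 here),
# so `len(...) == 3` is one scalar bool rather than a per-row mask
n_rows = len(data["loczipstr"])

# .str.len() returns one length per row, which np.where can use as a mask
lengths = data["loczipstr"].str.len()
data["loczipstr"] = np.where(lengths == 3,
                             "0000" + data["loczipstr"],
                             data["loczipstr"])

print(data["loczipstr"].tolist())  # ['0000111', '11111', '111111', '1111111']
```

Only the three-character row is padded here; the remaining lengths would need their own conditions, which is why str.zfill is the shorter route.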