I have the following pandas Series, and I want to convert the state abbreviations to uppercase:
import pandas as pd
test_series = pd.Series(['canton, nc', 'leicester, nc', 'asheville, nc', 'candler, nc',
'marshall, nc', 'waynesville, nc', 'fletcher, nc',
'hendersonville, nc', 'old fort, nc', 'horse shoe, nc',
'black mountain, nc', 'maggie valley, nc', 'burnsville, nc',
'weaverville, nc', 'zirconia, nc', 'swannanoa, nc',
'hot springs, nc', 'arden, nc', 'east flat rock, nc', 'marion, nc',
'mars hill, nc', 'flat rock, nc', 'rutherfordton, nc', 'clyde, nc',
'saluda, nc', 'alexander, nc', 'fairview, nc', 'mill spring, nc',
'brevard, nc', 'mills river, nc', 'penrose, nc',
'pisgah forest, nc', 'barnardsville, nc', 'etowah, nc',
'travelers rest, sc', 'lake lure, nc', 'montreat, nc', 'dana, nc',
'greenville, sc', 'flag pond, tn', 'laurel park, nc'])
My best guess raised an AttributeError. What is the most efficient way to do this?
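Answer 0 (score: 2)
First select everything except the last 2 characters of each value, then append the last 2 characters converted to uppercase with str.upper: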
test_series = test_series.str[:-2] + test_series.str[-2:].str.upper()
print (test_series.head())
0 canton, NC
1 leicester, NC
2 asheville, NC
3 candler, NC
4 marshall, NC
dtype: object
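The slicing approach relies on every value ending in exactly two state letters. As a minimal sketch, assuming the values always follow the 'city, st' pattern, the same result can also be reached by matching only a trailing two-letter token with a regular expression. The short `sample` Series here is a subset of the question's data, chosen for illustration:

```python
import pandas as pd

# A small subset of the question's data, used only for illustration.
sample = pd.Series(['canton, nc', 'travelers rest, sc', 'flag pond, tn'])

# Approach from the answer: slice off the last two characters and uppercase them.
sliced = sample.str[:-2] + sample.str[-2:].str.upper()

# Alternative sketch: uppercase only a two-letter token that follows ", "
# at the very end of the string, leaving non-matching values unchanged.
by_regex = sample.str.replace(
    r'(?<=, )[a-z]{2}$',
    lambda m: m.group(0).upper(),
    regex=True,
)

print(sliced.tolist())
print(by_regex.tolist())
```

The regex variant is likely slower than slicing, but it does not alter values that lack the expected ', st' suffix.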
Answer 1 (score: 1)
For better performance, use a list comprehension, provided the data contains no NaN values:
test_series = pd.Series([i[:-2] + i[-2:].upper() for i in test_series])
Testing equality:
(test_series.str[:-2] + test_series.str[-2:].str.upper() == pd.Series([i[:-2] + i[-2:].upper() for i in test_series])).all()
True
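The list comprehension breaks on missing values because a float NaN cannot be sliced. A minimal sketch of a guarded version follows, assuming missing values should pass through unchanged. The helper name `upper_state` and the `with_gap` sample are hypothetical and not part of the original answer:

```python
import numpy as np
import pandas as pd

def upper_state(value):
    """Uppercase the last two characters of a string; pass anything else through."""
    if isinstance(value, str):
        return value[:-2] + value[-2:].upper()
    return value

# Hypothetical sample containing one missing value.
with_gap = pd.Series(['canton, nc', np.nan, 'greenville, sc'], dtype=object)

# Reuse the original index; a bare pd.Series([...]) would reset it to 0..n-1.
result = pd.Series([upper_state(v) for v in with_gap], index=with_gap.index)

print(result.tolist())
```

The `.str` accessor version propagates NaN on its own, so the guard is only needed for the list comprehension.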
Timings:
%timeit test_series.str[:-2] + test_series.str[-2:].str.upper()
1000 loops, best of 3: 1.1 ms per loop
%timeit pd.Series([i[:-2] + i[-2:].upper() for i in test_series])
1000 loops, best of 3: 245 µs per loop
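The %timeit magic is only available inside IPython. As a rough sketch, the comparison can be reproduced in a plain script with the standard-library timeit module. The shortened `test_series` and the two wrapper function names are assumptions made for illustration, and the absolute numbers will vary with the machine and the pandas version:

```python
import timeit

import pandas as pd

# Shortened stand-in for the question's data (42 values).
test_series = pd.Series(['canton, nc', 'greenville, sc', 'flag pond, tn'] * 14)

def with_str_accessor():
    return test_series.str[:-2] + test_series.str[-2:].str.upper()

def with_list_comprehension():
    return pd.Series([i[:-2] + i[-2:].upper() for i in test_series])

loops = 200
# Take the best of 3 repeats, mirroring what %timeit reports.
accessor_time = min(timeit.repeat(with_str_accessor, number=loops, repeat=3)) / loops
listcomp_time = min(timeit.repeat(with_list_comprehension, number=loops, repeat=3)) / loops

print(f"str accessor:       {accessor_time * 1e6:.0f} µs per loop")
print(f"list comprehension: {listcomp_time * 1e6:.0f} µs per loop")
```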
Output:
0 canton, NC
1 leicester, NC
2 asheville, NC
3 candler, NC
4 marshall, NC
5 waynesville, NC
6 fletcher, NC
7 hendersonville, NC
8 old fort, NC
9 horse shoe, NC
10 black mountain, NC
11 maggie valley, NC
12 burnsville, NC
13 weaverville, NC
14 zirconia, NC
15 swannanoa, NC
16 hot springs, NC
17 arden, NC
18 east flat rock, NC
19 marion, NC
20 mars hill, NC
21 flat rock, NC
22 rutherfordton, NC
23 clyde, NC
24 saluda, NC
25 alexander, NC
26 fairview, NC
27 mill spring, NC
28 brevard, NC
29 mills river, NC
30 penrose, NC
31 pisgah forest, NC
32 barnardsville, NC
33 etowah, NC
34 travelers rest, SC
35 lake lure, NC
36 montreat, NC
37 dana, NC
38 greenville, SC
39 flag pond, TN
40 laurel park, NC
dtype: object