I have CSV data in the following format:
+-------------+-------------+-------+
| Location    | Num of Reps | Sales |
+-------------+-------------+-------+
| 75894       | 3           | 12    |
| Burbank     | 2           | 19    |
| 75286       | 7           | 24    |
| Carson City | 4           | 13    |
| 27659       | 3           | 17    |
+-------------+-------------+-------+
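The Location column has the object data type. What I want to do is drop every row whose Location label is non-numeric. For the table above, my desired output is: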
+----------+-------------+-------+
| Location | Num of Reps | Sales |
+----------+-------------+-------+
| 75894    | 3           | 12    |
| 75286    | 7           | 24    |
| 27659    | 3           | 17    |
+----------+-------------+-------+
Right now, I can hard-code a solution as follows:
list1 = ['Carson City', 'Burbank']
df = df[~df['Location'].isin(list1)]
This was inspired by the following post:
How to drop rows from pandas data frame that contains a particular string in a particular column?
However, what I am looking for is a general solution that works for any table of the type shown above.
Answer 0 (score: 5)
Alternatively, you can do:
df[df['Location'].str.isnumeric()]
  Location  Num of Reps  Sales
0    75894            3     12
2    75286            7     24
4    27659            3     17
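For readers who want to try this end to end, here is a minimal, self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the `.astype(str)` cast is an added assumption rather than part of the answer: it is there so that a column containing real integers or missing values does not put NaN entries into the boolean mask.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "Location": ["75894", "Burbank", "75286", "Carson City", "27659"],
    "Num of Reps": [3, 2, 7, 4, 3],
    "Sales": [12, 19, 24, 13, 17],
})

# Cast to str first (an assumption, not in the original answer) so that
# non-string entries cannot produce NaN in the mask.
mask = df["Location"].astype(str).str.isnumeric()

# Keep only the rows whose Location consists of numeric characters.
numeric_only = df[mask]
print(numeric_only)
```

The original index labels (0, 2, 4) are preserved; calling `reset_index(drop=True)` on the result would renumber the rows if that is preferred.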
Answer 1 (score: 3)
You can use pd.to_numeric to coerce non-numeric values to NaN, and then keep only the rows where the coerced Location is not NaN:
df[pd.to_numeric(df.Location, errors='coerce').notnull()]
#  Location  Num of Reps  Sales
#0    75894            3     12
#2    75286            7     24
#4    27659            3     17
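One thing worth knowing about this approach is that pd.to_numeric accepts any string that parses as a number, not only strings of digits. The sketch below uses made-up Location values ("12.5" and "-3" are not in the question's data) to show the difference between "parses as a number" and "consists only of digits".

```python
import pandas as pd

# Hypothetical data: "12.5" and "-3" are added only to illustrate the point.
df = pd.DataFrame({
    "Location": ["75894", "Burbank", "12.5", "-3", "27659"],
    "Sales": [12, 19, 24, 13, 17],
})

# Anything that parses as a number survives the coercion.
as_number = pd.to_numeric(df["Location"], errors="coerce")
parsed = df[as_number.notna()]

# A stricter filter: only strings made up entirely of digits.
digits_only = df[df["Location"].str.isdigit()]

print(parsed["Location"].tolist())
print(digits_only["Location"].tolist())
```

If the labels are codes such as ZIP codes, the stricter digit check is probably closer to what is wanted; if decimals or signed values should count as numeric, to_numeric is the more forgiving choice.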
Answer 2 (score: 1)
In [139]: df[~df.Location.str.contains(r'\D')]
Out[139]:
  Location  Num of Reps  Sales
0    75894            3     12
2    75286            7     24
4    27659            3     17
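A possible variation on the regex idea, sketched here under the assumption that pandas 1.1 or later is available (that is when Series.str.fullmatch was added): instead of rejecting rows that contain a non-digit, require the whole string to match one or more digits. The difference shows up with empty strings, which contain no non-digit character and therefore pass the negated contains test. The empty Location value below is made up for illustration.

```python
import pandas as pd

# Hypothetical data: the empty string is added only to show the edge case.
df = pd.DataFrame({
    "Location": ["75894", "", "Carson City", "27659"],
    "Sales": [12, 19, 13, 17],
})

# Original approach: drop rows that contain any non-digit character.
# An empty string has no non-digit character, so it is kept.
no_nondigit = df[~df["Location"].str.contains(r"\D")]

# Variation: keep rows whose entire value is one or more digits.
# na=False treats missing values as non-matching.
all_digits = df[df["Location"].str.fullmatch(r"\d+", na=False)]

print(no_nondigit["Location"].tolist())
print(all_digits["Location"].tolist())
```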
Answer 3 (score: 0)
df[df['Location'].str.isdigit()]
  Location  Num of Reps  Sales
0    75894            3     12
2    75286            7     24
4    27659            3     17
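For plain ASCII digits, str.isdigit and str.isnumeric give the same result, which is why both answers produce the same output here. They differ on some Unicode characters, as the small comparison below tries to show. The sample labels are invented, and dtype=object is set explicitly on the assumption that the element-wise behaviour should follow Python's built-in str methods.

```python
import pandas as pd

# Invented labels: ASCII digits, a vulgar fraction, a superscript two,
# and an Arabic-Indic digit three.
labels = pd.Series(["75894", "½", "²", "٣"], dtype=object)

comparison = pd.DataFrame({
    "label": labels,
    "isdecimal": labels.str.isdecimal(),  # strictest: decimal digits only
    "isdigit": labels.str.isdigit(),      # also accepts superscript digits
    "isnumeric": labels.str.isnumeric(),  # also accepts fractions
})
print(comparison)
```

If the Location column is known to hold only ASCII text, any of the three checks should behave identically.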