numeric_cols = ['temp', 'windchill', 'dewpoint', 'humidity', 'pressure', 'visibility', 'wind_speed', 'gust_speed', 'precip']
weather_list[numeric_cols] = weather_list[numeric_cols].apply(lambda x: re.sub('[^0-9]', '', str(x)))
weather_list[numeric_cols] = pd.to_numeric(weather_list[numeric_cols], errors='coerce')
weather_list[numeric_cols] = weather_list[numeric_cols] / 10
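I'm trying to do some cleanup on a dataset, but I'm hitting a shape mismatch error. According to the error, I'm trying to match 30 columns, 0 rows against 9 columns, 30 rows... I'm clearly doing something wrong! I've used this approach a few times before and it has never failed. Does anyone have a suggestion about what I'm doing wrong? The data is pulled from HTML into a pandas DataFrame, 'weather_list'.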
       time     temp  windchill  dewpoint  humidity  pressure  visibility  \
0  12:53 AM  21.0 °F          -   19.0 °F       92%  30.47 in     10.0 mi
1   1:53 AM  21.9 °F          -   19.9 °F       92%  30.48 in     10.0 mi
2   2:53 AM  21.9 °F          -   19.0 °F       89%  30.50 in     10.0 mi
3   3:53 AM  21.0 °F          -   19.0 °F       92%  30.50 in     10.0 mi
4   4:53 AM  19.9 °F          -   18.0 °F       92%  30.51 in     10.0 mi
5   5:53 AM  21.0 °F          -   18.0 °F       88%  30.51 in     10.0 mi

   wind_direction  wind_speed  gust_speed  precip  events  conditions
0            Calm        Calm           -     NaN     NaN       Clear
1            Calm        Calm           -     NaN     NaN       Clear
2            Calm        Calm           -     NaN     NaN       Clear
3            Calm        Calm           -     NaN     NaN       Clear
4            Calm        Calm           -     NaN     NaN       Clear
5            Calm        Calm           -     NaN     NaN       Clear
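Thanks!
Answer 0: (score: 2)
DataFrame.apply passes each whole column (a Series) to the function, so str(x) turns the entire column into one string, and pd.to_numeric does not accept a DataFrame. We can use applymap to apply a function element-wise across multiple columns instead. Please make the following changes to your code.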
## using applymap here
weather_list[numeric_cols] = weather_list[numeric_cols].applymap(lambda x: re.sub(r'[^0-9]', '', str(x)))
## now we pass series to pd.to_numeric instead of data frame
weather_list[numeric_cols] = weather_list[numeric_cols].apply(lambda x: pd.to_numeric(x, errors='coerce'))
weather_list[numeric_cols] = weather_list[numeric_cols] / 10
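As a rough illustration of the difference, here is a minimal, self-contained sketch. The two-row sample frame and the helper name strip_units are made up for the example and are not part of the original data or answer. It also swaps in a slightly different regex that keeps the decimal point, so no division by 10 is needed afterwards and values with two decimals such as the pressure readings keep their scale. This is only one possible variant, and it assumes the values never contain more than one dot. On pandas 2.1 and later, applymap is deprecated in favour of DataFrame.map, so the sketch uses Series.map per column, which should work on both old and new versions.

```python
import re
import pandas as pd

# Hypothetical two-row sample mirroring a few columns from the question.
weather_list = pd.DataFrame({
    "time": ["12:53 AM", "1:53 AM"],
    "temp": ["21.0 °F", "21.9 °F"],
    "windchill": ["-", "-"],
    "humidity": ["92%", "92%"],
    "pressure": ["30.47 in", "30.48 in"],
    "wind_speed": ["Calm", "Calm"],
})
numeric_cols = ["temp", "windchill", "humidity", "pressure", "wind_speed"]

# DataFrame.apply hands the function one whole column at a time, so the
# original lambda returns a single string per column instead of one per cell.
per_column = weather_list[numeric_cols].apply(
    lambda x: re.sub(r"[^0-9]", "", str(x))
)
print(per_column.shape)  # one entry per column, not per cell


def strip_units(value):
    """Keep only digits and the decimal point from a single cell (hypothetical helper)."""
    return re.sub(r"[^0-9.]", "", str(value))


# Apply the cleaner element-wise: each column is mapped value by value.
cleaned = weather_list[numeric_cols].apply(lambda col: col.map(strip_units))

# pd.to_numeric works on a Series, so call it once per column;
# cells that end up empty (e.g. "-" or "Calm") are coerced to NaN.
cleaned = cleaned.apply(pd.to_numeric, errors="coerce")
print(cleaned)
```

Assigning the result back with weather_list[numeric_cols] = cleaned would then replace the original string columns with numeric ones.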