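I have a column named "weather" that I want to split into multiple columns
(degrees, humidity, wind_mph, wind_chill).
It looks like this.
Sometimes it includes humidity, sometimes wind chill, and sometimes neither:
'81 degrees, wind 8 mph'
'40 degrees, relative humidity 75%, wind 17 mph'
'52 degrees, wind 12 mph'
'51 degrees, relative humidity 82%, wind 6 mph, wind chill 0'
I want to split it so that when a row has no wind chill or humidity,
the corresponding column is NULL.
How can I do this?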
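Answer 0 (score: 0)
This should work for you. Basically, you can use str.extract to pull each value into its own column; rows where a pattern does not match get NaN.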
import pandas as pd
weather = ['81 degrees, wind 8 mph', '40 degrees, relative humidity 75%, wind 17 mph',
           '52 degrees, wind 12 mph', '51 degrees, relative humidity 82%, wind 6 mph, wind chill 0']
df = pd.DataFrame(weather, columns=['weather'])
# Each pattern has one capture group; expand=False returns a Series, and rows without a match become NaN.
# The optional minus sign is an addition, in case temperatures or wind chill are negative.
df['degrees'] = df['weather'].str.extract(r'(-?\d+)\s*degrees', expand=False)
df['humidity'] = df['weather'].str.extract(r'humidity\s*(\d+)%', expand=False)
df['wind_mph'] = df['weather'].str.extract(r'wind\s*(\d+)\s*mph', expand=False)
df['wind_chill'] = df['weather'].str.extract(r'wind\s*chill\s*(-?\d+)', expand=False)
df.head()
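Note that str.extract returns strings, so the new columns are not numeric yet. If you want numbers with missing values preserved, something like the following sketch might help. It assumes the same sample data as above; the `patterns` dict is just a name I made up for illustration, and it uses pandas' nullable "Int64" dtype so missing values do not force the column to float.

```python
import pandas as pd

weather = [
    '81 degrees, wind 8 mph',
    '40 degrees, relative humidity 75%, wind 17 mph',
    '52 degrees, wind 12 mph',
    '51 degrees, relative humidity 82%, wind 6 mph, wind chill 0',
]
df = pd.DataFrame({'weather': weather})

# Hypothetical mapping of output column name -> regex with one capture group.
patterns = {
    'degrees': r'(-?\d+)\s*degrees',
    'humidity': r'humidity\s*(\d+)%',
    'wind_mph': r'wind\s*(\d+)\s*mph',
    'wind_chill': r'wind\s*chill\s*(-?\d+)',
}

for name, pattern in patterns.items():
    # expand=False gives a Series of strings (NaN where there is no match).
    extracted = df['weather'].str.extract(pattern, expand=False)
    # Convert to numbers, then to a nullable integer dtype that keeps missing values.
    df[name] = pd.to_numeric(extracted).astype('Int64')

print(df)
```

Rows without humidity or wind chill end up as <NA> in those columns, which is probably the closest pandas equivalent to the NULL you asked for.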
Answer 1 (score: 0)
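# Assumes df already exists and pandas is imported as pd.
pd.concat(
    [
        df,
        # Both named groups must match, otherwise both columns are NaN for that row.
        df['ColName'].str.extract(r'(?P<degrees>.*degrees).*(?P<wind_mph>wind.*mph)', expand=True),
        df['ColName'].str.extract(r', (?P<humidity>.*humidity.*%)'),
        df['ColName'].str.extract(r'.*(?P<wind_chill>wind chill .*)'),
    ],
    axis=1,
)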
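You can run a series of regex extractions and concatenate the results back onto the original df. The named groups become the column names, and each one captures the whole text fragment (for example '81 degrees'), not just the number. Replace 'ColName'
with the name of your actual column.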
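If you would rather capture only the numbers in a single pass, one option is a single pattern with optional named groups. This is only a sketch: it assumes the fields always appear in the order shown in the question (degrees, humidity, wind, wind chill) with the same separators, and the names `pattern`, `parts`, and `result` are mine.

```python
import pandas as pd

df = pd.DataFrame({'ColName': [
    '81 degrees, wind 8 mph',
    '40 degrees, relative humidity 75%, wind 17 mph',
    '52 degrees, wind 12 mph',
    '51 degrees, relative humidity 82%, wind 6 mph, wind chill 0',
]})

# One regex; each optional (?: ... )? block may be absent, leaving its group as NaN.
# Assumption: fields always come in this fixed order.
pattern = (
    r'(?P<degrees>-?\d+) degrees'
    r'(?:, relative humidity (?P<humidity>\d+)%)?'
    r'(?:, wind (?P<wind_mph>\d+) mph)?'
    r'(?:, wind chill (?P<wind_chill>-?\d+))?'
)

# With named groups, str.extract returns a DataFrame whose columns are the group names.
parts = df['ColName'].str.extract(pattern)

# Attach the extracted columns to the original frame.
result = pd.concat([df, parts], axis=1)
print(result)
```

The values are still strings at this point, so you may want to convert them with pd.to_numeric afterwards. If the field order can vary in your real data, separate extractions per field are the safer choice.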