How do I split a Series containing NaN into two, by a defined number of digits per element?

Asked: 2017-09-27 19:29:27

Tags: python python-3.x pandas

I have a Series object:

0    1211.0
1    2214.0
2    1317.0
3       NaN
4       NaN
5     812.0
Name: Time, Length: 6, dtype: float64

I want to get two separate Series from it:

0        12
1        22
2        13
3       NaN
4       NaN
5         8
Name: Hours, Length: 6, dtype: float64

0        11
1        14
2        17
3       NaN
4       NaN
5        12
Name: Minutes, Length: 6, dtype: float64

I defined two functions:

def hours(col):
    hours = str(int(col)).strip()[:-2]
    return hours

def minutes(col):
    minutes = str(int(col)).strip()[-2:]
    return minutes

I tried something like this, but it does not work because of the NaNs:

hours = Time.apply(hours)
minutes = Time.apply(minutes)

How can I make this work the way I want?
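For context, the attempt above fails because `int(col)` raises `ValueError` when `col` is NaN. A minimal sketch of a NaN-safe variant of the same two functions follows. The result names `hours_series` and `minutes_series` are illustrative, chosen so that the functions are not overwritten by their own results, and the `or 0` fallback for values with fewer than three digits is an assumption, not part of the question.

```python
import numpy as np
import pandas as pd


def hours(value):
    """Return everything except the last two digits, or NaN for missing input."""
    if pd.isna(value):
        return np.nan
    # Assumption: values with fewer than three digits count as hour 0
    return int(str(int(value))[:-2] or 0)


def minutes(value):
    """Return the last two digits, or NaN for missing input."""
    if pd.isna(value):
        return np.nan
    return int(str(int(value))[-2:])


time = pd.Series([1211.0, 2214.0, 1317.0, np.nan, np.nan, 812.0], name="Time")

# Use result names that differ from the function names
hours_series = time.apply(hours).rename("Hours")
minutes_series = time.apply(minutes).rename("Minutes")
```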

3 Answers:

Answer 0 (score: 1)

One approach is to use some arithmetic:

series1, series2 = s//100, s-100*(s//100)
series1.name = 'Hours'
series2.name = 'Minutes'

Sample run:

In [424]: s  # input series
Out[424]: 
0    1211.0
1    2214.0
2    1317.0
3       NaN
4       NaN
5     812.0
Name: Time, dtype: float64

In [425]: series1, series2 = s//100, s-100*(s//100)
     ...: series1.name = 'Hours'
     ...: series2.name = 'Minutes'
     ...: 

In [426]: series1
Out[426]: 
0    12.0
1    22.0
2    13.0
3     NaN
4     NaN
5     8.0
Name: Hours, dtype: float64

In [427]: series2
Out[427]: 
0    11.0
1    14.0
2    17.0
3     NaN
4     NaN
5    12.0
Name: Minutes, dtype: float64
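The same arithmetic can also be written with the modulo operator, and the results can be cast to pandas' nullable integer dtype if whole numbers are wanted, as in the question's desired output. This is a sketch that assumes a pandas version with the `"Int64"` dtype (0.24 or later); the names `hours_int` and `minutes_int` are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1211.0, 2214.0, 1317.0, np.nan, np.nan, 812.0], name="Time")

# NaN propagates through floor division and modulo, so no special casing is needed
hours = (s // 100).rename("Hours")
minutes = (s % 100).rename("Minutes")

# Optional: nullable integers keep the missing entries while dropping the ".0"
hours_int = hours.astype("Int64")
minutes_int = minutes.astype("Int64")
```

With `"Int64"` the missing entries are shown as `<NA>` rather than `NaN`.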

Answer 1 (score: 0)

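import numpy as np

# Turn the floats into strings such as '1211.0'; NaN becomes the placeholder 'NANA'
df.Time = df.Time.fillna('NANA').astype(str)
df['Hour'] = df.Time.str[:-4]    # drop the last four characters, e.g. '11.0'
df['Min'] = df.Time.str[-4:-2]   # the two digits just before '.0'
# replace() returns a new DataFrame in which the placeholder fragments are NaN again
df.replace({'NANA': np.nan, 'NA': np.nan, '': np.nan})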

Out[144]: 
     Time  Min Hour
0  1211.0   11   12
1  2214.0   14   22
2  1317.0   17   13
3     NaN  NaN  NaN
4     NaN  NaN  NaN
5   812.0   12    8
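A variant of the string approach that avoids the `'NANA'` placeholder is sketched below. It assumes a pandas version with the nullable `"Int64"` and `"string"` dtypes (1.0 or later), and the intermediate name `as_text` is illustrative. Missing values stay missing through every step, so no `replace` is needed afterwards.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Time": [1211.0, 2214.0, 1317.0, np.nan, np.nan, 812.0]})

# float -> nullable integer -> nullable string: 1211.0 becomes "1211", NaN becomes <NA>
as_text = df["Time"].astype("Int64").astype("string")

df["Hour"] = as_text.str[:-2]   # everything except the last two digits
df["Min"] = as_text.str[-2:]    # the last two digits
```

The resulting columns hold strings; they would still need a numeric cast if arithmetic is required.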

Answer 2 (score: 0)

Two options. Using datetime parsing:

time = pd.to_datetime(df['col'].astype(str).str.split('.').str[0], format = '%H%M')
series1 = time.dt.hour
series2 = time.dt.minute

series1

0    12.0
1    22.0
2    13.0
3     NaN
4     NaN
5     8.0

series2

0    11.0
1    14.0
2    17.0
3     NaN
4     NaN
5    12.0

Using str methods (note that here series1 holds the minutes and series2 the hours):

series1 = df['col'].astype(str).str.split('.').str[0].str[-2:].replace('an', np.nan)
series2 = df['col'].astype(str).str.split('.').str[0].str[-4:-2].replace('n', np.nan)

series1

0     11
1     14
2     17
3    NaN
4    NaN
5     12

series2

0     12
1     22
2     13
3    NaN
4    NaN
5      8