How do I build lists from ranges in a DataFrame column?

Time: 2018-06-29 01:36:59

Tags: python pandas dataframe

I am trying to expand the ranges in a DataFrame column into lists.

Here is my string column:

df['ID'] =['' ,'2','4', '','8', '','16-18','25', '30-31']
# empty strings represent null values

I want to produce output like this:

df['ID'] = ['', 'ID 2', 'ID 4', '', 'ID 8', '', ['ID 16', 'ID 17', 'ID 18'],
            'ID 25', ['ID 30', 'ID 31']]

Can anyone help?

2 answers:

Answer 0: (score: 0)

IIUC

df.ID.str.split('-').apply(lambda x : x[0] if len(x)<=1 else list(range(int(x[0]),int(x[1])+1)))
Out[182]: 
0                
1               2
2               4
3                
4               8
5                
6    [16, 17, 18]
7              25
8        [30, 31]
Name: ID, dtype: object
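
Note that the result above holds bare numbers rather than the "ID "-prefixed strings the question asks for. A minimal sketch of one way to add the prefix, assuming the column holds strings as in the question and that every range has the form "start-stop" with non-negative integers; the helper name `expand_id` is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"ID": ["", "2", "4", "", "8", "", "16-18", "25", "30-31"]})

def expand_id(value):
    """Hypothetical helper: '16-18' -> ['ID 16', 'ID 17', 'ID 18']."""
    if value == "":
        return value  # leave empty (null) entries unchanged
    parts = value.split("-")
    if len(parts) == 1:
        return "ID " + parts[0]  # single value, no range to expand
    start, stop = int(parts[0]), int(parts[1])
    # range() excludes its upper bound, so add 1 to include `stop`
    return ["ID " + str(n) for n in range(start, stop + 1)]

result = df["ID"].apply(expand_id)
print(result)
```

A named function is used here instead of a lambda only to keep the three cases (empty, single value, range) readable; the split-and-range logic is the same as in the one-liner.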

Answer 1: (score: 0)

Setup

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['ID'] = [np.nan, 2, 4, np.nan, 8, np.nan, '16-18', 25, '30-31']

Then first build the "ID" lists for the ranges

s = df.ID.str.split("-")
s2 = s[s.notna()].apply(lambda x: ("ID "+pd.Series(list(range(int(x[0]), int(x[1])+1))).astype(str)).tolist())

Then handle the regular cases (everything except NaN and ranges)

non_na = df.ID[df.ID.notna()]
non_na_range = non_na[~non_na.index.isin(s2.index)]
s3 = "ID " + non_na_range.astype(str)

Then assign

df.loc[s2.index, "ID"] = s2
df.loc[s3.index, "ID"] = s3

Output

    ID
0   NaN
1   ID 2
2   ID 4
3   NaN
4   ID 8
5   NaN
6   [ID 16, ID 17, ID 18]
7   ID 25
8   [ID 30, ID 31]
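
The three steps above (ranges, regular values, NaN) could also be handled in a single pass. The following is only a sketch under the same setup as this answer (NaN for nulls, plain numbers, and "start-stop" strings for ranges); the helper name `label` is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"ID": [np.nan, 2, 4, np.nan, 8, np.nan, "16-18", 25, "30-31"]})

def label(value):
    """Hypothetical helper covering NaN, plain numbers, and 'a-b' ranges."""
    if pd.isna(value):
        return value  # keep NaN as NaN
    if isinstance(value, str) and "-" in value:
        start, stop = (int(part) for part in value.split("-"))
        # include the upper bound of the range
        return [f"ID {n}" for n in range(start, stop + 1)]
    return f"ID {value}"  # plain number (or numeric string)

labeled = df["ID"].apply(label)
print(labeled)
```

This trades the index bookkeeping of the step-by-step version for one Python-level function call per row, which is fine for small columns but may be slower on very large ones.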