Split a single row of tuple-form strings in a list into multiple rows (one per tuple)

Date: 2016-06-16 05:40:51

Tags: python list pandas dataframe tuples

I have a DataFrame in which each row holds an id and a list of string tuples.

Like this:

id         words
223        [('flying bird','round place'),('blue sky','red rose')]
368        [('fairy tales','great day'),('show time','break free'),('noise free')]

I want:

id         words
223        [('flying bird','round place')]
223        [('blue sky','red rose')]
368        [('fairy tales','great day')]
368        [('show time','break free')]
368        [('noise free')]

in a Python pandas DataFrame.

4 answers:

Answer 0 (score: 1)

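Another solution uses set_index and stack. At the end, the words column is converted to a list of tuples; an element that is a bare string rather than a tuple has to be wrapped as a one-element tuple, which needs the trailing comma: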

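import pandas as pd

# Move id into the index, then spread each list into one column per element
df.set_index('id', inplace=True)
df = df.words.apply(pd.Series)
# stack() turns the columns back into rows; the position level is then dropped
df = df.stack().reset_index(drop=True, level=1).reset_index(name='words')

# Wrap each item in a list; a bare string becomes a one-element tuple first
df['words'] = df.words.apply(lambda x: [x] if isinstance(x, tuple) else [(x,)])
print(df)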
    id                         words
0  223  [(flying bird, round place)]
1  223        [(blue sky, red rose)]
2  368    [(fairy tales, great day)]
3  368     [(show time, break free)]
4  368               [(noise free,)]
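
For anyone who wants to run the set_index/stack approach end to end, here is a minimal sketch. The sample frame is rebuilt from the question, the helper name as_tuple_list is made up for illustration, and the explicit dropna() is an assumption added so the NaN padding is removed regardless of how a given pandas version's stack() treats missing values.

```python
import pandas as pd

# Sample data from the question. Note that ('noise free') without a
# trailing comma is just the string 'noise free', not a tuple.
df = pd.DataFrame({
    "id": [223, 368],
    "words": [
        [("flying bird", "round place"), ("blue sky", "red rose")],
        [("fairy tales", "great day"), ("show time", "break free"), ("noise free")],
    ],
})

# One column per list position; shorter lists are padded with NaN
wide = df.set_index("id")["words"].apply(pd.Series)

# Back to one row per item; dropna() removes the padding if it is still there
stacked = wide.stack().dropna()

# Drop the position level and turn id back into a regular column
result = stacked.reset_index(level=1, drop=True).reset_index(name="words")


def as_tuple_list(item):
    """Hypothetical helper: wrap a tuple, or a bare string, as [tuple]."""
    return [item] if isinstance(item, tuple) else [(item,)]


result["words"] = result["words"].apply(as_tuple_list)
print(result)
```

Checking isinstance(item, tuple) instead of the length of the item keeps the wrapping correct for tuples with more than two elements as well.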

Answer 1 (score: 0)

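import pandas as pd

d = {'id': [233, 368],
     'words': [[('flying bird','round place'),('blue sky','red rose')],
                # ('noise free') has no trailing comma, so it is a plain string
                [('fairy tales','great day'),('show time','break free'),('noise free')]]}

df = pd.DataFrame(d)
dfidtemp = df['id']
# Expand each list into one column per element
df = df['words'].apply(pd.Series)
# Use the ids as the index so they survive stack()
df.index = dfidtemp
rslt = df.stack()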

I wonder if this is what you want:

rslt
Out[123]: 
id    
233  0    (flying bird, round place)
     1          (blue sky, red rose)
368  0      (fairy tales, great day)
     1       (show time, break free)
     2                    noise free
dtype: object

Answer 2 (score: 0)

# Accumulate the flattened items and the id repeated once per item
words = []
ids = []
for i in df.index:
    words = words + df.words[i]
    ids = ids + [df.id[i]] * len(df.words[i])
df = pd.DataFrame({'words': words, 'ids': ids})
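
On pandas 0.25 or later (a release that postdates this question), DataFrame.explode appears to do the same flattening as the loop above without building the lists by hand. This is a sketch of that swapped-in technique, not what the answer itself uses; the sample frame is rebuilt from the question and the name flat is only illustrative.

```python
import pandas as pd

# Sample data from the question; ('noise free') is really a plain string
df = pd.DataFrame({
    "id": [223, 368],
    "words": [
        [("flying bird", "round place"), ("blue sky", "red rose")],
        [("fairy tales", "great day"), ("show time", "break free"), ("noise free")],
    ],
})

# explode() emits one row per list element and repeats the id for each;
# it only unpacks one level, so the tuples themselves stay intact
flat = df.explode("words").reset_index(drop=True)
print(flat)
```

As with the loop, each cell now holds a single tuple (or the bare string), not a one-element list, so wrap the items afterwards if the list form from the question is required.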

Answer 3 (score: 0)

You can also use literal_eval from ast to parse the strings into tuples, which applies when the words column holds the lists as text:

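from ast import literal_eval as make_tuple
# Parse each string into a list, then wrap every item as a one-element list holding a tuple
wrap = lambda item: [item] if isinstance(item, tuple) else [(item,)]
df = df.groupby('id')['words'].apply(lambda s: pd.Series(make_tuple(s.iloc[0])).apply(wrap)).to_frame()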

This gives:

                              words
id                                 
223 0  [(flying bird, round place)]
    1        [(blue sky, red rose)]
368 0    [(fairy tales, great day)]
    1     [(show time, break free)]
    2               [(noise free,)]
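
The parsing step is easier to see in isolation. The sketch below assumes, as the answer seems to, that each words cell is the text of a Python list and not a list object; the sample strings are copied from the question, and the names parsed, rows and result are illustrative only. It flattens with a plain comprehension in place of groupby.

```python
from ast import literal_eval

import pandas as pd

# Assumed input: each cell is a string that looks like a Python list
df = pd.DataFrame({
    "id": [223, 368],
    "words": [
        "[('flying bird','round place'),('blue sky','red rose')]",
        "[('fairy tales','great day'),('show time','break free'),('noise free')]",
    ],
})

# literal_eval safely evaluates the text into real lists; ('noise free')
# comes back as a plain string because it has no trailing comma
parsed = df["words"].apply(literal_eval)

# One (id, [tuple]) pair per parsed item, wrapping bare strings as 1-tuples
rows = [
    (row_id, [item] if isinstance(item, tuple) else [(item,)])
    for row_id, items in zip(df["id"], parsed)
    for item in items
]

result = pd.DataFrame({
    "id": [row_id for row_id, _ in rows],
    "words": [wrapped for _, wrapped in rows],
})
print(result)
```

If the column already contains list objects, literal_eval raises an error on them, so the parsing line should be skipped in that case.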