How to combine multiple columns into a single variable (list) in Pandas

Date: 2018-04-06 19:18:00

Tags: python pandas variables dataframe merge

I have the following string columns.

Str1    Str2    Str3
OK      I       Go
Yes             Hm
Fine    I see

I want to combine them into a single column of lists, skipping the empty cells:

AllStr
["OK", "I", "Go"]
["Yes", "Hm"]
["Fine", "I see"]

I have tried several approaches. This one comes close, but it is not quite right:

df_manual_label['AllStr'] = df_manual_label[['Str1', 'Str2', 'Str3']].apply(lambda x: ', '.join(x.astype(str)), axis=1)
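To see why this attempt falls short, here is a minimal reproduction. It assumes the blank cells in the table are empty strings, which the question does not state explicitly. The result is one comma-joined string per row rather than a list, and the empty cells are still included.

```python
import pandas as pd

# Sample frame reconstructed from the question.
# Assumption: the blank cells are empty strings, not NaN.
df_manual_label = pd.DataFrame({
    "Str1": ["OK", "Yes", "Fine"],
    "Str2": ["I", "", "I see"],
    "Str3": ["Go", "Hm", ""],
})

cols = ["Str1", "Str2", "Str3"]

# The attempt from the question: each row becomes a single joined string.
joined = df_manual_label[cols].apply(lambda x: ", ".join(x.astype(str)), axis=1)

print(joined.tolist())
# ['OK, I, Go', 'Yes, , Hm', 'Fine, I see, ']
```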

1 answer:

Answer 0 (score: 3)

You can use stack, after replacing the empty strings with NaN so that they are dropped:

df['New'] = df.replace('', np.nan).stack().groupby(level=0).apply(list)
df
Out[1666]:
   Str1   Str2 Str3            New
0    OK      I   Go    [OK, I, Go]
1   Yes          Hm      [Yes, Hm]
2  Fine  I see       [Fine, I see]
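The stack approach can be tried end to end with the sketch below. The sample frame is reconstructed from the question, assuming blank cells are empty strings. The explicit `dropna()` is an addition, not part of the original answer: older pandas versions drop NaN inside `stack()` by default, while the newer implementation keeps them, so dropping explicitly should behave the same in both. The second assignment, to the `AllStr` column named in the question, is an alternative using a plain list comprehension and is likewise not from the answer.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
# Assumption: the blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "Str1": ["OK", "Yes", "Fine"],
    "Str2": ["I", "", "I see"],
    "Str3": ["Go", "Hm", ""],
})
cols = ["Str1", "Str2", "Str3"]

# Approach from the answer: blank -> NaN, stack to long form, collect per row.
# dropna() is added so the NaN entries are removed regardless of how
# stack() treats them in the installed pandas version.
stacked = df[cols].replace("", np.nan).stack().dropna()
df["New"] = stacked.groupby(level=0).apply(list)

# Alternative sketch: filter the empty strings row by row.
df["AllStr"] = [[s for s in row if s != ""] for row in df[cols].to_numpy().tolist()]

print(df["New"].tolist())
# [['OK', 'I', 'Go'], ['Yes', 'Hm'], ['Fine', 'I see']]
```

One difference between the two: a row whose cells are all blank disappears from the stacked result and ends up as NaN in `New`, whereas the list comprehension yields an empty list for it.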

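import pandas as pd
import matplotlib.pyplot as plt
# Jupyter/IPython only: render figures inline
%matplotlib inline

# df is assumed to be an existing DataFrame containing ACCOUNT, cases, and the columns in `variables`
fig, axes = plt.subplots(nrows=1, ncols=3, sharex=False, sharey=True, figsize=(5, 5))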

aggregators = {'ACCOUNT':'count', 'cases': 'sum'}
variables = ['PAPERLESS', 'More PAPERLESS', 'PAPEERLESS NOT'] #For example
# One way to get all the variables:
# variables = list(df.columns)
# variables.remove('ACCOUNT')
# variables.remove('cases')

for variable, ax in zip(variables, axes):
    # Select both columns with a list; a bare tuple of column names is no longer accepted by pandas
    mid = df.groupby(variable)[['ACCOUNT', 'cases']].agg(aggregators)  # Apply one aggregation per column
    percts = (mid['ACCOUNT'] / mid['cases']) * 100  # Series of percentages, the only thing that gets plotted
    percts.plot(kind='bar', ax=ax)  # Plot only the percentages
    ax.set_title(variable)