Iterating over a pandas column of lists with a dictionary

Time: 2021-07-08 20:05:21

Tags: python pandas iteration

I have a column of lists in a pandas DataFrame:

0: [car, telephone]
1: [computer, beach, book, language]
2: [rice, bus, street]

Each row holds one list, and the lists have different lengths. I also have a dictionary:

dict = {'car': 'transport',
'rice': 'food',
'book': 'reading'
}

After that I flattened the dictionary:

d = {val:key for key, lst in dict.items() for val in lst}

I want to iterate over all the items in each list and create a column like the one below. This is the desired output:

index col1  col2
    0: [car, telephone], transport
    1: [computer, beach, book, language], reading
    2: [rice, bus, street], food

I tried:

  df['col2'] = data_df['col1'].index.map(d)

But that does not give me the desired output.

2 answers:

Answer 0: (score: 1)

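You can .explode the column, translate the items with the dictionary, and then assign the translated values back to the original frame by index: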

Sample data:

import pandas as pd
data = {'id': {0: 1, 1: 2, 2: 3}, 'col': {0: ['car', 'telephone'], 1: ['computer', 'beach', 'book', 'language'], 2: ['rice', 'bus', 'street']}}
df = pd.DataFrame(data)

dct = {'car': 'transport', 'rice':'food', 'book':'reading'}

Code:

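# One row per list item; the original row labels repeat in the index
df2 = df.explode('col')
# Replace known items with their category; unknown items stay unchanged
df2['col2'] = df2['col'].replace(dct)
# Keep only the rows that were replaced and align them back on the index
df['col2'] = df2[~df2['col'].eq(df2['col2'])]['col2']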

Output:

   id                                col       col2
0   1                   [car, telephone]  transport
1   2  [computer, beach, book, language]    reading
2   3                [rice, bus, street]       food
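
The assignment above works when each row contains exactly one item that is a key of dct. A row with several matching items would leave duplicate index labels in the filtered Series, which pandas cannot align back unambiguously. One possible variant, sketched here under the assumption that the first matching category per row is what you want, groups the exploded rows by their original index. The names exploded and mapped are illustrative and not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'col': [['car', 'telephone'],
            ['computer', 'beach', 'book', 'language'],
            ['rice', 'bus', 'street']],
})
dct = {'car': 'transport', 'rice': 'food', 'book': 'reading'}

# One row per list item; the original row label is repeated in the index.
exploded = df['col'].explode()

# map() yields NaN for items that are not keys of dct.
mapped = exploded.map(dct)

# Group by the original row label and keep the first non-missing category.
# Rows without any match end up with a missing value.
df['col2'] = mapped.groupby(level=0).first()

print(df)
```

Using Series.map instead of replace means unmatched items become missing values rather than staying unchanged, so no comparison against the original column is needed.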

Answer 1: (score: 1)

You can use apply with a custom function:

import pandas as pd

df = pd.DataFrame([{'col1': ['car', 'telephone']}, {'col1': ['computer', 'beach', 'book', 'language']}, {'col1': ['rice', 'bus', 'street']}])


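def get_col2(lst):
    # Return the category of the first dictionary key found in lst,
    # or None if no key is present.
    d = {'car': 'transport', 'rice': 'food', 'book': 'reading'}
    for k, v in d.items():
        if k in lst:
            return v
    return None

df['col2'] = df['col1'].apply(get_col2)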

Output:

                                col1       col2
0                   [car, telephone]  transport
1  [computer, beach, book, language]    reading
2                [rice, bus, street]       food
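
Note that get_col2 walks the dictionary, so it returns the first match in dictionary order and stops there. If a list may contain several known items and you want all of their categories, a list comprehension over the list itself is one option. This is only a sketch: the helper name get_categories and the choice to return a list are assumptions, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': [['car', 'telephone'],
                            ['computer', 'beach', 'book', 'language'],
                            ['rice', 'bus', 'street']]})

# Lookup table from the question: item -> category.
d = {'car': 'transport', 'rice': 'food', 'book': 'reading'}

def get_categories(lst):
    """Hypothetical helper: categories of all known items, in list order."""
    return [d[item] for item in lst if item in d]

# Each cell of col2 is a (possibly empty) list of categories.
df['col2'] = df['col1'].apply(get_categories)

print(df)
```

A row with no known item gets an empty list instead of None, which keeps the column type uniform.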