Change all values in a pandas column to the first occurrence when the values in another column are the same

Time: 2019-07-19 19:29:36

Tags: python pandas date

Where the values in the text column are the same, I want to change all the dates in the date column to the earliest date.

import pandas as pd

df = pd.DataFrame({
    'text': ['I like python pandas',
             'find all function input from help jupyter',
             'function input',
             'function input',
             'function input'],
    'date': ['March 1st', 'March 2nd', 'March 3rd', 'March 4th', 'March 5th'],
})

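So I want to change March 4th and March 5th to March 3rd, because that is the earliest date on which 'function input' appears in the text column. Any help would be greatly appreciated.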

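3 answers:

Answer 0: (score: 1)

You can group by text, take the first date in each group, and then join the result back onto the original frame. Something like this: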

new_df = df.set_index('text').join(df.groupby('text').first(), lsuffix='_old')

Then print(new_df) shows:

                                            date_old       date
text                                                           
I like python pandas                       March 1st  March 1st
find all function input from help jupyter  March 2nd  March 2nd
function input                             March 3rd  March 3rd
function input                             March 4th  March 3rd
function input                             March 5th  March 3rd
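
The joined result above keeps the old dates and uses text as the index. If a frame shaped like the original is wanted, it can be tidied up afterwards. The following is a minimal sketch, assuming the df from the question; dropping date_old and resetting the index are additions for illustration, not part of the answer itself.

```python
import pandas as pd

df = pd.DataFrame({
    'text': ['I like python pandas',
             'find all function input from help jupyter',
             'function input',
             'function input',
             'function input'],
    'date': ['March 1st', 'March 2nd', 'March 3rd', 'March 4th', 'March 5th'],
})

# First date seen for each distinct text (one row per text).
first_dates = df.groupby('text').first()

# Same join as in the answer: old dates get the '_old' suffix.
new_df = df.set_index('text').join(first_dates, lsuffix='_old')

# Assumed clean-up step: drop the old dates and turn 'text' back into a column.
result = new_df.drop(columns='date_old').reset_index()
print(result)
```

Here result has the two columns text and date again, with a fresh integer index.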

Answer 1: (score: 1)

You can do this:

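def update_col(col):
    # Overwrite every value in the group with the group's first value.
    col[:] = col.iloc[0]
    return col

# group_keys=False keeps the original row index on the result, so it
# aligns with df on assignment (newer pandas would otherwise add the group key).
df['date'] = df.groupby('text', group_keys=False).date.apply(update_col)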
df
#                                        text       date
# 0                       I like python pandas  March 1st
# 1  find all function input from help jupyter  March 2nd
# 2                             function input  March 3rd
# 3                             function input  March 3rd
# 4                             function input  March 3rd
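
The same "broadcast the first value of each group" idea can also be written without a helper function. This is a sketch of an alternative, not the answerer's code: it assumes the df from the question and uses the built-in 'first' aggregation with transform, which returns a result aligned to the original rows. Note that 'first' picks the first non-null value in each group.

```python
import pandas as pd

df = pd.DataFrame({
    'text': ['I like python pandas',
             'find all function input from help jupyter',
             'function input',
             'function input',
             'function input'],
    'date': ['March 1st', 'March 2nd', 'March 3rd', 'March 4th', 'March 5th'],
})

# transform returns one value per original row, so the result can be
# assigned straight back to the column without any index juggling.
df['date'] = df.groupby('text')['date'].transform('first')
print(df)
```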

Answer 2: (score: 1)

How about this?

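df1 = df.drop_duplicates(['text'], keep='first')  # first row for each distinct text
del df['date']  # drop the old dates from df (this modifies df in place)
df2 = pd.merge(df, df1, how='left', on=['text'])  # bring the first dates back in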

Output (df2):

                                        text       date
0                       I like python pandas  March 1st
1  find all function input from help jupyter  March 2nd
2                             function input  March 3rd
3                             function input  March 3rd
4                             function input  March 3rd
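
All three answers take the first row for each text, which matches the earliest date only because the rows are already in date order. If the rows could be out of order, the dates would have to be compared as dates rather than by position. The following is a rough sketch under stated assumptions: the ordinal suffixes (st, nd, rd, th) are stripped with a regular expression, and the year 2019 is an arbitrary placeholder, since the question gives no year.

```python
import pandas as pd

df = pd.DataFrame({
    'text': ['I like python pandas',
             'find all function input from help jupyter',
             'function input',
             'function input',
             'function input'],
    'date': ['March 1st', 'March 2nd', 'March 3rd', 'March 4th', 'March 5th'],
})

# Strip the ordinal suffix so 'March 3rd' becomes 'March 3'.
cleaned = df['date'].str.replace(r'(?<=\d)(st|nd|rd|th)$', '', regex=True)

# Assumption: every date falls in the same year; 2019 is a placeholder.
parsed = pd.to_datetime(cleaned + ' 2019', format='%B %d %Y')

# For each text, find the row label holding the chronologically earliest date.
earliest_row = parsed.groupby(df['text']).idxmin()

# Look up the original date string at that row, then map it onto every row.
earliest_date = earliest_row.map(df['date'])
df['date'] = df['text'].map(earliest_date)
print(df)
```

With the data from the question this gives the same result as the answers above, because the first occurrence is also the earliest date.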