Create a new column in a DataFrame from values in the rows of the same DataFrame

Time: 2018-09-12 17:49:57

Tags: python pandas dataframe

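I am trying to create a new column from values in the rows of the same DataFrame: for each row, the cell value in the Year column should be matched against the column headers, and the value found under that header in the same row should be placed in the Value column. Please see the attached image. Any suggestions?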

Input and Output Dataframes

2 answers:

Answer 0: (score: 1)

You may be looking for the lookup function.
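Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so it may not be available in a current installation. A minimal sketch of the same per-row lookup using NumPy indexing follows; the sample data and the column names Year and Value are assumptions based on the question's description, not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one "Year" column plus one column per year.
df = pd.DataFrame({
    "Year": [2008, 2011, 2010],
    2008: [1, 2, 3],
    2010: [4, 5, 6],
    2011: [7, 8, 9],
})

# Work only on the year columns so their labels share the dtype of "Year".
years = df[[2008, 2010, 2011]]

# Position of each row's Year among the year columns (-1 means "not found").
col_pos = years.columns.get_indexer(df["Year"])
if (col_pos == -1).any():
    raise KeyError("some values in 'Year' have no matching column")

# Pick one cell per row: row i, column col_pos[i].
df["Value"] = years.to_numpy()[np.arange(len(df)), col_pos]

print(df["Value"].tolist())
```

This touches only one cell per row, so it avoids building any intermediate table larger than the original data.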

Answer 1: (score: 0)

If you know that every year in the index also appears among your columns, you can try the following:

import pandas as pd
import numpy as np

df = pd.DataFrame(
    {2008: 5*[1], 2009: 5*[2], 2010: 5*[3], 2011: 5*[4], 2012: 5*[5]},
    index=[2008, 2011, 2010, 2009, 2012]
)

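# Reorder the columns to match the row labels, then read the diagonal:
# element i is the value in row i under the column named after row i's year.
np.diag(df.loc[df.index, df.index])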

Result:

array([1, 4, 3, 2, 5])
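To store that result as a new column, the array can be assigned back to the DataFrame. The sketch below also checks the stated assumption (every index year exists as a column) before indexing; the column name Value is taken from the question, and the explicit check is an addition for illustration rather than part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {2008: 5*[1], 2009: 5*[2], 2010: 5*[3], 2011: 5*[4], 2012: 5*[5]},
    index=[2008, 2011, 2010, 2009, 2012],
)

# Guard for the assumption: each year in the index must also be a column label.
if not df.index.isin(df.columns).all():
    raise KeyError("some index years are missing from the columns")

# Square selection with columns ordered like the rows; the diagonal holds
# the matching value for each row.
df["Value"] = np.diag(df.loc[df.index, df.index])

print(df["Value"].tolist())
```

Because the intermediate selection is n by n for n rows, this approach is convenient for small frames but may use a lot of memory on large ones.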