I am trying to multiply two DataFrames.
Df1
Name  Key  100  101  102  103  104
Abb   AB     2    6   10    5    1
Bcc   BC     1    3    7    4    2
Abb   AB     5    1   11    3    1
Bcc   BC     7    1    4    5    0
Df2
Key_1  100  101  102  103  104
AB      10    2    1    5    1
BC       1   10    2    2    4
Expected output
Name  Key  100  101  102  103  104
Abb   AB    20   12   10   25    1
Bcc   BC     1   30   14    8    8
Abb   AB    50    2   11   15    1
Bcc   BC     7   10    8   10    0
I tried grouping Df1 and then multiplying it by Df2, but that did not work. Please help me solve this problem.
Answer 0: (score: 4)
You can rename Key_1 in df2 to Key (to match df1), then set the index on both frames and call mul with level=1:
df1.set_index(['Name','Key']).mul(df2.rename(columns={'Key_1':'Key'})
.set_index('Key'),level=1).reset_index()
Or, similarly:
df1.set_index(['Name','Key']).mul(df2.set_index('Key_1')
.rename_axis('Key'),level=1).reset_index()
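As @QuangHoang correctly pointed out, you can also skip the rename entirely, because level=1 selects the index level by position: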
df1.set_index(['Name','Key']).mul(df2.set_index('Key_1'),level=1).reset_index()
Name Key 100 101 102 103 104
0 Abb AB 20 12 10 25 1
1 Bcc BC 1 30 14 8 8
2 Abb AB 50 2 11 15 1
3 Bcc BC 7 10 8 10 0
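For anyone who wants to reproduce this, here is a minimal self-contained sketch of the level-based multiplication. It assumes the column labels are strings and rebuilds the two frames from the question; factors and result are just illustrative names.

```python
import pandas as pd

cols = ['100', '101', '102', '103', '104']
df1 = pd.DataFrame(
    [['Abb', 'AB', 2, 6, 10, 5, 1],
     ['Bcc', 'BC', 1, 3, 7, 4, 2],
     ['Abb', 'AB', 5, 1, 11, 3, 1],
     ['Bcc', 'BC', 7, 1, 4, 5, 0]],
    columns=['Name', 'Key'] + cols,
)
df2 = pd.DataFrame(
    [['AB', 10, 2, 1, 5, 1],
     ['BC', 1, 10, 2, 2, 4]],
    columns=['Key_1'] + cols,
)

# Index df2 by its key so each row can be matched against the
# second level ('Key') of df1's (Name, Key) MultiIndex.
factors = df2.set_index('Key_1').rename_axis('Key')

# level=1 tells mul to broadcast df2's rows across df1's 'Key' level.
result = df1.set_index(['Name', 'Key']).mul(factors, level=1).reset_index()
print(result)
```

Note that df2's key has to be unique for this to work; df1 may repeat the same (Name, Key) pair, as it does here.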
Answer 1: (score: 3)
IIUC, you can use reindex_like:
df1.set_index('Key',inplace=True)
df1=df1.mul(df2.set_index('Key_1').reindex_like(df1).values).fillna(df1)
Out[235]:
Name 100 101 102 103 104
Key
AB Abb 20.0 12.0 10.0 25.0 1.0
BC Bcc 1.0 30.0 14.0 8.0 8.0
AB Abb 50.0 2.0 11.0 15.0 1.0
BC Bcc 7.0 10.0 8.0 10.0 0.0
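If you would rather not modify df1 in place or end up with floats, a variant of the same idea is to look up df2's row for each Key with reindex and multiply the underlying arrays. This is only a sketch under the assumption that the column labels are strings and every Key in df1 exists in df2; cols, factors and result are illustrative names.

```python
import pandas as pd

cols = ['100', '101', '102', '103', '104']
df1 = pd.DataFrame(
    [['Abb', 'AB', 2, 6, 10, 5, 1],
     ['Bcc', 'BC', 1, 3, 7, 4, 2],
     ['Abb', 'AB', 5, 1, 11, 3, 1],
     ['Bcc', 'BC', 7, 1, 4, 5, 0]],
    columns=['Name', 'Key'] + cols,
)
df2 = pd.DataFrame(
    [['AB', 10, 2, 1, 5, 1],
     ['BC', 1, 10, 2, 2, 4]],
    columns=['Key_1'] + cols,
)

# For every row of df1, fetch the matching df2 row (in df1's row order).
factors = df2.set_index('Key_1').reindex(df1['Key'])[cols].to_numpy()

# Multiply position by position on a copy, leaving df1 untouched.
result = df1.copy()
result[cols] = df1[cols].to_numpy() * factors
print(result)
```

Because the multiplication is positional, Name and Key are never touched and no fillna step is needed.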
Answer 2: (score: 3)
We can also use DataFrame.merge together with pd.Index.difference to select the columns to multiply.
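mul_cols = df1.columns.difference(['Name','Key'])
# Left-merge from df1 so the factors come back in df1's row order
# (merging from df2 would group the rows by key: AB, AB, BC, BC)
factors = df1[['Key']].merge(df2, left_on='Key', right_on='Key_1', how='left')[mul_cols]
df1.assign(**df1[mul_cols].mul(factors))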
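One thing worth checking with the merge approach is row order: the product is aligned on the row index, so the merged factors need to come out in the same order as df1's rows. Below is a minimal sketch that left-merges from df1 to keep that order. It assumes string column labels and a default RangeIndex on df1; factors and result are illustrative names.

```python
import pandas as pd

cols = ['100', '101', '102', '103', '104']
df1 = pd.DataFrame(
    [['Abb', 'AB', 2, 6, 10, 5, 1],
     ['Bcc', 'BC', 1, 3, 7, 4, 2],
     ['Abb', 'AB', 5, 1, 11, 3, 1],
     ['Bcc', 'BC', 7, 1, 4, 5, 0]],
    columns=['Name', 'Key'] + cols,
)
df2 = pd.DataFrame(
    [['AB', 10, 2, 1, 5, 1],
     ['BC', 1, 10, 2, 2, 4]],
    columns=['Key_1'] + cols,
)

# Every column except the identifiers gets multiplied.
mul_cols = df1.columns.difference(['Name', 'Key'])

# how='left' keeps one output row per df1 row, in df1's order,
# with a fresh 0..n-1 index that lines up with df1's default index.
factors = df1[['Key']].merge(
    df2, left_on='Key', right_on='Key_1', how='left'
)[mul_cols]

# assign replaces the numeric columns and keeps Name and Key as they are.
result = df1.assign(**df1[mul_cols].mul(factors))
print(result)
```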