给出以下整洁的library(dplyr)
tibble(x=c(1,2,3)) %>%
mutate(y = runif(x))
# A tibble: 3 x 2
x y
<dbl> <dbl>
1 1 0.445
2 2 0.778
3 3 0.632
:
pd.DataFrame
我正在寻找一种透视import numpy as np
import pandas as pd
countries = np.tile(["A", "A", "A", "B", "B", "B", "C", "C", "C"], 2)
years = np.tile([2018, 2019, 2020], 6)
indicators = np.repeat(["ind_1", "ind_2"], 9)
vals = list(range(1, 19))
df = pd.DataFrame(data=[countries, years, indicators, vals]).T
df.columns = ["countries", "years", "indicators", "value"]
print(df)
countries years indicators value
0 A 2018 ind_1 1
1 A 2019 ind_1 2
2 A 2020 ind_1 3
3 B 2018 ind_1 4
4 B 2019 ind_1 5
5 B 2020 ind_1 6
6 C 2018 ind_1 7
7 C 2019 ind_1 8
8 C 2020 ind_1 9
9 A 2018 ind_2 10
10 A 2019 ind_2 11
11 A 2020 ind_2 12
12 B 2018 ind_2 13
13 B 2019 ind_2 14
14 B 2020 ind_2 15
15 C 2018 ind_2 16
16 C 2019 ind_2 17
17 C 2020 ind_2 18
的方式。列索引应具有两个级别:df
。也就是说,结果应如下所示:
["indicators", "years"]
下面的代码可以做到。但是,感觉有点奇怪。有没有更复杂的方法可以做到这一点?
indicators ind_1 ind_2
years 2018 2019 2020 2018 2019 2020
countries
A 1 2 3 10 11 12
B 4 5 6 13 14 15
C 7 8 9 16 17 18