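I have two DataFrames with different column names. I want to create a new DataFrame whose columns are the columns of both DataFrames concatenated. Its rows should be every possible combination of a row from the first DataFrame with a row from the second, that is, the Cartesian product, with len(df1) * len(df2) rows.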
For example, with the two DataFrames below, the result should contain all six (A, B) pairs:
df1 = pd.DataFrame({'A': ['1', '2']})
df2 = pd.DataFrame({'B': ['a', 'b', 'c']})
Answer 0 (score: 3)
import itertools
import pandas as pd

# Build every (A, B) pair, then wrap the pairs in a new DataFrame
pd.DataFrame(list(itertools.product(df1.A, df2.B)), columns=['A', 'B'])
   A  B
0  1  a
1  1  b
2  1  c
3  2  a
4  2  b
5  2  c
Answer 1 (score: 0)
The itertools.product() function will do what you want:
pd.DataFrame(list(itertools.product(df1.A,df2.B)),columns=['A','B'])
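For reference, the itertools documentation describes product() as roughly equivalent to the following pure-Python code: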
def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
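As a quick sanity check, here is a small sketch that runs the pure-Python definition on the sample values from the question and compares it with the real itertools.product. The name product_sketch is hypothetical, chosen only to avoid shadowing the library function.

```python
import itertools


def product_sketch(*args, repeat=1):
    # Same logic as the definition above, under a hypothetical name.
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        # Extend every partial combination with every item of this pool.
        result = [x + [y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)


a_values = ['1', '2']        # values of df1.A in the question
b_values = ['a', 'b', 'c']   # values of df2.B in the question

pairs = list(product_sketch(a_values, b_values))
same_as_stdlib = pairs == list(itertools.product(a_values, b_values))

print(pairs)
print(same_as_stdlib)
```

The pairs come out with the first input varying slowest, which is why the resulting DataFrame lists all B values for A = 1 before moving on to A = 2.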
Answer 2 (score: 0)
You can use pd.MultiIndex:
(pd.DataFrame(index=pd.MultiIndex.from_product([df1['A'], df2['B']],
                                               names=['A', 'B']))
   .reset_index())
Output:
   A  B
0  1  a
1  1  b
2  1  c
3  2  a
4  2  b
5  2  c
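Another option, not mentioned in the answers above, may be a cross merge. This is only a minimal sketch, assuming pandas 1.2 or newer (where merge accepts how='cross'). It has the advantage of working unchanged when either DataFrame has several columns, as long as the column names do not clash.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['1', '2']})
df2 = pd.DataFrame({'B': ['a', 'b', 'c']})

# how='cross' pairs every row of df1 with every row of df2
# (assumption: pandas >= 1.2; older versions do not accept this value).
combined = df1.merge(df2, how='cross')

print(combined)
```

The result should have len(df1) * len(df2) rows and the columns of both inputs side by side.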