I have data in Excel as follows:
category size1 size2 size3
cat1 10 20 30
cat2 20 10 15
cat3 30 20 10
I want two reports / Excel outputs as follows:
#1)
Category-sizetype-value
cat1 size1 10
cat1 size2 20
cat1 size3 30
cat2 size1 20
...
#2)
Category-size-value-value counts (i.e. how many times a specific size value appears)
cat1 size1 10 3 times
cat1 size2 20 2 times
cat1 size3 30 1 time
cat2 size1 20 4 times
...
Here is the code I have written so far. I would appreciate some pointers: why doesn't pd.concat work here? And it cannot ...
import pandas as pd
path_to_file = r'C:\Users\Niru\Desktop\cat-sizes.xlsx'
xl = pd.ExcelFile(path_to_file)
print(xl.sheet_names)
df = xl.parse('Sheet1')
#print(df.head())
print(df.columns)
frames = []
for i in df.columns:
    dfd = "df.loc[:,['Category','" +i+"']]"
    frames.append(dfd)
print(pd.concat(frames))
Answer 0 (score: 1)
Your sample data and output confuse me, but I think this is what you want.
#Q1:
df1 = pd.melt(df, id_vars=['category'], value_vars=['size1', 'size2', 'size3'])
Out[66]:
  category variable  value
0     cat1    size1     10
1     cat2    size1     20
2     cat3    size1     30
3     cat1    size2     20
4     cat2    size2     10
5     cat3    size2     20
6     cat1    size3     30
7     cat2    size3     15
8     cat3    size3     10
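The melted rows above are grouped by size column, while the report requested in the question lists every size for cat1 first, then cat2, and so on. A minimal sketch of one way to get that order, assuming the sheet holds exactly the sample data from the question (the DataFrame is built inline here instead of read from Excel, and the names `report1` and `sizetype` are illustrative choices):

```python
import pandas as pd

# Sample data from the question, built inline (assumption: the Excel
# sheet contains exactly these rows and a lowercase 'category' header).
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

# Melt to long format, then sort so each category's sizes sit together.
report1 = (
    pd.melt(df, id_vars=["category"], var_name="sizetype", value_name="value")
    .sort_values(["category", "sizetype"])
    .reset_index(drop=True)
)
print(report1)
```

`var_name` and `value_name` only rename the two generated columns; leaving them out keeps the default names `variable` and `value` used in the rest of this answer.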
#Q2:
df1['counts'] = df1.groupby(['variable', 'value'])['category'].transform('count')
Out[69]:
  category variable  value  counts
0     cat1    size1     10       1
1     cat2    size1     20       1
2     cat3    size1     30       1
3     cat1    size2     20       2
4     cat2    size2     10       1
5     cat3    size2     20       2
6     cat1    size3     30       1
7     cat2    size3     15       1
8     cat3    size3     10       1
Or, for Q2:
df1['counts'] = df1.groupby('variable')['value'].transform('count')
Out[71]:
  category variable  value  counts
0     cat1    size1     10       3
1     cat2    size1     20       3
2     cat3    size1     30       3
3     cat1    size2     20       3
4     cat2    size2     10       3
5     cat3    size2     20       3
6     cat1    size3     30       3
7     cat2    size3     15       3
8     cat3    size3     10       3
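As for why pd.concat fails in the question's loop: each `dfd` is a string that only looks like code, so `frames` ends up as a list of `str`, and pd.concat raises a TypeError because it accepts only Series and DataFrame objects. A rough sketch of a loop that appends real DataFrames instead, assuming the sample data from the question and a lowercase `category` column (the names `part` and `long_df` are illustrative):

```python
import pandas as pd

# Sample data from the question, built inline (assumption: the Excel
# sheet contains exactly these rows and a lowercase 'category' header).
df = pd.DataFrame({
    "category": ["cat1", "cat2", "cat3"],
    "size1": [10, 20, 30],
    "size2": [20, 10, 20],
    "size3": [30, 15, 10],
})

frames = []
# Skip the id column so only the size columns are looped over.
for col in df.columns.drop("category"):
    # Select an actual two-column DataFrame and give every piece the
    # same column names so the pieces stack cleanly.
    part = df[["category", col]].rename(columns={col: "value"})
    part.insert(1, "sizetype", col)
    frames.append(part)

long_df = pd.concat(frames, ignore_index=True)
print(long_df)
```

This produces the same long table as pd.melt, which is the shorter way to write it; the loop is shown only to make the difference from the original code visible.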