I have the following DataFrame:
print(inventory_df)
dt_op     Prod_1  Prod_2  ...  Prod_n
10/09/18       0       8            0
10/09/18       5       0            2
11/09/18       4       0            0
11/09/18       0      10            0
...
And I would like to get:
print(final_df)
dt_op     Prod_1  Prod_2  ...  Prod_n
10/09/18       5       8            2
11/09/18       4      10            0
...
My attempts so far have not produced the desired output. How can I create final_df?
Answer 0 (score: 1)
You can use the pandas groupby function together with sum():
In [412]: inventory_df
Out[412]:
      dt_op  Prod_1  Prod_2
0  10/09/18       0       8
1  10/09/18       5       0
2  11/09/18       4       0
3  11/09/18       0      10

In [413]: inventory_df.groupby('dt_op').sum()
Out[413]:
          Prod_1  Prod_2
dt_op
10/09/18       5       8
11/09/18       4      10
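As a self-contained sketch of the same idea, the snippet below rebuilds the two-product sample from the question and groups it by dt_op. Passing as_index=False is an addition not shown in the answer; it keeps dt_op as a regular column so the result has the same layout as the final_df in the question.

```python
import pandas as pd

# Sample data copied from the question (only Prod_1 and Prod_2 are shown there).
inventory_df = pd.DataFrame({
    "dt_op": ["10/09/18", "10/09/18", "11/09/18", "11/09/18"],
    "Prod_1": [0, 5, 4, 0],
    "Prod_2": [8, 0, 0, 10],
})

# Sum every product column within each date.
# as_index=False keeps dt_op as a column instead of turning it into the index.
final_df = inventory_df.groupby("dt_op", as_index=False).sum()

print(final_df)
```

Without as_index=False the same totals come back with dt_op as the index, as in Out[413]; calling .reset_index() on that result should give the same layout.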
Answer 1 (score: 0)
This just reproduces the stated DataFrame; what you are asking for, summing across the rows of each date, is groupby + sum().
The reproduced DataFrame:
>>> df
      dt_op  Prod_1  Prod_2  Prod_n
0  10/09/18       0       8       0
1  10/09/18       5       0       2
2  11/09/18       4       0       0
Grouping on the dt_op column and calling sum() adds up each product column within every date. No axis argument is needed here, since the grouped sum already runs down the rows of each group:
>>> df.groupby('dt_op').sum()
          Prod_1  Prod_2  Prod_n
dt_op
10/09/18       5       8       2
11/09/18       4       0       0
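However, if what you are looking for is a literal sum() of each row across its columns, use axis=1 (numeric_only=True skips the dt_op strings):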
>>> df['new_sum'] = df.sum(axis=1, numeric_only=True)
>>> df
      dt_op  Prod_1  Prod_2  Prod_n  new_sum
0  10/09/18       0       8       0        8
1  10/09/18       5       0       2        7
2  11/09/18       4       0       0        4
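The row-wise variant can be sketched as a standalone script as well. Selecting the product columns by their "Prod_" prefix is an assumption made for illustration; it keeps the dt_op strings out of the sum without relying on how a given pandas version treats non-numeric columns.

```python
import pandas as pd

# Three-row sample reproduced from the answer above.
df = pd.DataFrame({
    "dt_op": ["10/09/18", "10/09/18", "11/09/18"],
    "Prod_1": [0, 5, 4],
    "Prod_2": [8, 0, 0],
    "Prod_n": [0, 2, 0],
})

# Assumption: every product column starts with "Prod_".
prod_cols = [col for col in df.columns if col.startswith("Prod_")]

# axis=1 sums across the columns of each row, giving one total per row.
df["new_sum"] = df[prod_cols].sum(axis=1)

print(df)
```

This gives one total per original row, whereas the groupby version gives one total per product and date, so the two answer different questions.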