I want to sum the `has_filed_paperwork` column for each manager across all of their employees, all of those employees' employees, and so on.
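I looked at the networkx library because it seemed able to do this, but I could only work out how to count all of the employees below a manager, not how to count them conditionally.
I did try splitting the DataFrame into has_filed and has not filed and then counting each part with networkx, but that broke the relationships, so people went missing.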
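Why splitting first loses people can be seen in a minimal sketch. The three-person chain below is hypothetical (it is not the data from this question), and `reachable_reports` is an illustrative helper built on plain dictionaries rather than networkx so that it runs on its own. Once the rows are filtered on the filed flag, the middle manager disappears and the employee below them can no longer be reached from the top.

```python
# Hypothetical rows: (emp_id, manager_id, has_filed). Not the question's data.
rows = [
    ("boss", None, True),
    ("mid", "boss", False),   # has not filed, but still manages "leaf"
    ("leaf", "mid", True),
]

def reachable_reports(rows, root):
    """Return everyone below `root`, following manager -> report links."""
    reports = {}
    for emp, mgr, _ in rows:
        if mgr is not None:
            reports.setdefault(mgr, []).append(emp)
    found, stack = set(), [root]
    while stack:
        for child in reports.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

# Full hierarchy: both reports are reachable from the top.
full = reachable_reports(rows, "boss")

# Split first: keep only the rows that have filed, then traverse.
filed_only = reachable_reports([r for r in rows if r[2]], "boss")

print(sorted(full))
print(sorted(filed_only))
```

In the filtered version "leaf" still exists as a row, but its only link to "boss" went through "mid", which was filtered out.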
Here is a sample dataframe.
WITH CTE (COMMON,DayMinus)
AS
(
SELECT 1,0 UNION ALL
SELECT 1,1 UNION ALL
SELECT 1,2 UNION ALL
SELECT 1,3 UNION ALL
SELECT 1,4 UNION ALL
SELECT 1,5 UNION ALL
SELECT 1,6 UNION ALL
SELECT 1,7 UNION ALL
SELECT 1,8 UNION ALL
SELECT 1,9
)
SELECT YEAR(your_date_column) YR,
DATEPART(ISO_WEEK, your_date_column) WK,
*
FROM your_table
WHERE YEAR(your_date_column) IN (2019,2018)
AND DATEPART(ISO_WEEK, your_date_column) IN
(
SELECT A.WKNUM-CTE.DayMinus AS [WEEK NUMBER]
FROM CTE
INNER JOIN (
SELECT 1 AS COMMON,DATENAME(ISO_WEEK,GETDATE()) WKNUM
) A ON CTE.COMMON = A.COMMON
)
I would like the output to look like the following; the code below only builds a dataframe that I created to demonstrate the output.
has_filed_paperwork
Answer 0 (score: 0)
import networkx as nx
import pandas as pd

G = nx.DiGraph()
# Iterate through the dataframe
for index, row in df.iterrows():
    # Create a node with a 'has_filled' attribute
    G.add_node(row['emp_id'], has_filled=row['has_filled'])
    # If the manager is not missing (NaN/None), add an edge from the employee to the manager
    if pd.notna(row['manager_id']):
        G.add_edge(row['emp_id'], row['manager_id'])

result_dict = {
    'emp_id': [],
    'has_filled_count': [],
    'has_not_filled_count': []
}
# Iterate through graph nodes
for n in G.nodes():
    # Get the 'has_filled' attribute for all ancestors plus the current node.
    # Edges point from employee to manager, so the ancestors of n are everyone below n.
    # We can do it because our graph is a tree, and a tree is a special case of a DAG
    counted = [G.nodes[anc]['has_filled'] for anc in nx.ancestors(G, n) | {n}]
    # Fill the dict
    result_dict['emp_id'].append(n)
    result_dict['has_filled_count'].append(counted.count(True))
    result_dict['has_not_filled_count'].append(counted.count(False))

# And convert it to the dataframe
df_o = pd.DataFrame(result_dict)
df_o
   emp_id  has_filled_count  has_not_filled_count
0       1                 0                     1
1       5                 2                     1
2       2                 1                     0
3       3                 1                     0
4       8                 3                     1
5       4                 1                     0
6       7                 3                     2
7       6                 0                     1
8       9                 7                     3
9      10                 1                     0
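For comparison, the same kind of per-manager count can be sketched without networkx, using a plain recursive traversal. This is an alternative technique, not the answer's method. The hierarchy below and the names `employees`, `reports` and `count_filed` are hypothetical and chosen only for illustration, and the sketch assumes the hierarchy has no cycles.

```python
from collections import defaultdict

# Hypothetical hierarchy: emp_id -> (manager_id, has_filled). Not the question's data.
employees = {
    "A": (None, False),
    "B": ("A", True),
    "C": ("A", False),
    "D": ("B", True),
    "E": ("B", True),
}

# Invert the manager links once: manager_id -> list of direct reports.
reports = defaultdict(list)
for emp_id, (manager_id, _) in employees.items():
    if manager_id is not None:
        reports[manager_id].append(emp_id)

def count_filed(emp_id):
    """Return (filled, not_filled) for emp_id and everyone below them."""
    filled = int(employees[emp_id][1])
    not_filled = 1 - filled
    for child in reports[emp_id]:
        child_filled, child_not_filled = count_filed(child)
        filled += child_filled
        not_filled += child_not_filled
    return filled, not_filled

counts = {emp_id: count_filed(emp_id) for emp_id in employees}
for emp_id, (filled, not_filled) in counts.items():
    print(emp_id, filled, not_filled)
```

As in the answer, each person is counted in their own total, so the two counts for a node always add up to the size of that node's subtree.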