Question

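I have been assigned to debug a long-running Python script in prod. The problem is that the script contains no print or debug statements. I want to check whether Python maintains any internal logging that tracks which command is currently running.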

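For example, my current script is shown below. The location here contains huge files, and the process runs on each file. The script has been running for the last 14 hours, and I am unable to find which command is currently executing. So any internal Python log that keeps track of the currently running command would be helpful to me. I just need help with the log file directory, or with how to find such a directory.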

    ...
# Read data
for fname in glob('<location>*'):
    df = pd.read_csv(fname,header=None,sep=',')
    # the trigger feature here needs to be modified every time
    df.columns = [colnames]
    df = df[df.cps_count>0].replace(r'\s+',np.nan,regex=True).replace('\\N',np.nan)
    df = pd.get_dummies(df,columns=['prim_ppt']).fillna(np.nan)
    cols_obj = df.columns[df.dtypes.eq('object')]
    df[cols_obj] = df[cols_obj].apply(pd.to_numeric, errors='coerce')
    #this = pd.concat([ids.reset_index()for i in np.setdiff1d(cols,df.columns):df[i] = 0,pd.DataFrame(scores)],axis=1)
    xref_ids = df['cust_xref_id']
    # add any model columns missing from this file, filled with 0
    for i in np.setdiff1d(cols,df.columns):
        df[i] = 0
    #xre_id needs to be replaced
    #feature_importance also needs to be replaced
    feature_importance = model.predict(xgb.DMatrix(df[[i for i in cols]],df['res'],missing=np.nan),pred_contribs=True)
    combined=np.c_[feature_importance,xref_ids]
    df_result=pd.DataFrame(combined,columns=cols+['bias_term','cust_xref_id'])
    dfs.append(df_result)
final= pd.concat(dfs,axis=0)
#need to adjust for every model
df_result2=final[[colnames]]
#need to adjust for every model
df_rank=df_result2[[<somecolnames>]].rank(axis=1,method='first',numeric_only=None,na_option='keep',ascending=False,pct=False)
df_rank['cust_xref_id']=df_result2['cust_xref_id']
#drop the null cust_xref_id
df_rank=df_rank[df_rank['cust_xref_id'].notnull()]
df_rank['cust_xref_id']=df_rank['cust_xref_id'].astype('int')
#Data transformation
#transform from wide to long type
df_final=pd.melt(df_rank,id_vars=['cust_xref_id'],value_vars=[<somecolnames>])
df_final = df_final.sort_values(by=["cust_xref_id", "variable"])
#Note: output file has to be tab separated -- according to Sahil from CSP core team
if not os.path.exists('<outputPath>'):
    os.makedirs('<outputPath>')
df_final.to_csv('<outputfile>',index=None,header=None,sep='\t')
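
For context on what is being asked: as far as I know, CPython does not keep an internal log file of the statement it is currently executing, so there is no log directory to look in. Two standard-library options that come close are per-file progress logging and a stack dump via `faulthandler`. The sketch below is only an illustration under assumptions: the logger name `scoring_job` and the helpers `process_files` and `dump_current_stack` are hypothetical and not part of the script above.

```python
import faulthandler
import logging
import tempfile

# Hypothetical logger name and format; not taken from the original script.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("scoring_job")


def process_files(fnames, handle):
    """Call handle(fname) for each file, logging before and after each one."""
    results = []
    total = len(fnames)
    for idx, fname in enumerate(fnames, start=1):
        log.info("start %d/%d: %s", idx, total, fname)
        results.append(handle(fname))
        log.info("done %d/%d: %s", idx, total, fname)
    return results


def dump_current_stack():
    """Return the current traceback of all threads as text."""
    # faulthandler writes straight to the file descriptor, so a real
    # file (not io.StringIO) is needed here.
    with tempfile.TemporaryFile(mode="w+") as fh:
        faulthandler.dump_traceback(file=fh, all_threads=True)
        fh.seek(0)
        return fh.read()


if __name__ == "__main__":
    # Stand-in for the per-file work done inside the loop of the real script.
    print(process_files(["a.csv", "b.csv"], str.upper))
    print(dump_current_stack())
```

On Unix, calling `faulthandler.register(signal.SIGUSR1)` at startup should make `kill -USR1 <pid>` print the same kind of stack dump to stderr while the job keeps running. Both options require changing the script before it starts. For a process that is already running, a third-party sampling tool such as py-spy (`py-spy dump --pid <pid>`) may be able to show the current stack without a restart, assuming it can be installed on the prod host.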

How to identify the running command in a Python script

0 answers: