Python: manipulating a DataFrame in a user-defined function

Date: 2017-05-29 07:08:47

Tags: python function

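I want to manipulate a DataFrame inside a user-written function in Python. The manipulation code works fine when I run it outside a function. However, when I put it in a function and call that function, it runs without errors but does not return any DataFrame. My code looks like this: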

def reshape(file):
  from IPython import get_ipython
  get_ipython().magic('reset -sf')

  #import packages
  import pandas as pd
  import datetime
  import calendar


  #define file path and import files
  path="X:/TEMP/"
  file_path =path+file
  df = pd.read_excel(file_path, "Sheet1", parse_dates=["Date"])
  #reshape data to panel
  melted = pd.melt(df,id_vars="Date", var_name="id", value_name="Market_Cap")
  melted["id"] = melted["id"].str.replace("id", "")
  melted.id = melted.id.astype(int)
  melted.reset_index(inplace=True, drop=True)

  id_to_string = pd.read_excel(file_path, "Sheet2")
  id_to_string = id_to_string.transpose()

  id_to_string.reset_index(level=0, inplace=True)
  id_to_string.rename(columns = {0: 'id'}, inplace=True)
  id_to_string.rename(columns = {"index": 'Ticker'}, inplace=True)

  merged = pd.merge(melted, id_to_string, how="left", on="id")
  merged = merged.sort(["Date","Market_Cap"], ascending=[1,0])

  merged["Rank"] = merged.groupby(["Date"])["Market_Cap"].rank(ascending=True)

  df = pd.read_excel(file_path, "hardcopy_return", parse_dates=["Date"])
  df = df.sort("Date", ascending=1)

  old = merged
  merged = pd.merge(old,df,  on=["Date", "id"])
  merged = merged.set_index("Date") 
  return merged
reshape("sample.xlsx")

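This code runs but returns nothing. Did I make a mistake in the def statement or in the way I call the function? Any help is much appreciated.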

1 answer:

Answer 0 (score: 1)

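I assume this is being run in IPython or a Jupyter notebook? It may have worked before because the kernel remembers some state. Before turning something into a separate function instead of a straight script, I do a Restart Kernel & Run All.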
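
As a rough illustration of how remembered kernel state can hide a mistake, here is a minimal sketch. The function name `melt_caps` and the sample values are made up for this example; `del df` only approximates what a kernel restart does to the namespace.

```python
import pandas as pd

def melt_caps(frame):
    # Bug on purpose: refers to the global name `df`, not the `frame` argument.
    return pd.melt(df, id_vars="Date", var_name="id", value_name="Market_Cap")

# Pretend an earlier notebook cell left this DataFrame behind in the kernel.
df = pd.DataFrame({"Date": ["2017-01-31"], "id1": [100.0]})

fresh = pd.DataFrame({"Date": ["2017-02-28"], "id1": [250.0]})
stale_result = melt_caps(fresh)  # runs without error, but uses the old `df`

del df  # roughly what restarting the kernel does to leftover names
try:
    melt_caps(fresh)
    failed = False
except NameError:
    failed = True  # the hidden dependency on `df` is now visible

print(stale_result)
print("fails after cleanup:", failed)
```

Running everything from a clean kernel surfaces this kind of dependency right away instead of letting old variables mask it.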

As for the code itself, I would split it into separate parts, so that testing each part individually becomes easier.

Imports

import pandas as pd
import datetime
import calendar

from IPython import get_ipython
get_ipython().magic('reset -sf')

Reading 'Sheet1'

Get the data from the first worksheet and do the first round of processing

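def read_melted(file_path):
    # sheet_name was called sheetname before pandas 0.21
    df1 = pd.read_excel(file_path, sheet_name='Sheet1', parse_dates=["Date"])
    melted = pd.melt(df1, id_vars="Date", var_name="id", value_name="Market_Cap")
    # strip the "id" prefix from the column names, as in the question's code
    melted["id"] = melted["id"].str.replace("id", "")
    melted.id = melted.id.astype(int)
    melted.reset_index(inplace=True, drop=True)
    return melted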

Reading 'Sheet2'

def read_id_to_spring(file_path):
    # sheet_name was called sheetname before pandas 0.21
    df2 = pd.read_excel(file_path, sheet_name='Sheet2')
    id_to_string = df2.transpose()
    id_to_string.reset_index(level=0, inplace=True)
    id_to_string.rename(columns = {0: 'id'}, inplace=True)
    id_to_string.rename(columns = {"index": 'Ticker'}, inplace=True)
    return id_to_string

Reading 'hardcopy_return'

def read_hardcopy_return(file_path):
    df = pd.read_excel(file_path, sheet_name='hardcopy_return', parse_dates=["Date"])
    # DataFrame.sort was removed from pandas; sort_values is its replacement
    return df.sort_values("Date", ascending=True)

Tying them together

def reshape(df1, df2, df_hardcopy_return):
    # DataFrame.sort was removed from pandas; sort_values is its replacement
    merged = pd.merge(df1, df2, how="left", on="id").sort_values(["Date", "Market_Cap"], ascending=[True, False])
    merged["Rank"] = merged.groupby(["Date"])["Market_Cap"].rank(ascending=True)  # what does this line do?
    merged_all = pd.merge(merged, df_hardcopy_return, on=["Date", "id"]).set_index("Date")
    return merged_all

Calling everything

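path = "X:/TEMP/"
file = "sample.xlsx"  # the file name passed in the question
file_path = path + file

df1 = read_melted(file_path)
df2 = read_id_to_spring(file_path)
df_hardcopy_return = read_hardcopy_return(file_path)
# keep the returned DataFrame in a variable so it can be used afterwards
merged = reshape(df1, df2, df_hardcopy_return)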
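
Because the tying-together step takes DataFrames rather than a file name, it can be tried on small hand-made frames without any Excel file. The sketch below repeats that step (written with `sort_values`) and feeds it invented data; the tickers `AAA`/`BBB`, the numbers, and the `Return` column name are assumptions made for illustration, not taken from the question's workbook.

```python
import pandas as pd

def reshape(df1, df2, df_hardcopy_return):
    # Same steps as the tying-together function, using sort_values.
    merged = pd.merge(df1, df2, how="left", on="id").sort_values(
        ["Date", "Market_Cap"], ascending=[True, False])
    merged["Rank"] = merged.groupby(["Date"])["Market_Cap"].rank(ascending=True)
    return pd.merge(merged, df_hardcopy_return, on=["Date", "id"]).set_index("Date")

# Tiny stand-ins for the three sheets (all values are made up).
dates = pd.to_datetime(["2017-01-31", "2017-01-31"])
melted = pd.DataFrame({"Date": dates, "id": [1, 2], "Market_Cap": [10.0, 30.0]})
tickers = pd.DataFrame({"Ticker": ["AAA", "BBB"], "id": [1, 2]})
returns = pd.DataFrame({"Date": dates, "id": [1, 2], "Return": [0.01, -0.02]})

# Assign the return value; calling the function without keeping
# the result leaves nothing to work with afterwards.
result = reshape(melted, tickers, returns)
print(result)
```

Each reader function can be checked the same way on its own, which is the main benefit of splitting the original function up.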

The only thing that strikes me as odd is the line merged["Rank"] = merged.groupby(["Date"])["Market_Cap"].rank(ascending=True)
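
To see what that line computes, here is a small sketch on invented numbers. With `ascending=True` the smallest market cap within each date gets rank 1, which may or may not be what was intended; the `Rank_desc` column is added here only for comparison and is not part of the original code.

```python
import pandas as pd

caps = pd.DataFrame({
    "Date": ["2017-01-31"] * 3 + ["2017-02-28"] * 3,
    "id": [1, 2, 3, 1, 2, 3],
    "Market_Cap": [300.0, 100.0, 200.0, 50.0, 70.0, 60.0],
})

# Rank within each date: the smallest Market_Cap gets 1.0.
caps["Rank"] = caps.groupby(["Date"])["Market_Cap"].rank(ascending=True)

# For comparison: the largest Market_Cap gets 1.0.
caps["Rank_desc"] = caps.groupby(["Date"])["Market_Cap"].rank(ascending=False)

print(caps)
```

If the goal is for the largest company on each date to have rank 1, `ascending=False` is likely the variant to use.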

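read_excel sheetname

pandas.read_excel also has a sheetname parameter (renamed sheet_name in pandas 0.21). Passing it a list of sheet names reads several sheets in one call, so the file only has to be opened once. Reading Excel files can be slow, so this may also make things faster.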
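
A minimal sketch of that idea, assuming a recent pandas where the parameter is spelled `sheet_name` and an Excel engine such as openpyxl is installed. The helper name `read_sheets` is hypothetical. Dates are converted after reading, because `parse_dates` would be applied to every sheet, including ones without a `Date` column.

```python
import pandas as pd

def read_sheets(file_path, names=("Sheet1", "Sheet2", "hardcopy_return")):
    """Open the workbook once and return a dict {sheet name: DataFrame}."""
    # Passing a list of names makes read_excel return a dict of DataFrames.
    sheets = pd.read_excel(file_path, sheet_name=list(names))
    # Convert "Date" only in the sheets that actually have that column.
    for frame in sheets.values():
        if "Date" in frame.columns:
            frame["Date"] = pd.to_datetime(frame["Date"])
    return sheets
```

It could then be used along the lines of `sheets = read_sheets("X:/TEMP/sample.xlsx")` followed by `sheets["Sheet1"]`, with the three reader functions taking a DataFrame instead of a file path.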