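I am trying to build a dataframe holding a daily history of how many Nifty 50 stocks advanced and declined. Being new to Python, I am having trouble handling pandas dataframes and conditions.
Below is the code I have written, but it has a number of problems:
The line df.index = data.index raises: ValueError: Length mismatch: Expected axis has 0 elements, new values have 248 elements
If I comment out that line, where I set the index of the empty dataframe, the code runs and gives an empty DataFrame at the end.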
#setting default dates
end_date = date.today()
start_date = end_date - timedelta(365)
#Deriving the names of 50 stocks in Nifty 50 Index
nifty_50 = pd.read_html('https://en.wikipedia.org/wiki/NIFTY_50')
nifty50_symbols = nifty_50[1][1]
df = pd.DataFrame(columns = {'Advances','Declines','Adv_Volume','Dec_Volume'})
for x in nifty50_symbols:
    data = nsepy.get_history(symbol = x, start=start_date, end=end_date)
    sclose = data['Close']
    sopen = data['Open']
    svol = data['Volume']
    ## df.index = data.index
    ## for i in df.index: --- since df.index was commented out it's value was nill
    for i in data.index:
        if sclose > sopen:
            df['Advances'] = df['Advances'] + 1
            df['Adv_Volume'] = df['Adv_Volume'] + svol
        elif sopen > sclose:
            df['Declines'] = df['Declines'] + 1
            df['Dec_Volume'] = df['Dec_Volume'] + svol
print(df.tail())
Output:
Empty DataFrame
Columns: [Dec_Volume, Declines, Advances, Adv_Volume]
Index: []
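Edit: I found out why the code was giving an empty dataframe: df.index was empty, so the if statement was never reached. After I changed that part to data.index, the if statement is reached. But now I don't know how to write the if statement, because it gives this error: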
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
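EDIT2: Updated the code with help from Akshay Nevrekar. I still get an empty dataframe at the end. Also, I need to set the index of df to the dates in data.index, so that later I can relate the advances/declines to their respective dates.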
#setting default dates
end_date = date.today()
start_date = end_date - timedelta(365)
#Deriving the names of 50 stocks in Nifty 50 Index
nifty_50 = pd.read_html('https://en.wikipedia.org/wiki/NIFTY_50')
nifty50_symbols = nifty_50[1][1]
df = pd.DataFrame(columns = {'Advances','Declines','Adv_Volume','Dec_Volume'})
for x in nifty50_symbols:
    data = ns.get_history(symbol = x, start=start_date, end=end_date)
    ## sclose = data['Close']
    ## sopen = data['Open']
    ## svol = data['Volume']
    ## df.index = data.index
    for i in data.index:
        sclose=data.loc[i]['Close']
        sopen=data.loc[i]['Open']
        svol = data.loc[i]['Volume']
        if sclose > sopen :
            df['Advances'] = df['Advances'] + 1
            df['Adv_Volume'] = df['Adv_Volume'] + svol
        elif sopen > sclose :
            df['Declines'] = df['Declines'] + 1
            df['Dec_Volume'] = df['Dec_Volume'] + svol
print(df)
Answer 0 (score: 0)
You are creating an empty dataframe:
df = pd.DataFrame(columns = {'Advances','Declines','Adv_Volume','Dec_Volume'})
So df.index will be empty, and the body of this loop
for i in df.index:
    if sclose > sopen:
        df['Advances'] = df['Advances'] + 1
        df['Adv_Volume'] = df['Adv_Volume'] + svol
    elif sopen > sclose:
        df['Declines'] = df['Declines'] + 1
        df['Dec_Volume'] = df['Dec_Volume'] + svol
will never run. That is why you get an empty dataframe.
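To make the point concrete, here is a minimal sketch of the empty-frame behaviour. The three dates are made up and only stand in for data.index; building the frame with an index and zero-filled values is one possible way to give the loop rows to work on, not necessarily the only one.

```python
import pandas as pd

# A list (rather than a set) keeps the column order predictable.
columns = ['Advances', 'Declines', 'Adv_Volume', 'Dec_Volume']

# An empty frame: the columns exist, but there are no rows and no index labels.
empty = pd.DataFrame(columns=columns)

iterations = 0
for i in empty.index:  # the index is empty, so this body never executes
    iterations += 1
print(iterations)  # 0

# Adding to a column that has no rows still leaves no rows.
empty['Advances'] = empty['Advances'] + 1
print(len(empty))  # 0

# Hypothetical trading dates, standing in for data.index.
dates = pd.to_datetime(['2019-01-01', '2019-01-02', '2019-01-03'])

# Creating the frame with the dates as its index and zeros as starting
# values gives every date a row that can be updated later.
counts = pd.DataFrame(0, index=dates, columns=columns)
print(counts.shape)  # (3, 4)
```

With rows in place, an update such as counts.loc[some_date, 'Advances'] += 1 has something to act on.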
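Edit: After your edit, I suggest you assign the sclose and sopen variables inside the for loop. You were assigning the entire column to each variable instead of a single value.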
for i in data.index:
    # i is a date label taken from data.index, so use label-based .loc (not positional .iloc)
    sclose = data.loc[i, 'Close']
    sopen = data.loc[i, 'Open']
    svol = data.loc[i, 'Volume']
    # Assumes df already has a row for date i with its counters initialised to 0
    if sclose > sopen:
        df.loc[i, 'Advances'] = df.loc[i, 'Advances'] + 1
        df.loc[i, 'Adv_Volume'] = df.loc[i, 'Adv_Volume'] + svol
    elif sopen > sclose:
        df.loc[i, 'Declines'] = df.loc[i, 'Declines'] + 1
        df.loc[i, 'Dec_Volume'] = df.loc[i, 'Dec_Volume'] + svol
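As an alternative to the row-by-row loop, the sketch below first reproduces the "truth value of a Series is ambiguous" error and then tallies advances and declines per date with boolean masks. It is only an illustration under assumptions: daily_breadth is a hypothetical helper, the symbols 'AAA' and 'BBB' and all prices and volumes are made up, and the histories dict stands in for whatever get_history returns for each symbol (assumed to have Open, Close and Volume columns indexed by date).

```python
import pandas as pd


def daily_breadth(histories):
    """Tally advances/declines per date.

    histories: non-empty dict mapping symbol -> DataFrame with
    'Open', 'Close' and 'Volume' columns, indexed by date.
    (Hypothetical helper, not part of pandas or nsepy.)
    """
    frames = []
    for symbol, data in histories.items():
        up = data['Close'] > data['Open']    # boolean Series, one value per date
        down = data['Open'] > data['Close']
        frames.append(pd.DataFrame({
            'Advances': up.astype(int),
            'Declines': down.astype(int),
            'Adv_Volume': data['Volume'].where(up, 0),    # volume only on up days
            'Dec_Volume': data['Volume'].where(down, 0),  # volume only on down days
        }))
    # Stack the per-symbol frames and sum rows that share the same date.
    return pd.concat(frames).groupby(level=0).sum()


# Made-up sample data for two fictional symbols.
dates = pd.to_datetime(['2019-01-01', '2019-01-02'])
histories = {
    'AAA': pd.DataFrame({'Open': [10.0, 20.0], 'Close': [12.0, 18.0],
                         'Volume': [100, 200]}, index=dates),
    'BBB': pd.DataFrame({'Open': [5.0, 5.0], 'Close': [6.0, 5.0],
                         'Volume': [50, 70]}, index=dates),
}

# Comparing whole columns yields a Series, which `if` cannot reduce to
# a single True/False; this is the error from the question.
sample = histories['AAA']
ambiguous_message = None
try:
    if sample['Close'] > sample['Open']:
        pass
except ValueError as err:
    ambiguous_message = str(err)
print(ambiguous_message)

breadth = daily_breadth(histories)
print(breadth)
```

Because the result is indexed by date, each row can be related directly to its trading day, which is what EDIT2 asks for. Days on which a stock closes unchanged count as neither an advance nor a decline, matching the if/elif in the loop version.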