我正在从YAHOO抓取数据,并尝试使用另一个文件上的值从其数据框中提取某个值。
我设法抓取数据并将其显示为数据框。事情是我试图使用另一个df从数据中提取某个值。
这是我得到的CSV
df_earnings=pd.read_excel(r"C:Earnings to Update.xlsx",index_col=2)
stock_symbols = df_earnings.index
输出:
Date E Time Company Name
Stock Symbol
CALM 2019-04-01 Before The Open Cal-Maine Foods
CTRA 2019-04-01 Before The Open Contura Energy
NVGS 2019-04-01 Before The Open Navigator Holdings
ANGO 2019-04-02 Before The Open AngioDynamics
LW 2019-04-02 Before The Open Lamb Weston`
然后我使用Yahoo Finance的数据下载每只股票的csv:
driver.get(f'https://finance.yahoo.com/quote/{stock_symbol}/history?period1=0&period2=2597263000&interval=1d&filter=history&frequency=1d')
输出:
Open High Low ... Adj Close Volume Stock Name
Date ...
1996-12-12 1.81250 1.8125 1.68750 ... 0.743409 1984400 CALM
1996-12-13 1.71875 1.8125 1.65625 ... 0.777510 996800 CALM
1996-12-16 1.81250 1.8125 1.71875 ... 0.750229 122000 CALM
1996-12-17 1.75000 1.8125 1.75000 ... 0.774094 239200 CALM
1996-12-18 1.81250 1.8125 1.75000 ... 0.791151 216400 CALM
我的问题在这里,我不知道如何从数据框找到日期并将其从下载的文件中提取出来。
现在我不想插入这样的手动日期:
df = pd.DataFrame.from_csv(file_path)
df['Stock Name'] = stock_symbol
print(df.head())
df = df.reset_index()
print(df.loc[df['Date'] == '2019-04-01'])
输出:
Date Open High ... Adj Close Volume Stock Name
5610 2019-04-01 46.700001 47.0 ... 42.987827 846900 CALM
我想要一个条件,该条件将为每只股票运行我的数据框并提取所需日期
print(df.loc[df['Date'] == the date that is next to the symbol that i just downloaded the file for])
答案 0 :(得分:0)
我想您可以使用一个变量来保存日期。
for sy in stock_symbols:
# The value from the 'Date' column in df_earnings
dt = df_earnings.loc[df_earnings.index == sy, 'Date'][sy]
# From the second block of your code relating to 'manual' date
df = pd.DataFrame.from_csv(file_path)
df['Stock Name'] = sy
df = df.reset_index()
print(df.loc[df['Date'] == dt])