Question

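I am trying to create a table like the one below, where each asset in a list is appended to the dataframe as a new column:

Fundamentals              CTRP  EBAY  ......  MPNGF

price
dividend
five_year_dividend
pe_ratio
pegRatio
priceToBook
price_to_sales
book_value
ebit
net_income
EPS
DebtEquity
threeYearAverageReturn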

At the moment, with the code below, only the last equity in the list is displayed:

Fundamentals              MPNGF

price
dividend
five_year_dividend
pe_ratio
pegRatio
priceToBook
price_to_sales
book_value
ebit
net_income
EPS
DebtEquity
threeYearAverageReturn

from yahoofinancials import YahooFinancials
import pandas as pd
import lxml
from lxml import html
import requests
import numpy as np
from datetime import datetime


def scrape_table(url):
    page = requests.get(url)
    tree = html.fromstring(page.content)
    table = tree.xpath('//table')
    assert len(table) == 1

    df = pd.read_html(lxml.etree.tostring(table[0], method='html'))[0]

    df = df.set_index(0)
    df = df.dropna()
    df = df.transpose()
    df = df.replace('-', '0')

    df[df.columns[0]] = pd.to_datetime(df[df.columns[0]])
    cols = list(df.columns)
    cols[0] = 'Date'
    # set_axis returns a new frame by default; the inplace argument was removed in pandas 2.0
    df = df.set_axis(cols, axis='columns')

    numeric_columns = list(df.columns)[1::]
    df[numeric_columns] = df[numeric_columns].astype(np.float64)

    return df

ecommerce = ['CTRP', 'EBAY', 'GRUB', 'BABA', 'JD', 'EXPE', 'AMZN', 'BKNG', 'MPNGF']

price=[]
dividend=[]
five_year_dividend=[]
pe_ratio=[]
pegRatio=[]
priceToBook=[]
price_to_sales=[]
book_value=[]
ebit=[]
net_income=[]
EPS=[]
DebtEquity=[]
threeYearAverageReturn=[]

for i, symbol in enumerate(ecommerce):     
    yahoo_financials = YahooFinancials(symbol)
    balance_sheet_url = 'https://finance.yahoo.com/quote/' + symbol + '/balance-sheet?p=' + symbol
    df_balance_sheet = scrape_table(balance_sheet_url)
    df_balance_sheet_de = pd.DataFrame(df_balance_sheet, columns = ["Total Liabilities", "Total stockholders' equity"])
    j= df_balance_sheet_de.loc[[1]]   
    j['DebtEquity'] = j["Total Liabilities"]/j["Total stockholders' equity"]
    k= j.iloc[0]['DebtEquity']

    X = yahoo_financials.get_key_statistics_data()
    for d in X.values():
        PEG = d['pegRatio']
        PB = d['priceToBook']
        three_year_ave_return = d['threeYearAverageReturn']

    # One [label, value] row per fundamental for this symbol.
    data = [
        ['price', yahoo_financials.get_current_price()],
        ['dividend', yahoo_financials.get_dividend_yield()],
        ['five_year_dividend', yahoo_financials.get_five_yr_avg_div_yield()],
        ['pe_ratio', yahoo_financials.get_pe_ratio()],
        ['pegRatio', PEG],
        ['priceToBook', PB],
        ['price_to_sales', yahoo_financials.get_price_to_sales()],
        ['book_value', yahoo_financials.get_book_value()],
        ['ebit', yahoo_financials.get_ebit()],
        ['net_income', yahoo_financials.get_net_income()],
        ['EPS', yahoo_financials.get_earnings_per_share()],
        ['DebtEquity', k],  # k is the debt-to-equity ratio computed above (was the undefined name mee)
        ['threeYearAverageReturn', three_year_ave_return],
    ]
    df = pd.DataFrame(data, columns = ['Fundamentals', symbol])
    df

I would appreciate any advice on where I might be going wrong in building the table above. Thank you very much!

Answer 1

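You need to build df outside the for loop. As written, the code creates a brand-new df on every iteration, so only the one for the last symbol in the list (MPNGF) is left once the loop finishes.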
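One way to apply this, as a minimal sketch rather than a drop-in fix: collect one pandas Series per symbol inside the loop, then build the DataFrame once after the loop. The helper names fetch_fundamentals and build_fundamentals_table are hypothetical, and the numbers returned by the stub are placeholders, not real market data. In the real script the stub would be replaced by the YahooFinancials calls from the question.

```python
import pandas as pd


def fetch_fundamentals(symbol):
    # Hypothetical stand-in for the YahooFinancials calls in the question.
    # The values are placeholders derived from the ticker length, not real data.
    return {
        "price": float(len(symbol)),
        "pe_ratio": float(len(symbol)) * 2,
    }


def build_fundamentals_table(symbols, fetch=fetch_fundamentals):
    # Collect one Series per symbol; nothing is overwritten between iterations.
    columns = {}
    for symbol in symbols:
        columns[symbol] = pd.Series(fetch(symbol))

    # Build the DataFrame once, after the loop: one column per symbol,
    # one row per fundamental.
    table = pd.DataFrame(columns)
    table.index.name = "Fundamentals"
    return table


table = build_fundamentals_table(["CTRP", "EBAY", "MPNGF"])
print(table)
```

Because the dict keeps insertion order, the columns come out in the same order as the symbol list. If a value is missing for one symbol, pandas fills that cell with NaN instead of failing.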
