How do I convert a downloaded string in a format like [{"t": "1", "id": "NOW.976818", ..., "cv": "1"}] into a pandas DataFrame?

Date: 2019-11-17 11:20:03

Tags: python arrays pandas list

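I am downloading a list of news items into a pandas DataFrame. Instead of putting the information into a table, pandas puts everything into a single cell. On inspection, the downloaded string has the following format: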

"['[{"t": "1", "id": "NOW.976818", "dt": "2019/11/15 10:13", "h": "《美股業績》Nvidia季績勝預期 季度收入預測遜預期", "u": "",...

How can I convert this into a pandas table?

My code:

import pandas as pd
import requests
from bs4 import BeautifulSoup

urlpull = "http://www.aastocks.com/tc/resources/datafeed/getmorenews.ashx?cat=result-announcement&newstime=942660890&newsid=NOW.976800&period=0&key="
df = pd.DataFrame({'News': ['a'], 'Page': ['1']})
result = requests.get(urlpull)
result.raise_for_status()
result.encoding = "utf-8"
src = result.content
soup = BeautifulSoup(src, 'lxml')

# Collect the text of every <p> tag in the parsed response
news = []
for a_tag in soup.find_all('p'):
    news.append(a_tag.text)
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df = pd.concat([df, pd.DataFrame(news, columns=['News'])])
print(news)
# Take the first run of five digits in each row as the stock number
df['num'] = df['News'].str.extract(r'(\d{5})')
df["stock_num"] = pd.to_numeric(df["num"], errors="coerce").fillna(0).astype("int64")

print(df)
df.to_excel("News.xlsx")

1 answer:

Answer 0 (score: 0)

You can do this directly:

pd.read_table(filename/url)
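
Since the string shown in the question looks like a JSON array of objects, another option is to parse it as JSON and hand the resulting list of dicts to pandas. The following is a minimal sketch, not a tested solution for the live feed: `raw` is a hypothetical stand-in for the downloaded text, only the first record's values come from the question, and the second record is invented purely for illustration.

```python
import json

import pandas as pd

# Hypothetical sample shaped like the string in the question.
# The first record uses the values quoted there; the second is made up.
raw = (
    '[{"t": "1", "id": "NOW.976818", "dt": "2019/11/15 10:13", '
    '"h": "《美股業績》Nvidia季績勝預期 季度收入預測遜預期", "u": ""}, '
    '{"t": "1", "id": "NOW.000000", "dt": "2019/11/15 09:00", '
    '"h": "placeholder headline", "u": ""}]'
)

# json.loads turns the JSON array into a list of dicts.
records = json.loads(raw)

# pandas builds one row per dict and one column per key.
df = pd.DataFrame(records)

# Assumption: "dt" always follows the YYYY/MM/DD HH:MM pattern seen above.
df["dt"] = pd.to_datetime(df["dt"], format="%Y/%m/%d %H:%M")

print(df[["id", "dt", "h"]])
```

With the real response, `records = result.json()` from requests should give the same list of dicts, assuming the endpoint returns valid JSON; in that case BeautifulSoup is not needed at all.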