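I am trying to read a CSV file into a Jupyter notebook using pandas. When I read the file and inspect its columns, I get this output:
Index(['<!DOCTYPE HTML>'], dtype='object')
I am not sure why my file is being read as an HTML document, and I cannot access any of the columns in its current form. I also get an error when I convert the file to Excel. Can anyone point out what the problem might be? Thanks.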
import numpy as np
import pandas as pd
inspection = pd.read_csv("http://localhost:8889/view/Desktop/python/Data/Inspections_MergedFile.csv", sep='\t')
inspection.columns
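The CSV data comes from New York City's public restaurant inspection data file: https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/xx67-kt59 , with the 311 food poisoning data appended: https://data.cityofnewyork.us/Social-Services/food-poisoning/gjkf-etq5
Answer 0 (score: 1)
If you are trying to serve the file locally through some kind of web API route, you will need to provide more information about your app and the setup you have already built.
When I try the following, working from the link you provided and copying the link behind its CSV export option, the data does download (eventually, after a few minutes), albeit with a warning...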
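For what it is worth, the URL in the question looks like a Jupyter file-view address, and my guess (not verified against your setup) is that it hands pandas an HTML page instead of the raw file. Below is a minimal sketch of how you could detect that before parsing. The helper names `looks_like_html` and `read_csv_checked` are made up for illustration, and the payloads are in-memory stand-ins rather than real downloads.

```python
import io

import pandas as pd


def looks_like_html(first_bytes: bytes) -> bool:
    """Return True if the payload starts like an HTML page rather than CSV."""
    head = first_bytes.lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def read_csv_checked(raw: bytes, **kwargs) -> pd.DataFrame:
    """Parse raw bytes as CSV, refusing payloads that look like HTML."""
    if looks_like_html(raw[:512]):
        raise ValueError(
            "Got an HTML page, not CSV; check the URL or read the file from disk"
        )
    return pd.read_csv(io.BytesIO(raw), **kwargs)


# In-memory stand-ins (hypothetical): one HTML page, one tiny CSV.
html_payload = b"<!DOCTYPE HTML>\n<html><body>login</body></html>"
csv_payload = b"CAMIS,DBA,BORO\n41471806,THE HEN HOUSE,BROOKLYN\n"

demo = read_csv_checked(csv_payload)
print(list(demo.columns))
```

If the file sits on the same machine as the notebook, passing a filesystem path to `pd.read_csv` (relative to the notebook's working directory) avoids the web server entirely. The exact path depends on where your notebook was started, so I have not guessed it here.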
>>> import pandas
>>> df = pandas.read_csv('https://data.cityofnewyork.us/api/views/xx67-kt59/rows.csv')
sys:1: DtypeWarning: Columns (6) have mixed types. Specify dtype option on import or set low_memory=False.
>>> df
       CAMIS                               DBA       BORO  BUILDING  \
0   41471806                     THE HEN HOUSE   BROOKLYN      7302
1   50060020                  CURRY EXPRESS NY  MANHATTAN       130
2   50060627            RED HOUSE ASIAN FUSION     QUEENS     19203
3   50040866                        FUEL GRILL  MANHATTAN       112
4   41710571                     BLACKTHORN 51     QUEENS      8012
5   50015486                       THE IZAKAYA  MANHATTAN       326
6   50015250              PETITE BLUE DOG CAFE  MANHATTAN       119
7   40388091                            MASAWA  MANHATTAN      1239
8   41456998                     A.I.G.CHARTIS  MANHATTAN       175
9   50006741                        GRACE CAFE  MANHATTAN       572
10  41377069              CATALDO'S RESTAURANT   BROOKLYN       554
11  41145911                   WA LUNG KITCHEN  MANHATTAN       557
12  41547536               MINT'S THAI KITCHEN     QUEENS      7015
13  41066771                    DUNKIN' DONUTS   BROOKLYN      5702
14  40365472            SPAIN RESTAURANT & BAR  MANHATTAN       113
15  50072117                               NaN  MANHATTAN       307
16  50042671                      EDGAR'S CAFE  MANHATTAN       650
17  41490991                   LIPS RESTAURANT  MANHATTAN       227
18  41713624  BIENVENIDOS AL CALLAO RESTAURANT     QUEENS     11122
19  40923012                          DOMINO'S  MANHATTAN       200
20  41477406                  CIBAO RESTAURANT     QUEENS     10422
21  50013522             BREWKLYN GRIND COFFEE   BROOKLYN       557
22  41212364                         BECKETT'S  MANHATTAN        81
23  50066646                    TOKOYO EXPRESS     QUEENS      7057
24  41575815                   BLACKOUT LOUNGE     QUEENS     13316
...
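As for the `DtypeWarning` in the session above, the message itself names two ways out: specify a `dtype` on import, or set `low_memory=False`. Here is a small sketch of both on a made-up three-row sample. The `PHONE` column and its values are invented stand-ins, since I have not checked which column index 6 actually is in the real file.

```python
import io

import pandas as pd

# Made-up sample standing in for the real file; PHONE mixes digits and text.
text = (
    "CAMIS,DBA,PHONE\n"
    "41471806,THE HEN HOUSE,7180000000\n"
    "50060020,CURRY EXPRESS NY,unknown\n"
)

# Option 1: declare the ambiguous column's dtype up front (hypothetical column).
df_typed = pd.read_csv(io.StringIO(text), dtype={"PHONE": str})

# Option 2: have pandas infer types from the whole file instead of in chunks.
df_whole = pd.read_csv(io.StringIO(text), low_memory=False)

print(df_typed["PHONE"].tolist())
```

Declaring the dtype is usually the more predictable choice when you already know a column should be text, while `low_memory=False` trades memory for not having to name columns at all.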