Question

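I am trying to read data from a CSV file into a pandas DataFrame, but when it is read in, the headers are shifted over by two columns.

I think it has to do with the two blank lines after the header, but I am not sure. It seems to be reading the first two columns in as row labels/index.

CSV format: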

VendorID,lpep_pickup_datetime,Lpep_dropoff_datetime,Store_and_fwd_flag,RateCodeID,Pickup_longitude,Pickup_latitude,Dropoff_longitude,Dropoff_latitude,Passenger_count,Trip_distance,Fare_amount,Extra,MTA_tax,Tip_amount,Tolls_amount,Ehail_fee,Total_amount,Payment_type,Trip_type 


2,2014-04-01 00:00:00,2014-04-01 14:24:20,N,1,0,0,0,0,1,7.45,23,0,0.5,0,0,,23.5,2,1,,
2,2014-04-01 00:00:00,2014-04-01 17:21:33,N,1,0,0,-73.987663269042969,40.780872344970703,1,8.95,31,1,0.5,0,0,,32.5,2,1,,

DataFrame format:

                                   VendorID lpep_pickup_datetime  \
2 2014-04-01 00:00:00  2014-04-01 14:24:20                    N   
  2014-04-01 00:00:00  2014-04-01 17:21:33                    N   
  2014-04-01 00:00:00  2014-04-01 15:06:18                    N   
  2014-04-01 00:00:00  2014-04-01 08:09:27                    N   
  2014-04-01 00:00:00  2014-04-01 16:15:13                    N   

                       Lpep_dropoff_datetime  Store_and_fwd_flag  RateCodeID  \
2 2014-04-01 00:00:00                      1                   0           0   
  2014-04-01 00:00:00                      1                   0           0   
  2014-04-01 00:00:00                      1                   0           0   
  2014-04-01 00:00:00                      1                   0           0   
  2014-04-01 00:00:00                      1                   0           0

Code below:

import pandas as pd

file = 'green_tripdata_2014-04.csv'
df4 = pd.read_csv(file)
print(df4.head(5))

I just need it to read into a DataFrame with the headers in the correct place.

Answer 1

Your CSV data does look odd: you have 20 column headers, but the first data row contains 22 entries.
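One way to confirm the mismatch is to count the fields per line with the standard csv module. This is only a sketch on an abbreviated, made-up sample (4 header names instead of 20, and shortened column names), not your real file:

```python
import csv
import io

# Hypothetical, abbreviated stand-in for the file: 4 header names,
# two blank lines, then data rows that end in two extra commas.
sample = (
    "VendorID,pickup,dropoff,flag\n"
    "\n"
    "\n"
    "2,2014-04-01 00:00:00,2014-04-01 14:24:20,N,,\n"
    "2,2014-04-01 00:00:00,2014-04-01 17:21:33,N,,\n"
)

# csv.reader yields an empty list for a blank line, so drop those.
rows = [row for row in csv.reader(io.StringIO(sample)) if row]

header_fields = len(rows[0])   # number of column names
data_fields = len(rows[1])     # number of entries in the first data row
extra_fields = data_fields - header_fields

print(header_fields, data_fields, extra_fields)
```

If extra_fields is greater than zero, pandas has more data columns than names, and by default it treats the leading surplus columns as the index.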

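Assuming this is just a copy-paste error*, you can try the skiprows and index_col options of pd.read_csv.

skiprows will skip the two empty lines, and index_col may keep the data from being interpreted as index columns.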
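A minimal sketch of how the two options might be combined. The values are assumptions: skiprows=[1, 2] targets the two blank lines right after the header, and index_col=False is what the pandas docs suggest for files with trailing delimiters. The sample data is a made-up, shortened version of your file:

```python
import io

import pandas as pd

# Hypothetical, abbreviated stand-in for the file (4 columns instead of 20).
sample = (
    "VendorID,pickup,dropoff,flag\n"
    "\n"
    "\n"
    "2,2014-04-01 00:00:00,2014-04-01 14:24:20,N,,\n"
    "2,2014-04-01 00:00:00,2014-04-01 17:21:33,N,,\n"
)

df = pd.read_csv(
    io.StringIO(sample),
    skiprows=[1, 2],   # physical line numbers (0-indexed) of the blank lines
    index_col=False,   # do not turn the leading columns into an index
)

print(df.columns.tolist())
print(df.head())
```

With index_col=False the surplus trailing fields are dropped, so each header lines up with its own column. Depending on your pandas version this may emit a ParserWarning about the length mismatch.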

For all the options of the CSV parser, see http://pandas.pydata.org/pandas-docs/version/0.16.2/generated/pandas.read_csv.html.

Edit:

*: If your data is exactly as you posted it, then your CSV is malformed. You have two extra data columns (see the two trailing commas ,, at the end of each data row).

When the two commas are removed, the parser works fine.

Another option is to specify the columns to use:

pd.read_csv("file.csv", skiprows=[1,2], usecols=np.arange(20))

Here, np.arange(20) tells the parser to parse only the first 20 columns (positions 0-19), i.e. the ones that have a valid header in the first line.
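To try the column-selection approach without touching the real file, here is a small self-contained sketch. The sample is made up and shortened to 4 header names, so np.arange(4) stands in for np.arange(20):

```python
import io

import numpy as np
import pandas as pd

# Hypothetical, abbreviated stand-in for the file (4 columns instead of 20).
sample = (
    "VendorID,pickup,dropoff,flag\n"
    "\n"
    "\n"
    "2,2014-04-01 00:00:00,2014-04-01 14:24:20,N,,\n"
    "2,2014-04-01 00:00:00,2014-04-01 17:21:33,N,,\n"
)

# Keep only as many columns as there are header names; the two
# unnamed trailing columns are never parsed.
df = pd.read_csv(io.StringIO(sample), skiprows=[1, 2], usecols=np.arange(4))

print(df.columns.tolist())
print(df.shape)
```

Because the number of selected columns equals the number of header names, pandas no longer has leftover columns to promote to an index.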
