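I have a dataframe with more than 10^6 rows, and I am simply converting lat (degrees and minutes) to lat (decimal degrees only). However, some rows in my frame contain the string 'p-', which kills my loop early. I tried a few things (below).
Code: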
import pandas as pd
import numpy as np
import glob
import matplotlib.pyplot as plt

path = r'/home/engr/Documents/SchoolHR/Data/SFSU-Boat/SBE45m/2015/'
allfiles_list = glob.glob(path + "/15*.hex")
allfiles_list = sorted(allfiles_list)
col = ["temp", "conduct", "salinity", "lat", "lon", "hms", "dmy"]

# Read every file, then concatenate once (DataFrame.append was removed in pandas 2.0)
frames = []
for name in allfiles_list:
    df = pd.read_csv(name, skiprows=12, encoding="latin1", names=col, na_values=0, na_filter=False, engine="c")
    frames.append(df)
big_frame = pd.concat(frames)

# TODO surgery on columns to convert to float for use on big_frame
# regex \D to remove any non-digit characters -- hms & dmy
big_frame["hms"] = big_frame["hms"].replace(to_replace=r'\D', value='', regex=True)
big_frame["dmy"] = big_frame["dmy"].replace(to_replace=r'\D', value='', regex=True)
# strip the "x=" style prefixes from the measurement columns
big_frame["temp"] = big_frame["temp"].replace(to_replace=r'(\D.\=)', value='', regex=True)
big_frame["conduct"] = big_frame["conduct"].replace(to_replace=r'(\D.\=)', value='', regex=True)
big_frame["salinity"] = big_frame["salinity"].replace(to_replace=r'(\D.\=)', value='', regex=True)
# strip the "lat=" / "lon=" labels
big_frame["lat"] = big_frame["lat"].replace(to_replace='[lonat=]', value='', regex=True)
big_frame["lon"] = big_frame["lon"].replace(to_replace='[lonat=]', value='', regex=True)

# convert degrees + minutes to decimal degrees, row by row
for index, row in big_frame.iterrows():
    if row.lat[-1] == 'N':
        D = float(row.lat[1:3])
        M = float(row.lat[4:10])
        DD = D + float(M/60)
        row.lat = DD
    if row.lon[-1] == 'W':
        D1 = float(row.lon[1:4])
        M1 = float(row.lon[5:12])
        DD1 = D1 + float(M1/60)
        row.lon = -DD1
Running the loop raises this error:
ValueError: could not convert string to float: 'p-'
I tried to fix it by adding the following and then running the loop on the dataframe again:
big_frame['lon'] = big_frame['lon'].str.replace('p-?', '', regex=True)
big_frame['lat'] = big_frame['lat'].str.replace('p-?', '', regex=True)
big_frame["lat"] = big_frame["lat"].replace(to_replace='[)]', value='', regex=True)
big_frame["lon"] = big_frame["lon"].replace(to_replace='[)]', value='', regex=True)
But then I just got this:
IndexError: string index out of range
Sample dataset below:
Answer 0 (score: 0)
You can drop the problematic rows with something like:
big_frame = big_frame[big_frame['col_name'].apply(lambda x: x.isdigit())]
After that, the operation should no longer fail.
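One caveat with the filter above: `str.isdigit()` is False for any string that contains a decimal point, a space, or a hemisphere letter, so on the `lat`/`lon` columns it would likely drop every row. A format-aware check may work better. The sketch below is only an illustration: the helper `dm_to_decimal` and the pattern `DM_PATTERN` are hypothetical names, and the assumed layout (degrees, whitespace, decimal minutes, hemisphere letter) is a guess, since the sample data is not shown. Anything that does not match, including 'p-' and the empty strings left after stripping it, becomes NaN instead of raising.

```python
import math
import re

# Assumed layout (not confirmed by the question): degrees, whitespace, decimal
# minutes, then a hemisphere letter, e.g. " 37 48.500 N" or "122 26.250W".
DM_PATTERN = re.compile(r"\s*(\d{1,3})\s+(\d{1,2}(?:\.\d+)?)\s*([NSEW])\s*")


def dm_to_decimal(text):
    """Return signed decimal degrees, or NaN when text is not a degree-minute string."""
    if not isinstance(text, str):
        return math.nan
    match = DM_PATTERN.fullmatch(text)
    if match is None:
        # Covers 'p-', empty strings, and any other debris in the column.
        return math.nan
    degrees, minutes, hemisphere = match.groups()
    value = float(degrees) + float(minutes) / 60.0
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value
```

Applied to the frame, this could replace the `iterrows` loop: `big_frame["lat"] = big_frame["lat"].map(dm_to_decimal)`, the same for `lon`, then `big_frame = big_frame.dropna(subset=["lat", "lon"])` to discard the rows that failed to parse. Assigning the whole column also avoids relying on writes to `row` inside `iterrows`, which are not guaranteed to reach the original frame.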