How to apply a function to specific columns by name in a DataFrame

Time: 2019-08-12 15:41:25

Tags: python pandas

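I have a DataFrame whose columns contain GPS coordinates. I want to convert the columns that are in seconds to decimal degrees. For example, I have two columns named "lat_sec" and "long_sec", formatted like 186780.8954N. I tried to write a function that saves the last character of the string as the direction, divides the numeric part by 3600 to get decimal degrees, and then concatenates the two into the new format. I then tried to look up the column by name in the DataFrame and apply the function to it.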

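I'm new to Python, so I haven't been able to find other resources. I don't think I'm creating the function correctly. I used the word "coordinate" inside it because I didn't know what to call the value I'm breaking apart. My data looks like this: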

long_sec
635912.9277W
555057.2000W
581375.9850W
581166.2780W


df = pd.DataFrame(my_array)

def convertDec(coordinate):
    decimal = float(coordinate[:-1]/3600)
    direction = coordinate[-1:]
    return str(decimal) + str(direction)

df['lat_sec'] = df['lat_sec'].apply(lambda x: x.convertDec())

My error looks like this:
Traceback (most recent call last):
  File "code.py", line 44, in <module>
    df['lat_sec'] = df['lat_sec'].apply(lambda x: x.convertDec())
  File "C:\Python\Python37\lib\site-packages\pandas\core\frame.py", line 2917, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Python\Python37\lib\site-packages\pandas\core\indexes\base.py", line 2604, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 129, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index_class_helper.pxi", line 91, in pandas._libs.index.Int64Engine._check_type
KeyError: 'lat_sec'

2 Answers:

Answer 0 (score: 0)

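By writing float(coordinate[:-1]/3600) you are dividing a str by an int, which is not possible. Instead, convert the str to a float first and then divide by 3600, i.e. float(coordinate[:-1])/3600, which gives a float result.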

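Second, you are not using apply correctly: convertDec is a plain function, not a method of the string, so x.convertDec() cannot work. Also, the lat_sec column you are applying the function to does not exist in your DataFrame, which is what the KeyError is reporting.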
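Both problems can be reproduced in isolation. The following is a minimal sketch, not the asker's actual script: `my_array` here is a hypothetical stand-in, since the original array is not shown, and it assumes the DataFrame was built without column names (the `Int64Engine` frame in the traceback suggests integer column labels, but that is an inference).

```python
import pandas as pd

# Hypothetical stand-in for the question's my_array, which is not shown
my_array = ["635912.9277W", "555057.2000W"]

# No column names are given, so pandas assigns integer labels
df = pd.DataFrame(my_array)
print(list(df.columns))

# Problem 1: looking up a column name that does not exist raises KeyError
try:
    df["lat_sec"]
    key_error_raised = False
except KeyError:
    key_error_raised = True

# Problem 2: dividing a str by an int raises TypeError
try:
    "635912.9277" / 3600
    type_error_raised = False
except TypeError:
    type_error_raised = True

# Naming the column at construction time makes the lookup by name work
named = pd.DataFrame(my_array, columns=["long_sec"])
print(named["long_sec"].tolist())
```

If the real data already has named columns, checking `df.columns` before the lookup is a quick way to confirm the exact spelling.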

import pandas as pd

df = pd.DataFrame(['635912.9277W','555057.2000W','581375.9850W','581166.2780W'],columns=['long_sec'])

#function creation
def convertDec(coordinate):
    decimal = float(coordinate[:-1])/3600
    direction = coordinate[-1:]
    return str(decimal) + str(direction)

# Option 1: overwrite the existing column in place
df['long_sec'] = df.apply(lambda row: convertDec(row['long_sec']), axis=1)

# Option 2 (use instead of option 1, not after it): store the result in a new column; pick any name you like
df['lat_sec'] = df.apply(lambda row: convertDec(row['long_sec']), axis=1)

# OUTPUT (after option 1 only)
    long_sec
0   176.64247991666667W
1   154.18255555555555W
2   161.49332916666665W
3   161.43507722222225W

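If you want the result as an integer rather than a float, change float(coordinate[:-1])/3600 to int(float(coordinate[:-1])/3600). Note that int() truncates the fractional part rather than rounding.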
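Because only one column is involved, the function can also be passed straight to that column's own `apply`, without a lambda or `axis=1`. This is a minimal sketch under the same sample data; the target column name `long_deg` is a hypothetical choice, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({"long_sec": ["635912.9277W", "555057.2000W"]})

def convertDec(coordinate):
    # Numeric part (everything except the last character), in seconds
    decimal = float(coordinate[:-1]) / 3600
    # Last character is the direction letter
    direction = coordinate[-1]
    return str(decimal) + direction

# Series.apply calls convertDec once per value in the selected column
df["long_deg"] = df["long_sec"].apply(convertDec)
print(df)
```

Writing to a separate column keeps the original seconds values available, which avoids accidentally converting an already converted column a second time.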

Answer 1 (score: -1)

In your code above, inside the convertDec function, you have:

decimal = float(coordinate[:-1]/3600)

You need to convert the numeric part, coordinate[:-1], to a float first and only then divide by 3600.

So your code above should look like this:

import pandas as pd

# Your example dataset
dictCoordinates = {
    "long_sec" : ["111111.1111W", "222222.2222W", "333333.3333W", "444444.4444W"],
    "lat_sec"  : ["555555.5555N", "666666.6666N", "777777.7777N", "888888.8888N"]
}

# Insert your dataset into Pandas DataFrame
df = pd.DataFrame(data = dictCoordinates)

# Your conversion method here
def convertDec(coordinate):
    decimal = float(coordinate[:-1]) / 3600 # Eliminate last character, then convert to float, then divide it with 3600
    decimal = format(decimal, ".4f") # To make sure the output has 4 digits after decimal point
    direction = coordinate[-1] # Extract direction (N or W) from content
    return str(decimal) + direction # Return your desired output

# Do the conversion for your "long_sec"
df["long_sec"] = df.apply(lambda x : convertDec(x["long_sec"]), axis = 1)

# Do the conversion for your "lat_sec"
df["lat_sec"] = df.apply(lambda x : convertDec(x["lat_sec"]), axis = 1)

print(df)
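When several columns need the same conversion, looping over a list of column names avoids repeating the `apply` line. The sketch below is one possible variant, not the answer's own method: it swaps the row-wise `apply` for pandas string accessors, and the helper name `convert_column` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "long_sec": ["111111.1111W", "222222.2222W"],
    "lat_sec": ["555555.5555N", "666666.6666N"],
})

def convert_column(series):
    # Strip the trailing direction letter, convert seconds to degrees
    degrees = series.str[:-1].astype(float) / 3600
    # Keep the direction letter of each value
    direction = series.str[-1]
    # Format with 4 decimal places and append the direction
    return degrees.map("{:.4f}".format) + direction

# Select the columns to convert by name
for name in ["long_sec", "lat_sec"]:
    df[name] = convert_column(df[name])

print(df)
```

The list of names is the only part that has to change when more coordinate columns are added.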

That's it. Hope this helps.