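I have a list of GPS points that have been map-matched to road IDs, along with the vehicle speed at each GPS point. How can I normalize the speed for each road ID?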
ID LAT LONG SPEED
0 72.5896 72.893 60KM/H
0 73.5888 72.899 50KM/H
0 89.5366 78.822 60KM/H
1 99.8576 88.6115 50KM/H
1 120.1515 128.2225 30KM/H
1 -88.515 -51.5151 30KM/H
Answer 0 (score: 0)
Here is one way to do it with pandas and numpy:
import pandas as pd
import numpy as np
from io import StringIO
# create raw data
raw_data = StringIO(
"""ID LAT LONG SPEED
0 72.5896 72.893 60KM/H
0 73.5888 72.899 50KM/H
0 89.5366 78.822 60KM/H
1 99.8576 88.6115 50KM/H
1 120.1515 128.2225 30KM/H
1 -88.515 -51.5151 30KM/H""")
# load data into data frame
df = pd.read_csv(raw_data, sep=' ')
# strip the unit suffix (literal match, not a regex)
df['SPEED'] = df['SPEED'].str.replace('KM/H', '', regex=False)
# min-max normalize the speeds of one road ID
def normalize_data(i):
    selected_data = df.loc[df['ID'] == i].copy()
    # np.int was removed in NumPy 1.24; the built-in int works everywhere
    speed = selected_data['SPEED'].astype(int)
    # note: yields NaN if every speed on this road is identical
    selected_data['normalized'] = (speed - speed.min()) / (speed.max() - speed.min())
    return selected_data
# loop over the road IDs actually present and normalize each group
frames = []
for i in df['ID'].unique():
    selected_data = normalize_data(i)
    frames.append(selected_data)
# concat frames
result = pd.concat(frames)
print(result)
Output:
   ID       LAT      LONG SPEED  normalized
0   0   72.5896   72.8930    60         1.0
1   0   73.5888   72.8990    50         0.0
2   0   89.5366   78.8220    60         1.0
3   1   99.8576   88.6115    50         1.0
4   1  120.1515  128.2225    30         0.0
5   1  -88.5150  -51.5151    30         0.0
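
The per-ID loop above can also be expressed with a groupby, which avoids building and concatenating intermediate frames. The following is only a minimal sketch, assuming min-max scaling is the normalization wanted. The helper name `min_max` is made up for illustration, and mapping a road whose speeds are all identical to 0.0 (instead of the NaN the loop version would produce) is an assumption, not something the question specifies.

```python
import pandas as pd
from io import StringIO

# same sample data as in the question
raw_data = StringIO(
"""ID LAT LONG SPEED
0 72.5896 72.893 60KM/H
0 73.5888 72.899 50KM/H
0 89.5366 78.822 60KM/H
1 99.8576 88.6115 50KM/H
1 120.1515 128.2225 30KM/H
1 -88.515 -51.5151 30KM/H""")

df = pd.read_csv(raw_data, sep=' ')

# strip the unit suffix and convert to integers once, up front
df['SPEED'] = df['SPEED'].str.replace('KM/H', '', regex=False).astype(int)

def min_max(s):
    """Hypothetical helper: min-max scale one group's speeds to [0, 1]."""
    span = s.max() - s.min()
    if span == 0:
        # assumption: a road with constant speed maps to 0.0 rather than NaN
        return s * 0.0
    return (s - s.min()) / span

# transform returns a result aligned with the original rows,
# so it can be assigned straight back as a new column
df['normalized'] = df.groupby('ID')['SPEED'].transform(min_max)
print(df)
```

Because `transform` keeps the original index, the row order of `df` is unchanged and non-contiguous road IDs need no special handling.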