Remove text other than numbers and NaN from a DataFrame

Time: 2019-11-15 13:38:36

Tags: python dataframe

I have a DataFrame with five columns (source, battery, temperature, time, and distance), as shown in the attached image.

[Image: screenshot of the DataFrame]

I want to remove the text from every column and keep only the numbers. Second, I need to drop the rows that contain NaN.

For example, the expected output would look like this:

[Image: expected output]

Here is the code I have written so far:

import pandas as pd
import json
import requests
import re
URL = 'https://wastemanagement.post-iot.lu/measurement/measurements?source=83512&pageSize=1000000000&dateFrom=2019-10-26&dateTo=2019-10-28'
req = requests.get(URL,auth=('xxx', 'xxx') )
text_data= req.text
json_dict= json.loads(text_data)
df = pd.DataFrame.from_dict(json_dict["measurements"])
cols_to_keep =['source','battery','c8y_TemperatureMeasurement','time','c8y_DistanceMeasurement']
df_final = df[cols_to_keep]
df_final = df_final.rename(columns={'c8y_TemperatureMeasurement': 'Temperature Or T','c8y_DistanceMeasurement':'Distance'})
for col in df_final:
 df_final[col] = [''.join(re.findall("\d*\.?\d+", item)) for item in df_final[col]]

1 Answer:

Answer 0 (score: 0)

To remove the text, flatten the nested JSON and keep only the value columns:

import json
import requests
import pandas as pd

URL = 'https://nnn.com/measurement/measurements?source=83512&pageSize=1000000000&dateFrom=2019-10-26&dateTo=2019-10-28'
req = requests.get(URL, auth=('xxxx', 'xxxx'))
text_data = req.text
json_dict = json.loads(text_data)
# Flatten the nested measurement records into dotted column names.
# pd.json_normalize needs pandas >= 1.0; older versions import it from pandas.io.json.
df = pd.json_normalize(json_dict['measurements'])
# Rename the flattened numeric value columns to short names.
df = df.rename(columns={'source.id': 'source', 'battery.percent.value': 'battery', 'c8y_TemperatureMeasurement.T.value': 'Temperature Or T', 'c8y_DistanceMeasurement.distance.value': 'Distance'})
# Keep only the five columns of interest; the unit/text columns are dropped.
cols_to_keep = ['source', 'battery', 'Temperature Or T', 'time', 'Distance']
df_final = df[cols_to_keep]
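
The code above addresses only the first half of the question. Dropping the rows that contain NaN can probably be handled with `DataFrame.dropna`. The following is a minimal, self-contained sketch rather than a tested solution against the real API: the `sample` payload and the `clean_measurements` helper are hypothetical, shaped after the column names used in the answer, and the actual response may look different. It assumes pandas 1.0 or later for `pd.json_normalize`.

```python
import pandas as pd

# Hypothetical sample shaped after the answer's column names;
# the real API payload may differ.
sample = {
    "measurements": [
        {
            # This record has no temperature or distance reading.
            "source": {"id": "83512"},
            "time": "2019-10-26T00:00:06.494Z",
            "battery": {"percent": {"unit": "%", "value": 98}},
        },
        {
            "source": {"id": "83512"},
            "time": "2019-10-26T00:00:06.538Z",
            "battery": {"percent": {"unit": "%", "value": 97}},
            "c8y_TemperatureMeasurement": {"T": {"unit": "C", "value": 23}},
            "c8y_DistanceMeasurement": {"distance": {"unit": "cm", "value": 21}},
        },
    ]
}


def clean_measurements(payload):
    """Flatten nested measurements, keep the value columns, drop NaN rows."""
    df = pd.json_normalize(payload["measurements"])
    df = df.rename(columns={
        "source.id": "source",
        "battery.percent.value": "battery",
        "c8y_TemperatureMeasurement.T.value": "Temperature Or T",
        "c8y_DistanceMeasurement.distance.value": "Distance",
    })
    cols_to_keep = ["source", "battery", "Temperature Or T", "time", "Distance"]
    # reindex tolerates a column that is absent from every record (all NaN).
    df = df.reindex(columns=cols_to_keep)
    # Drop every row that has NaN in any kept column, then renumber the rows.
    return df.dropna().reset_index(drop=True)


df_final = clean_measurements(sample)
print(df_final)
```

In this sample the first record has no temperature or distance reading, so it is dropped and a single row remains. If only some columns should decide whether a row is dropped, `dropna(subset=[...])` restricts the check to those columns.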