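For my bachelor's thesis I am trying to send machine data (in this case historical data sent by a Python script) to Kafka over an HTTP connection. I am using Confluent Platform running in Docker on a Windows system.
I use a Python script to try to send the data to the REST Proxy. At first I got error responses about the data type I was allowed to send.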
import pandas as pd
import csv, os, json, requests, time, datetime, copy, sys

if len(sys.argv) > 1:
    bgrfc_value = str(sys.argv[1])
else:
    print("No arguments for bgrfc given, defaulting to 'false'")
    bgrfc_value = 'false'
if len(sys.argv) > 2:
    filePath = str(sys.argv[2])
else:
    filePath = "path"
if len(sys.argv) > 3:
    batchSize = int(float(str(sys.argv[3])))
else:
    batchSize = 10

# Build skeleton JSON
basejson = {"message": {"meta" : "", "data": ""}}
#metajson = [{'meta_key' : 'sender', 'meta_value': 'OPCR'},
#            {'meta_key' : 'receiver', 'meta_value': 'CAT'},
#            {'meta_key' : 'message_type', 'meta_value': 'MA1SEK'},
#            {'meta_key' : 'bgrfc', 'meta_value': bgrfc_value}]
#basejson['message']['meta'] = metajson
url = "http://127.0.0.1:8082/"
headers = {'Content-Type':'application/json','Accept':'application/json'}

def assign_timestamps(batch):
    newtimestamps = []
    oldtimestamps = []
    # Batch timestamps to list, add 10 newly generated timestamps to a list
    for item in batch['tag_tsp'].values.tolist():
        newtimestamps.append(datetime.datetime.now())
        oldtimestamps.append(datetime.datetime.strptime(str(item), "%Y%m%d%H%M%S.%f"))
    # Sort old timestamps without sorting the original array to preserve variance
    temp = copy.deepcopy(oldtimestamps)
    temp.sort()
    mrtimestamp = temp[0]
    # Replicate variance of old timestamps into the new timestamps
    for x in range(batchSize):
        diff = mrtimestamp - oldtimestamps[x]
        newtimestamps[x] = newtimestamps[x] - diff
        newtimestamps[x] = newtimestamps[x].strftime("%Y%m%d%H%M%S.%f")[:-3]
    # Switch old timestamps with new timestamps
    batch['tag_tsp'] = newtimestamps
    return batch

# Build and send JSON, wait for a sec
def build_json(batch):
    assign_timestamps(batch)
    batchlist = []
    for index, row in batch.iterrows():
        batchlist.append(row.to_dict())
    basejson['message']['data'] = batchlist
    print(basejson)
    req = requests.post(url, json = json.loads(json.dumps(basejson)), headers = headers)
    print(req.status_code)
    time.sleep(1)

while True:
    df = pd.read_csv(filePath, sep=";", parse_dates=[2], decimal=",", usecols = ['SENSOR_ID', 'KEP_UTC_TIME', 'VALUE'], dtype={'SENSOR_ID': object})
    df = df[::-1]
    df.rename(columns={'SENSOR_ID' : 'ext_id', 'KEP_UTC_TIME' : 'tag_tsp', 'VALUE' : 'tag_value_int'}, inplace=True)
    # Fill list with batches of 10 rows from the df
    list_df = [df[ i:i + batchSize] for i in range(0, df.shape[0], batchSize)]
    for batch in list_df:
        build_json(batch)
The script sends the data, but in response I get status code 500.
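Answer 0 (score: 1)
Your header values are incorrect. You need to set both the Accept and Content-Type headers as follows:
Accept: application/vnd.kafka.v2+json
Content-Type: application/vnd.kafka.json.v2+json
In addition, the data should be structured as follows: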
{"records":[{"value":{<Put your json record here>}}]}
For example:
{"records":[{"value":{"foo":"bar"}}]}
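Putting the headers and the envelope together, a minimal Python sketch might look like the following. The helper names `build_payload` and `topic_url` and the topic name `machine-data` are made up for illustration. The sketch also assumes the request goes to the REST Proxy's `/topics/<topic name>` path rather than to the root URL used in the question.

```python
import json

# Content types for the REST Proxy v2 API with JSON-embedded records
HEADERS = {
    "Content-Type": "application/vnd.kafka.json.v2+json",
    "Accept": "application/vnd.kafka.v2+json",
}


def build_payload(rows):
    """Wrap each row dict in the {"records": [{"value": ...}]} envelope."""
    return {"records": [{"value": row} for row in rows]}


def topic_url(base_url, topic):
    """Build the produce URL for a topic (the topic name is an assumption)."""
    return f"{base_url.rstrip('/')}/topics/{topic}"


if __name__ == "__main__":
    rows = [{"foo": "bar"}]  # in the question this would be batchlist
    url = topic_url("http://127.0.0.1:8082/", "machine-data")
    body = json.dumps(build_payload(rows))
    print(url)
    print(body)
    # To actually send it (needs the requests package and a running proxy):
    # requests.post(url, data=body, headers=HEADERS)
```

Passing the serialized body through `data=` keeps the Content-Type exactly as set in `HEADERS`. If the proxy still answers with an error, printing `req.text` next to `req.status_code` usually shows the reason.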
Answer 1 (score: 0)
I think the data you put into "value" has to be a string. Something like this would work:
{"records":[{"value":"{'foo':'bar'}"}]}
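If you get strange-looking messages when reading the topic, try encoding the message with base64. The original JSON string, once encoded, should look like this: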
{"records":[{"value":"eyJmb28iOiJiYXIifQ=="}]}
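A small sketch of how such a base64 value could be produced in Python is shown below. The helper name `encode_value` is hypothetical. As far as I know, base64 values belong to the REST Proxy's binary embedded format, so the request would then use `Content-Type: application/vnd.kafka.binary.v2+json` instead of the JSON one.

```python
import base64
import json


def encode_value(record):
    """Serialize a dict to compact JSON and base64-encode the bytes."""
    raw = json.dumps(record, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    payload = {"records": [{"value": encode_value({"foo": "bar"})}]}
    print(json.dumps(payload))
```

The compact separators matter here: with the default `json.dumps` spacing the encoded string would differ from the one shown above, although it would still decode to equivalent JSON.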