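We receive transaction details from an AWS Kinesis stream. We read a batch of records, process them, sleep briefly, and then read again starting from the last sequence number.
We use a Python multiprocessing pool with starmap. We create 10 Postgres database connections and hand each record to a worker together with one of those connections. The problem appears when the same member has two transactions.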
Problem:
process1 -> member1 -> reads bal as 1000 -> deducts 300 -> updates bal to 700
process2 -> member1 -> reads bal as 1000 -> deducts 200 -> updates bal to 800
So the final balance becomes 800, but it should be 500.
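To make the interleaving concrete, here is a minimal sketch that replays the two read-modify-write sequences in the order described above. It assumes a plain in-memory dictionary in place of the Postgres table; `balances`, `read_balance` and `write_balance` are hypothetical names used only for illustration and are not part of the real code.

```python
# Hypothetical in-memory stand-in for the balance table (not the real schema).
balances = {"member1": 1000}


def read_balance(member):
    """Return the currently stored balance for a member."""
    return balances[member]


def write_balance(member, value):
    """Overwrite the stored balance for a member."""
    balances[member] = value


# Replay the steps in the order two pool workers might execute them.
bal_p1 = read_balance("member1")         # process1 reads 1000
bal_p2 = read_balance("member1")         # process2 also reads 1000
write_balance("member1", bal_p1 - 300)   # process1 writes 700
write_balance("member1", bal_p2 - 200)   # process2 overwrites it with 800

lost_update_result = balances["member1"]
expected_result = 1000 - 300 - 200

print(lost_update_result, expected_result)  # 800 500
```

Both workers compute the new balance from the same stale read, so the second write silently discards the first deduction.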
Existing code snippet:
def process_records(self):
    processes = cpu_count()*5
    engine_list = []
    for p in range(processes):
        engine_list.append(get_engine())
    print("Initializing the pool with " + str(processes) + " processes")
    pool = Pool(processes)
    try:
        while self.iterator:
            print("Get records from kinesis")
            response = self.client.get_records(ShardIterator=self.iterator)
            records = response.get('Records')
            print("Read " + str(len(records)) + " records")
            print("Initializing " + str(processes) + " processes")
            start = 0
            chunk = 1
            while start < len(records):
                print("Processing next chunk " + str(chunk) + " of records")
                arguments = zip(records[start:start+processes], engine_list)
                pool.starmap(process_record, arguments)
                start += processes
                chunk += 1
            if len(records):
                record = records[-1]
                print("Below is Sequence Number")
                print(record['SequenceNumber'])
                self.save_last_seq_num(engine_list[0],
                                       record['SequenceNumber'],
                                       record['ApproximateArrivalTimestamp'])
            self.iterator = response['NextShardIterator']
            print(response['MillisBehindLatest'])
            sleep_time = 0.2
            print("Sleeping for " + str(sleep_time) + " seconds")
            time.sleep(sleep_time)
    except Exception as e:
        pool.close()
        pool.join()
        print(e)
        sleep_time = 1
        print("Sleeping for " + str(sleep_time) + " second")
        time.sleep(1)
How can I get rid of this problem? Any suggestions?
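One direction that might be worth trying is to let the database do the subtraction in a single UPDATE statement instead of reading the balance into Python and writing it back. The sketch below only illustrates the idea: it uses the standard-library `sqlite3` module as a stand-in for Postgres, and the `accounts` table, its `member_id` and `bal` columns, and the `deduct` helper are all assumptions, not the real schema or code. In Postgres, an UPDATE takes a row-level lock, so two concurrent deductions on the same row are applied one after the other; `SELECT ... FOR UPDATE` inside a transaction is another option when the new value has to be computed in Python.

```python
import sqlite3


def deduct(conn, member_id, amount):
    """Subtract `amount` from the member's balance in one UPDATE statement."""
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.execute(
            "UPDATE accounts SET bal = bal - ? WHERE member_id = ?",
            (amount, member_id),
        )


# Hypothetical schema, held in memory purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (member_id TEXT PRIMARY KEY, bal INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('member1', 1000)")
conn.commit()

# Each deduction is relative to the stored value, never to a stale read.
deduct(conn, "member1", 300)
deduct(conn, "member1", 200)

final_bal = conn.execute(
    "SELECT bal FROM accounts WHERE member_id = 'member1'"
).fetchone()[0]
print(final_bal)  # 500
```

Because each statement subtracts from whatever value is currently stored, the order in which the two deductions arrive no longer changes the result. If per-member ordering also matters, grouping the records of a batch by member and sending each group to a single worker may be a complementary option.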