I am trying to use Spark Structured Streaming with Kinesis. Here is my code:
val kinesis = spark
  .readStream
  .format("kinesis")
  .option("streams", streamName)
  .option("region", "us-east-1")
  .option("initialPosition", "TRIM_HORIZON")
  .option("endpointUrl", "kinesis.us-east-1.amazonaws.com")
  .option("awsAccessKey", accessKey)
  .option("awsSecretKey", secretKey)
  .option("format", "json")
  .option("inferSchema", "true")
  .schema(schema)
  .load()

val query = kinesis.writeStream
  .outputMode("append")
  .format("console")
  .start()

query.awaitTermination()
I have also exported the access key and secret key as environment variables. However, when I run the code, I get the following exception:
ERROR StreamExecution: Query [id = 45b3e45a-6a3b-48c6-9183-8531f85ea537, runId = ab5bbfd6-d9f7-455c-8065-fa65de932914] terminated with error
com.amazonaws.services.kinesis.model.ResourceNotFoundException: Stream dev-rt-stream-ds-5 under account 438156723281 not found. (Service: AmazonKinesis; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: f1b1384c-2528-e50f-a358-2e59f9ad3b45)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
The following code, using the same stream name and credentials, returns the correct number of shards:
import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.kinesis.AmazonKinesisClient

val credentials = new BasicAWSCredentials(accessKey, secretKey)
val kinesisClient = new AmazonKinesisClient(credentials)
kinesisClient.setEndpoint("kinesis.us-east-1.amazonaws.com")
val numShards = kinesisClient.describeStream(streamName).getStreamDescription().getShards().size
println("num shards => " + numShards)