I am trying to convert a DataFrame to a JSON string. I am using PySpark.
Here is the code I am using:
from pyspark.sql.functions import col, lit

def produceTrainData(self, csvData):  # Array[String] = {
    # Add constant-valued columns to every row
    trainData = csvData.withColumn("therapyClass", lit("REMODULIN"))\
        .withColumn("patientAge", lit(52))\
        .withColumn("patientSex", lit("M"))\
        .withColumn("serviceType", lit("PHARMACY"))\
        .withColumn("npiId", lit("27"))\
        .withColumn("requestID", lit(419568891))\
        .withColumn("requestDateTime", lit("20171909 21:30:55"))
    # Keep only the columns of interest
    selectData = trainData.select("payorId", "patientId", "therapyType", "therapyClass", "ndcNumber", "procedureCode", "patientAge", "patientSex",
                                  "placeOfService", "serviceDuration", "daysOrUnits", "charges", "serviceDate", "serviceType", "serviceBranchId",
                                  "npiId", "diagnosisCode", "authNbr", "requestID", "requestDateTime")
    # Drop rows whose authNbr is the placeholder "-"
    authNbrFilter = col("authNbr") != "-"
    filterData = selectData.where(authNbrFilter)  # .limit(20)
    print(filterData)
    filterData.show(20, False)
    jsons = filterData.toJSON
    print(jsons)
There are two errors:
I would like to understand what causes them.
Does anyone know the reason for this error?
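
The error messages themselves are not shown above, so the following is only a guess at one likely contributor. In PySpark, `toJSON` is a method of `DataFrame`, and the code assigns `filterData.toJSON` without parentheses, which stores the method itself, not its result. The sketch below illustrates that difference in plain Python. `StandInFrame` is a hypothetical stand-in class (not part of PySpark) so the example runs without a Spark session, and its sample row is made up for illustration.

```python
import json


class StandInFrame:
    """Hypothetical stand-in for a Spark DataFrame; not part of PySpark."""

    def __init__(self, rows):
        self.rows = rows

    def toJSON(self):
        # The real DataFrame.toJSON() returns an RDD of JSON strings;
        # this stand-in returns a plain list so the sketch runs without Spark.
        return [json.dumps(row) for row in self.rows]


# Made-up sample row, for illustration only
filterData = StandInFrame([{"authNbr": "A1", "patientAge": 52}])

not_called = filterData.toJSON  # attribute access only: a bound method object
called = filterData.toJSON()    # the call actually produces the JSON strings

print(not_called)  # prints a "<bound method ...>" description, not data
print(called)      # prints the list of JSON strings
```

With a real PySpark DataFrame, `filterData.toJSON().collect()` would bring the JSON strings back to the driver as a Python list. Since `collect()` pulls every row, it is best kept to small results.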