I have a Python list of dictionaries like this:
data = [{"cust_decision": "buy", "cust_details": "Easy to use"}, {"cust_decision": "buy", "cust_details": "econoimical"}, {"cust_decision":"no buy", "cust_details": "Didn’t like Product"}]
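I am creating a PySpark DataFrame and a temp view from this data:
from pyspark.sql import SparkSession, Row
spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists
spark.createDataFrame([Row(**i) for i in data]).createOrReplaceTempView("cust")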
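Now, when I look at the data in this temp view, the special character ’ (this is not a straight single quote; it is the curly apostrophe used in the data above) is turned into a different character, â. Here is the result: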
spark.table("cust").show(10,False)
+-------------+---------------------+
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didnâ€™t like Product|
+-------------+---------------------+
But I want the character to appear exactly as it is in the original value. How can I achieve this? The expected result is:
+-------------+---------------------+
|cust_decision|cust_details         |
+-------------+---------------------+
|buy          |Easy to use          |
|buy          |econoimical          |
|no buy       |Didn’t like Product  |
+-------------+---------------------+
Thanks.
Answer 0: (score: 1)
Try decoding the values in your data dictionaries as utf-8.
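One way to read this suggestion, as a minimal sketch rather than the answerer's exact code: it assumes the values reach Python as UTF-8 encoded bytes (as string literals are in Python 2) and that the garbling comes from those bytes being read as Windows-1252. The helper name to_text and the sample list raw are hypothetical and only there for illustration.

```python
# Hypothetical helper, not from the original answer.
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is raw bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Assumption: the dictionary values arrive as UTF-8 encoded bytes.
raw = [
    {"cust_decision": b"buy", "cust_details": b"Easy to use"},
    {"cust_decision": b"no buy",
     "cust_details": "Didn\u2019t like Product".encode("utf-8")},
]

# Decode every value before handing the rows to Spark.
decoded = [{key: to_text(val) for key, val in row.items()} for row in raw]

# How the garbling can arise: the three UTF-8 bytes of the curly
# apostrophe (U+2019) are read one by one as Windows-1252 characters.
garbled = "Didn\u2019t".encode("utf-8").decode("cp1252")

# Reversing that misreading recovers the original text.
repaired = garbled.encode("cp1252").decode("utf-8")
```

The decoded list can then be passed to spark.createDataFrame([Row(**i) for i in decoded]) in the same way as data in the question. If the values are already text and only the console shows the wrong character, the terminal or notebook output encoding is the more likely place to look.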