How do I create a line plot for a PySpark DataFrame?

Time: 2018-11-01 16:06:29

Tags: python pandas pyspark pyspark-sql

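I have a DataFrame with three columns and I am trying to draw a line plot with the Seaborn library, but it throws an error saying 'DataFrame' object has no attribute 'get'. This is my test DataFrame: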

Age  variable   value
31   Overall    69.76751118
31   Potential  69.76751118
31   Growth     0
34   Overall    68.91176471
34   Potential  68.91176471
34   Growth     0
28   Overall    69.05803996
28   Potential  69.05803996
28   Growth     0.24643197

This is what I tried with the seaborn line plot after reading the CSV file:

test = spark.read.csv("test.csv", inferSchema=True, header=True)
sns.lineplot(x = "Age", y = "value", hue = "variable", data = test)

This is the error I get:

AttributeError: 'DataFrame' object has no attribute 'get'

However, when I convert the DataFrame to a pandas DataFrame and use exactly the same Seaborn code, it works:

test_df = test.toPandas()
sns.lineplot(x = "Age", y = "value", hue = "variable", data = test_df)

Am I doing something wrong with the Spark DataFrame?

1 Answer:

Answer 0 (score: 1):

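Although they share many functions, Spark DataFrames and pandas DataFrames differ in where and how the data is distributed: a Spark DataFrame is partitioned across the cluster's executors, while a pandas DataFrame sits entirely in the memory of a single Python process.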

This step is correct:

test_df = test.toPandas()

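You always need to collect the data first before you can use it for plotting with seaborn (or even matplotlib), because those libraries expect a local, pandas-like object rather than a distributed one.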
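The collect-then-plot rule can be wrapped in a small helper so the same plotting call accepts either kind of DataFrame. This is only a sketch: `to_local`, `lineplot_any`, and the `max_rows` cap are hypothetical names made up for illustration and are not part of PySpark or seaborn. The only Spark calls it relies on are `DataFrame.limit()` and `DataFrame.toPandas()`.

```python
def to_local(df, max_rows=None):
    """Return a local copy of df if it looks like a Spark DataFrame.

    Duck-typed check: any object exposing toPandas() is treated as a Spark
    DataFrame; anything else (for example an existing pandas DataFrame) is
    returned unchanged.

    max_rows is a hypothetical safety cap. When set, it is applied with
    limit() before collecting, so a large table is not pulled onto the
    driver in full. Without an explicit ordering, limit() keeps an
    arbitrary subset of rows.
    """
    if not hasattr(df, "toPandas"):
        return df  # already local, nothing to collect
    if max_rows is not None:
        df = df.limit(max_rows)
    return df.toPandas()


def lineplot_any(df, **kwargs):
    """Collect df if needed, then pass it to seaborn.lineplot."""
    import seaborn as sns  # imported lazily so to_local() works without seaborn
    return sns.lineplot(data=to_local(df), **kwargs)
```

With the question's data this would be called as `lineplot_any(test, x="Age", y="value", hue="variable")`. For a large table it is usually better to aggregate or sample in Spark first and collect only the rows that will actually be drawn, since `toPandas()` loads everything it is given into the driver's memory.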