Aliases in Spark DataFrames

Time: 2017-04-23 15:26:12

Tags: apache-spark apache-spark-sql

How do I define an alias when a subquery has multiple columns? In the example below, I want to define an alias for avg(high) in the output of d.

For example:

class User(Base, DictSerializable):
    __tablename__ = 'users'
    __table_args__ = dict(schema='user')

    id = Column(types.Id, primary_key=True)
    # other fields


class UserPreferences(Base, DictSerializable):
    __tablename__ = 'user_preferences'
    __table_args__ = dict(schema='user')

    id = Column(types.Id, primary_key=True)
    user_id = Column(types.Id, ForeignKey(User.id))
    ignored_categories = Column(types.ARRAY(types.Number), default=[])
    # other fields

    user = relationship("User",
                        backref=backref("preferences", single_parent=True, cascade="all, delete-orphan",
                                        passive_deletes=True, uselist=False),
                        )

class Question(Base, DictSerializable):
    __tablename__ = 'questions'
    __table_args__ = dict(schema='question')

    id = Column(types.Id, primary_key=True)
    user_id = Column(types.Id, ForeignKey(User.id, ondelete="CASCADE", onupdate="CASCADE"))

    # other fields

    category_ids = Column(types.ARRAY(types.Integer))

    user = relationship("User", foreign_keys=user_id,
                        backref=backref("questions", order_by=id, single_parent=True, uselist=True,
                                        cascade="all, delete-orphan", passive_deletes=True)
                        )


question_categories = Table('question_categories', Base.metadata, 
    Column('question_id', types.Integer, ForeignKey(Question.id)), 
    Column('category_id', types.Integer, ForeignKey(Category.id)) )
Question.categories = relationship(Category, secondary=question_categories, backref=backref('questions'))

1 answer:

Answer 0: (score: 6)

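You can use withColumnRenamed. The column produced by avg("high") is named avg(high) by default, so rename it after the aggregation: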

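val d = c
   .select("date","high")
   .groupBy("date")
   // avg("high") yields a column named "avg(high)"
   .avg("high")
   // rename the generated column to the desired alias
   .withColumnRenamed("avg(high)", "Average High")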
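
As an alternative to renaming after the fact, the alias can be attached to the aggregate expression itself with agg and alias. The following is a minimal sketch, assuming Spark 2.x with a local SparkSession; the object name AliasExample, the helper averageHigh, and the sample rows are hypothetical and only stand in for the DataFrame c from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.avg

object AliasExample {
  // Hypothetical helper: groups by "date" and names the average column directly.
  def averageHigh(c: DataFrame): DataFrame =
    c.groupBy("date")
      .agg(avg("high").alias("Average High"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("alias-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the DataFrame `c`.
    val c = Seq(
      ("2017-04-21", 10.0),
      ("2017-04-21", 20.0),
      ("2017-04-22", 30.0)
    ).toDF("date", "high")

    val d = averageHigh(c)
    d.printSchema() // columns: date, Average High

    spark.stop()
  }
}
```

Either form should give the same result; agg with alias avoids having to know the auto-generated column name, and it extends naturally when several aggregates each need their own alias.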