Question

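In PySpark, I want to separate the string-type columns of my df from the integer-type columns and run some descriptive analysis on the integer columns. I wrote the following code; is there a more efficient way?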

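from pyspark.sql import functions as F

# df.dtypes is a list of (column name, type name) pairs, so unpack it directly
for item, dtype in df.dtypes:
    if dtype == 'string':
        print("this column is a string")
    else:
        df.agg(F.min(df[item])).show()
        # named max_df so the built-in max() is not shadowed
        max_df = df.agg(F.max(df[item]))
        max_df.show()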

Answer 1

To analyze the structure of the DataFrame you can use:

df.describe()
df.schema  # a property, not a method
df.toPandas().info()  # collects the whole DataFrame to the driver
df.cube("col1", "col2").count().show()  # "col2" is optional

Last but not least, there is DataFrameStatFunctions (available as df.stat).
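One way to avoid the per-column loop is to pick the numeric columns out of df.dtypes once and describe them in a single pass. This is only a sketch, assuming df is a PySpark DataFrame; split_columns and describe_numeric are hypothetical helper names, not part of PySpark, and the list of numeric type names is my assumption about which types you want to treat as numeric.

```python
# Type names as they appear in df.dtypes (assumed set of "numeric" types).
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}


def is_numeric(dtype):
    """Return True for plain numeric type names and for decimal(p,s)."""
    return dtype in NUMERIC_TYPES or dtype.startswith("decimal")


def split_columns(dtypes):
    """Split (name, type) pairs, as returned by df.dtypes, into
    string column names and numeric column names."""
    string_cols = [name for name, dtype in dtypes if dtype == "string"]
    numeric_cols = [name for name, dtype in dtypes if is_numeric(dtype)]
    return string_cols, numeric_cols


def describe_numeric(df):
    """Hypothetical helper: summary statistics for the numeric columns only.
    describe() reports count, mean, stddev, min and max per column."""
    _, numeric_cols = split_columns(df.dtypes)
    return df.select(*numeric_cols).describe()
```

With a real DataFrame, describe_numeric(df).show() would print the summary for all numeric columns at once instead of running one aggregation per column.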
