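I'm programming in Python 3 and want to use a pandas DataFrame to print the dimensions of a dataset (a CSV file), and to do a few other things I don't fully understand. This is only an example, since I just need an explanation of how to do it. Say I have two functions:
In func1 I load (let's assume) a dataset with pandas: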
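import pandas as pd

def func1(a):
    namesOfColumns = ["The sepal-length", "The sepal-width", "The petal-length", "The petal-width", "class"]
    # names= assigns these labels; this assumes the CSV has no header row of its own
    some_file = pd.read_csv(a, names=namesOfColumns)
    return some_file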
def func2(data):
    # code for printing the dimensions of the dataset
    # code for printing the top 3 lines
    # code for printing the mean and standard deviation of the sepal-width
    # code for drawing a box plot of each attribute
    pass  # placeholder so the skeleton is valid Python
Could someone explain how to carry out the steps in func2?
Answer 0 (score: 1)
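Code for printing the dimensions of the dataset:
data.info() # Prints descriptive info about the DataFrame (it prints by itself and returns None, so no print() is needed)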
print(data.shape) # gives a tuple with the shape of DataFrame
Code for printing the top 3 lines:
print(data.head(3))
Code for printing the mean and standard deviation of the sepal width:
print(data.describe()) # General statistics
print(data['The sepal-width'].mean(), data['The sepal-width'].std()) # Mean & std dev of that column only; use the name exactly as it appears in data.columns
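One detail that is easy to miss: `Series.std()` in pandas returns the sample standard deviation (`ddof=1`) by default, whereas NumPy's `std` defaults to the population version (`ddof=0`). A minimal sketch of the difference, using made-up values rather than anything from the actual file:

```python
import pandas as pd

# Made-up values for illustration only; not taken from the real dataset.
widths = pd.Series([3.0, 3.5, 4.0], name="The sepal-width")

mean_width = widths.mean()
sample_std = widths.std()            # ddof=1, the pandas default
population_std = widths.std(ddof=0)  # ddof=0, what NumPy uses by default

print(mean_width, sample_std, population_std)
```

For a dataset of any reasonable size the two values are close, but it helps to know which one you are reporting.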
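Code for drawing a box plot of each attribute:
data.boxplot() # One box per numeric column; namesOfColumns is local to func1 and the "class" column is not numeric
plt.show() # Displays the figure; needs import matplotlib.pyplot as plt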
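Putting the pieces together, func2 could look roughly like the sketch below. It is only one way to arrange the steps. The assumptions are mine: the CSV has no header row, the column labels are the ones from the question, and the four sample rows are invented so the example runs without the real file.

```python
import io

import pandas as pd

COLUMN_NAMES = ["The sepal-length", "The sepal-width", "The petal-length",
                "The petal-width", "class"]


def func1(source):
    """Load a CSV without a header row and label its columns."""
    return pd.read_csv(source, header=None, names=COLUMN_NAMES)


def func2(data):
    """Print the shape, the first rows and basic statistics, then draw box plots."""
    print(data.shape)    # (number of rows, number of columns)
    print(data.head(3))  # first three rows

    width = data["The sepal-width"]
    print(width.mean(), width.std())  # mean and sample standard deviation

    # With no arguments, boxplot() draws one box per numeric column,
    # so the text column "class" is left out automatically.
    return data.boxplot()


# Invented rows standing in for the real file (assumption, not real data).
sample_csv = io.StringIO(
    "5.0,3.0,1.5,0.2,setosa\n"
    "6.0,3.5,4.5,1.5,versicolor\n"
    "7.0,4.0,6.0,2.5,virginica\n"
    "6.0,3.5,5.0,2.0,virginica\n"
)

data = func1(sample_csv)
ax = func2(data)
```

In an interactive session, call `plt.show()` (after `import matplotlib.pyplot as plt`) to display the figure, and pass a real file path to func1 instead of the `io.StringIO` object.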