Formatting specific rows in Jupyter Notebook data frame output

Question

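I have a data frame displayed in a Jupyter Notebook. The initial data type of the data frame is float. I want rows 1 and 3 of the printed table shown as integers and rows 2 and 4 as percentages. How do I do that? (I have spent many hours looking for a solution without success.)

Here is the code I am using: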

#Creating the table
clms = sales.columns
indx = ['# of Poeple','% of Poeple','# Purchased per Activity','% Purchased per Activity']
basic_stats = pd.DataFrame(data=np.nan,index=indx,columns=clms)
basic_stats.head()

#Calculating the # of people who took part in each activity
for clm in sales.columns:
    basic_stats.iloc[0][clm] = int(round(sales[sales[clm]>0][clm].count(),0))

#Calculating the % of people who took part in each activity from the total email list
for clm in sales.columns:
    basic_stats.iloc[1][clm] = round((basic_stats.iloc[0][clm] / sales['Sales'].count())*100,2)

#Calculating the # of people who took part in each activity AND that bought the product
for clm in sales.columns:
    basic_stats.iloc[2][clm] = int(round(sales[(sales[clm] >0) & (sales['Sales']>0)][clm].count()))

#Calculating the % of people who took part in each activity AND that bought the product
for clm in sales.columns:
    basic_stats.iloc[3][clm] = round((basic_stats.iloc[2][clm] / basic_stats.iloc[0][clm])*100,2)

#Present the table
basic_stats

This is the printed table: (image: Output table of 'basic_stats' data frame in Jupyter Notebook)

Answer 1

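Integer representation

You are already assigning integers to the cells of rows 1 and 3. These integers print as floats because every column has the data type float64, so each assigned integer is converted to a float when it is stored. This comes from the way you initially create the DataFrame. You can see the data types by printing the .dtypes attribute: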

basic_stats = pd.DataFrame(data=np.nan,index=indx,columns=clms)
print(basic_stats.dtypes)

# Prints:
# column1    float64
# column2    float64
# ...
# dtype: object

If you do not pass the data keyword argument to the DataFrame constructor, the data type of every cell is object, which can hold any Python object:

basic_stats = pd.DataFrame(index=indx,columns=clms)
print(basic_stats.dtypes)

# Prints:
# column1    object
# column2    object
# ...
# dtype: object

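When a cell's data type is object, its contents are formatted with the object's own built-in string conversion, so integers are displayed as integers.

Percent representation

To display percentages, you can use a custom class that prints a float the way you want: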
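As a minimal sketch of the difference, the following builds two tiny frames and stores the same integer in each. The labels and the names rows, cols, as_float, and as_object are made up for this illustration and are not part of the question:

```python
import numpy as np
import pandas as pd

# Hypothetical labels, used only for this illustration
rows = ["count", "share"]
cols = ["a", "b"]

# Created with data=np.nan: every column is float64
as_float = pd.DataFrame(data=np.nan, index=rows, columns=cols)
as_float.loc["count", "a"] = 3

# Created without data: every column is object
as_object = pd.DataFrame(index=rows, columns=cols)
as_object.loc["count", "a"] = 3

print(type(as_float.loc["count", "a"]))   # a float type: the 3 was stored as 3.0
print(type(as_object.loc["count", "a"]))  # int: the 3 was stored unchanged
```

The trade-off is that object columns give up the fast numeric operations pandas offers for float64 columns, which matters little for a small summary table like this one.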

class PercentRepr(object):
    """Represents a floating point number as percent"""
    def __init__(self, float_value):
        self.value = float_value
    def __str__(self):
        return "{:.2f}%".format(self.value*100)

Then use this class for the values of the two percentage rows (positions 1 and 3 when counting from 0):

import pandas as pd

#Creating the table (no data argument, so every column has dtype object)
clms = sales.columns
indx = ['# of Poeple','% of Poeple','# Purchased per Activity','% Purchased per Activity']
basic_stats = pd.DataFrame(index=indx,columns=clms)
basic_stats.head()

#Cells are set with .loc[row, column]; chained indexing such as
#basic_stats.iloc[0][clm] = ... may not write to the frame in recent pandas versions

#Calculating the # of people who took part in each activity
for clm in sales.columns:
    basic_stats.loc[indx[0], clm] = int(round(sales[sales[clm]>0][clm].count(),0))

#Calculating the % of people who took part in each activity from the total email list
for clm in sales.columns:
    basic_stats.loc[indx[1], clm] = PercentRepr(basic_stats.loc[indx[0], clm] / sales['Sales'].count())

#Calculating the # of people who took part in each activity AND that bought the product
for clm in sales.columns:
    basic_stats.loc[indx[2], clm] = int(round(sales[(sales[clm] >0) & (sales['Sales']>0)][clm].count()))

#Calculating the % of people who took part in each activity AND that bought the product
for clm in sales.columns:
    basic_stats.loc[indx[3], clm] = PercentRepr(basic_stats.loc[indx[2], clm] / basic_stats.loc[indx[0], clm])

#Present the table
basic_stats

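Note: this actually changes the data in the data frame! If you want to use the data in the percentage rows for further processing, be aware that these rows no longer contain float objects.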
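If the numbers are still needed for later calculations, one option is to leave the numeric frame alone and build a separate copy only for display. The sketch below assumes a float frame shaped like the one in the question. The helper format_rows, the shortened row labels, and the sample values are hypothetical and only illustrate the idea:

```python
import pandas as pd


def format_rows(df, formats):
    """Return a display copy of df with the given rows rendered as strings.

    formats maps a row label to a str.format template, e.g. "{:.0f}".
    The original frame is left untouched.
    """
    shown = df.astype(object)  # astype returns a new frame
    for label, template in formats.items():
        shown.loc[label] = [template.format(value) for value in df.loc[label]]
    return shown


# Made-up numbers standing in for the real basic_stats
stats = pd.DataFrame(
    {"Email": [120.0, 40.0, 30.0, 25.0], "Call": [60.0, 20.0, 12.0, 20.0]},
    index=["# of People", "% of People", "# Purchased", "% Purchased"],
)

shown = format_rows(
    stats,
    {
        "# of People": "{:.0f}",
        "% of People": "{:.2f}%",
        "# Purchased": "{:.0f}",
        "% Purchased": "{:.2f}%",
    },
)
print(shown)
```

Here stats keeps its float64 columns, so sums and comparisons still work on it, while shown holds only strings and is meant purely for printing.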

Answer 2

Here is one way. It is something of a hack, but if you only need it for pretty printing, it will work.

import numpy as np
import pandas as pd

# object dtype lets a cell hold either a number or a string
df = pd.DataFrame(np.random.random(20).reshape(4,5)).astype(object)

# first and third rows display as integers
df.loc[0,:] = df.loc[0,:]*100
df.loc[2,:] = df.loc[2,:]*100

df.loc[0,:] = df.loc[0,:].astype(int).astype(str)
df.loc[2,:] = df.loc[2,:].astype(int).astype(str)

# second and fourth rows display as percents (with 2 decimals)
# scale first, then round, so the stored values keep at most 2 decimals
df.loc[1,:] = np.round(df.loc[1,:].values.astype(float)*100,2)
df.loc[3,:] = np.round(df.loc[3,:].values.astype(float)*100,2)
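When the percentage rows hold fractions between 0 and 1, as the random data above does, Python's % format specifier can handle both the scaling and the percent sign. This is only a sketch, with fixed sample values chosen here for reproducibility rather than taken from the answer:

```python
import pandas as pd

# Fixed sample fractions; cast to object so cells can hold strings
df = pd.DataFrame(
    [[0.5, 0.25], [0.1234, 0.9], [0.75, 0.1], [0.05, 1.0]]
).astype(object)

# "{:.2%}" multiplies by 100, keeps two decimals, and appends a percent sign
for row in (1, 3):
    df.loc[row, :] = df.loc[row, :].map("{:.2%}".format)

print(df)
```

As with the rest of this answer, the formatted rows now contain strings, so this suits display only.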
