Using the .assign() method with a lambda in Python

Time: 2018-03-10 17:25:36

Tags: python pandas dataframe lambda assign

I am running this code in Python:

 #Declaring these now for later use in the plots
 TOP_CAP_TITLE = 'Top 10 market capitalization'
 TOP_CAP_YLABEL = '% of total cap'

 # Selecting the first 10 rows and setting the index
 cap10 = cap.loc[:10, :].set_index('id')

 # Calculating market_cap_perc
 cap10 = cap10.assign(market_cap_perc =
      lambda x: (x.market_cap_usd / cap.market_cap_usd.sum()) * 100)
 # Plotting the barplot with the title defined above 
 ax = cap10.plot.bar(x= id, y= market_cap_perc)
 ax.set_title(TOP_CAP_TITLE)
 # Annotating the y axis with the label defined above
 ax.set_ylabel(TOP_CAP_YLABEL)

I get this error:

NameError                                 Traceback (most recent call last) in ()
     10       lambda x: (x.market_cap_usd / cap.market_cap_usd.sum()) * 100)
     11 # Plotting the barplot with the title defined above
---> 12 ax = cap10.plot.bar(x= id, y= market_cap_perc)
     13 ax.set_title(TOP_CAP_TITLE)
     14 # Annotating the y axis with the label defined above

NameError: name 'market_cap_perc' is not defined

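The code is from Task 4 of the DataCamp project "Exploring the Bitcoin Cryptocurrency Market". cap is a DataFrame with a column id (e.g. 'bitcoin', 'ripple') and a column market_cap_usd, which holds each cryptocurrency's market capitalization in USD (for example, 159640995719 is the market_cap_usd of bitcoin). The instructions for this task are:

1. Select the first 10 coins, set the index to id, and assign the resulting DataFrame to cap10.

2. Calculate each coin's percentage of the total market capitalization using assign(), and assign the result to cap10 again.

3. Plot a barplot of market_cap_perc for the top 10 coins with the title "Top 10 market capitalization", and assign it to ax.

4. Using the ax object, annotate the y axis with "% of total cap".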
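A note on step 1, as a minimal sketch with made-up data: `cap.loc[:10, :]` slices by label and includes the end label, so on a default integer index it returns 11 rows, while `head(10)` returns exactly 10. The `toy_cap` frame and its values are hypothetical stand-ins for the real `cap`.

```python
import pandas as pd

# Hypothetical stand-in for `cap`: 12 made-up coins on a default integer index.
toy_cap = pd.DataFrame({
    'id': ['coin{}'.format(i) for i in range(12)],
    'market_cap_usd': [float(1200 - 100 * i) for i in range(12)],
})

# .loc slices by label and includes the end label: rows labelled 0 through 10.
by_label = toy_cap.loc[:10, :]

# .head(10) is positional: exactly the first 10 rows.
by_position = toy_cap.head(10)

print(len(by_label), len(by_position))  # 11 10
```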

I tried defining market_cap_perc before the lambda (market_cap_perc = 0) and got a different error:

KeyError                                  Traceback (most recent call last)

    2133             try:
->  2134                 return self._engine.get_loc(key)
    2135             except KeyError:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4443)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4289)()

KeyError: 139887424317984

1 answer:

Answer 0: (score: 0)

I found the answer:

import pandas as pd
# Reading datasets/coinmarketcap_06122017.csv into pandas
dec6 = pd.read_csv('datasets/coinmarketcap_06122017.csv')

# Selecting the 'id' and the 'market_cap_usd' columns
market_cap_raw = dec6[['id','market_cap_usd']]
cap = market_cap_raw.query('market_cap_usd > 0')
#Declaring these now for later use in the plots
TOP_CAP_TITLE = 'Top 10 market capitalization'
TOP_CAP_YLABEL = '% of total cap'

# Selecting the first 10 rows and setting the index
cap10 = cap.head(10).set_index('id')

# Calculating market_cap_perc
cap10 = cap10.assign(
    market_cap_perc=lambda x: (x.market_cap_usd / cap.market_cap_usd.sum()) * 100)

# Plotting the barplot with the title defined above 
ax = cap10.market_cap_perc.head(10).plot.bar(title=TOP_CAP_TITLE)

# Annotating the y axis with the label defined above
ax.set_ylabel(TOP_CAP_YLABEL)
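
As far as I can tell, the original call failed because `x= id` and `y= market_cap_perc` are bare Python names rather than column labels. `market_cap_perc` exists only as a column of cap10, hence the NameError, and `id` resolves to the built-in function `id`, which is presumably why defining `market_cap_perc = 0` only traded the NameError for a KeyError. Column labels have to be passed as strings, and because `id` became the index after `set_index('id')`, it is used for the x axis without being passed at all. Below is a minimal sketch of that form of the call; `toy_cap`, `toy_top` and their values are made up, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so this runs without a display
import pandas as pd

# Hypothetical stand-in for `cap` (made-up values).
toy_cap = pd.DataFrame({
    'id': ['coin_a', 'coin_b', 'coin_c', 'coin_d'],
    'market_cap_usd': [400.0, 200.0, 100.0, 100.0],
})

toy_top = toy_cap.head(3).set_index('id')
toy_top = toy_top.assign(
    market_cap_perc=lambda x: x.market_cap_usd / toy_cap.market_cap_usd.sum() * 100)

# 'id' is now the index, so it labels the x axis automatically;
# the y column is passed by its label, as a string.
ax = toy_top.plot.bar(y='market_cap_perc', title='Top 3 market capitalization')
ax.set_ylabel('% of total cap')

print(toy_top.market_cap_perc.tolist())  # [50.0, 25.0, 12.5]
```

Selecting the column first and plotting the resulting Series, as in the answer above, gives the same bars; passing `y='market_cap_perc'` just keeps the DataFrame intact.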