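Pandas DataFrame unexpectedly drops columns: strange behavior

Time: 2018-09-21 19:50:50

Tags: python pandas dataframe

I am running into strange behavior when trying to create a DataFrame. I have a list of dictionaries that I want to convert into a DataFrame. However, two columns are unintentionally dropped during creation, and I am not sure why.

Here is my list: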

    data_income_stmt = [{'ticker': 'ADBE', 'FY': 2017, 'statement': 'income_statement', 'operatingrevenue': 7301505000.0, 'totalrevenue': 7301505000.0, 'operatingcostofrevenue': 1010491000.0, 'totalcostofrevenue': 1010491000.0, 'totalgrossprofit': 6291014000.0, 'sgaexpense': 624706000.0, 'marketingexpense': 2197592000.0, 'rdexpense': 1224059000.0, 'amortizationexpense': 76562000.0, 'totaloperatingexpenses': 4122919000.0, 'totaloperatingincome': 2168095000.0, 'totalinterestexpense': 74402000.0, 'totalinterestincome': 7553000.0, 'otherincome': 36395000.0, 'totalotherincome': -30454000.0, 'totalpretaxincome': 2137641000.0, 'incometaxexpense': 443687000.0, 'netincomecontinuing': 1693954000.0, 'netincome': 1693954000.0, 'netincometocommon': 1693954000.0, 'weightedavebasicsharesos': 493632000.0, 'basiceps': 3.43, 'weightedavedilutedsharesos': 501123000.0, 'dilutedeps': 3.38, 'weightedavebasicdilutedsharesos': 493900000.0, 'basicdilutedeps': 3.43}, {'ticker': 'ADBE', 'FY': 2016, 'statement': 'income_statement', 'operatingrevenue': 5854430000.0, 'totalrevenue': 5854430000.0, 'operatingcostofrevenue': 819908000.0, 'totalcostofrevenue': 819908000.0, 'totalgrossprofit': 5034522000.0, 'sgaexpense': 576202000.0, 'marketingexpense': 1910197000.0, 'rdexpense': 975987000.0, 'amortizationexpense': 78534000.0, 'totaloperatingexpenses': 3540920000.0, 'totaloperatingincome': 1493602000.0, 'totalinterestexpense': 70442000.0, 'totalinterestincome': -1570000.0, 'otherincome': 13548000.0, 'totalotherincome': -58464000.0, 'totalpretaxincome': 1435138000.0, 'incometaxexpense': 266356000.0, 'netincomecontinuing': 1168782000.0, 'netincome': 1168782000.0, 'netincometocommon': 1168782000.0, 'weightedavebasicsharesos': 498345000.0, 'basiceps': 2.35, 'weightedavedilutedsharesos': 504299000.0, 'dilutedeps': 2.32, 'weightedavebasicdilutedsharesos': 497400000.0, 'basicdilutedeps': 2.35}, {'ticker': 'ADBE', 'FY': 2015, 'statement': 'income_statement', 'operatingrevenue': 4795511000.0, 'totalrevenue': 
4795511000.0, 'operatingcostofrevenue': 744317000.0, 'totalcostofrevenue': 744317000.0, 'totalgrossprofit': 4051194000.0, 'sgaexpense': 533478000.0, 'marketingexpense': 1683242000.0, 'rdexpense': 862730000.0, 'amortizationexpense': 68649000.0, 'totaloperatingexpenses': 3148099000.0, 'totaloperatingincome': 903095000.0, 'totalinterestexpense': 64184000.0, 'totalinterestincome': 961000.0, 'otherincome': 33909000.0, 'totalotherincome': -29314000.0, 'totalpretaxincome': 873781000.0, 'incometaxexpense': 244230000.0, 'netincomecontinuing': 629551000.0, 'netincome': 629551000.0, 'netincometocommon': 629551000.0, 'weightedavebasicsharesos': 498764000.0, 'basiceps': 1.26, 'weightedavedilutedsharesos': 507164000.0, 'dilutedeps': 1.24, 'weightedavebasicdilutedsharesos': 499600000.0, 'basicdilutedeps': 1.26}, {'ticker': 'AMZN', 'FY': 2017, 'statement': 'income_statement', 'operatingrevenue': 177866000000.0, 'totalrevenue': 177866000000.0, 'operatingcostofrevenue': 137183000000.0, 'totalcostofrevenue': 137183000000.0, 'totalgrossprofit': 40683000000.0, 'sgaexpense': 3888000000.0, 'marketingexpense': 10069000000.0, 'rdexpense': 22620000000.0, 'totaloperatingexpenses': 36577000000.0, 'totaloperatingincome': 4106000000.0, 'totalinterestexpense': 848000000.0, 'totalinterestincome': 202000000.0, 'otherincome': 346000000.0, 'totalotherincome': -300000000.0, 'totalpretaxincome': 3806000000.0, 'incometaxexpense': 769000000.0, 'othergains': -4000000.0, 'netincomecontinuing': 3033000000.0, 'netincome': 3033000000.0, 'netincometocommon': 3033000000.0, 'weightedavebasicsharesos': 480000000.0, 'basiceps': 6.32, 'weightedavedilutedsharesos': 493000000.0, 'dilutedeps': 6.15, 'weightedavebasicdilutedsharesos': 479900000.0, 'basicdilutedeps': 6.32}, {'ticker': 'AMZN', 'FY': 2016, 'statement': 'income_statement', 'operatingrevenue': 135987000000.0, 'totalrevenue': 135987000000.0, 'operatingcostofrevenue': 105884000000.0, 'totalcostofrevenue': 105884000000.0, 'totalgrossprofit': 30103000000.0, 
'sgaexpense': 2599000000.0, 'marketingexpense': 7233000000.0, 'rdexpense': 16085000000.0, 'totaloperatingexpenses': 25917000000.0, 'totaloperatingincome': 4186000000.0, 'totalinterestexpense': 484000000.0, 'totalinterestincome': 100000000.0, 'otherincome': 90000000.0, 'totalotherincome': -294000000.0, 'totalpretaxincome': 3892000000.0, 'incometaxexpense': 1425000000.0, 'othergains': -96000000.0, 'netincomecontinuing': 2371000000.0, 'netincome': 2371000000.0, 'netincometocommon': 2371000000.0, 'weightedavebasicsharesos': 474000000.0, 'basiceps': 5.01, 'weightedavedilutedsharesos': 484000000.0, 'dilutedeps': 4.9, 'weightedavebasicdilutedsharesos': 473300000.0, 'basicdilutedeps': 5.01}, {'ticker': 'AMZN', 'FY': 2015, 'statement': 'income_statement', 'operatingrevenue': 107006000000.0, 'totalrevenue': 107006000000.0, 'operatingcostofrevenue': 85061000000.0, 'totalcostofrevenue': 85061000000.0, 'totalgrossprofit': 21945000000.0, 'sgaexpense': 1918000000.0, 'marketingexpense': 5254000000.0, 'rdexpense': 12540000000.0, 'totaloperatingexpenses': 19712000000.0, 'totaloperatingincome': 2233000000.0, 'totalinterestexpense': 459000000.0, 'totalinterestincome': 50000000.0, 'otherincome': -256000000.0, 'totalotherincome': -665000000.0, 'totalpretaxincome': 1568000000.0, 'incometaxexpense': 950000000.0, 'othergains': -22000000.0, 'netincomecontinuing': 596000000.0, 'netincome': 596000000.0, 'netincometocommon': 596000000.0, 'weightedavebasicsharesos': 467000000.0, 'basiceps': 1.28, 'weightedavedilutedsharesos': 477000000.0, 'dilutedeps': 1.25, 'weightedavebasicdilutedsharesos': 465600000.0, 'basicdilutedeps': 1.28}, {'ticker': 'BA', 'FY': 2017, 'statement': 'income_statement', 'operatingrevenue': 93392000000.0, 'totalrevenue': 93392000000.0, 'operatingcostofrevenue': 76066000000.0, 'totalcostofrevenue': 76066000000.0, 'totalgrossprofit': 17326000000.0, 'sgaexpense': 4094000000.0, 'rdexpense': 3179000000.0, 'otherspecialcharges': -21000000.0, 'totaloperatingexpenses': 
7252000000.0, 'totaloperatingincome': 10074000000.0, 'totalinterestexpense': 360000000.0, 'totalinterestincome': 204000000.0, 'otherincome': 129000000.0, 'totalotherincome': -27000000.0, 'totalpretaxincome': 10047000000.0, 'incometaxexpense': 1850000000.0, 'netincomecontinuing': 8197000000.0, 'netincome': 8197000000.0, 'netincometocommon': 8197000000.0, 'weightedavebasicsharesos': 602500000.0, 'basiceps': 13.6, 'weightedavedilutedsharesos': 602500000.0, 'dilutedeps': 13.43, 'weightedavebasicdilutedsharesos': 602500000.0, 'basicdilutedeps': 13.6, 'cashdividendspershare': 5.97}, {'ticker': 'BA', 'FY': 2016, 'statement': 'income_statement', 'operatingrevenue': 94571000000.0, 'totalrevenue': 94571000000.0, 'operatingcostofrevenue': 80790000000.0, 'totalcostofrevenue': 80790000000.0, 'totalgrossprofit': 13781000000.0, 'sgaexpense': 3616000000.0, 'rdexpense': 4627000000.0, 'otherspecialcharges': 7000000.0, 'totaloperatingexpenses': 8250000000.0, 'totaloperatingincome': 5531000000.0, 'totalinterestexpense': 306000000.0, 'totalinterestincome': 303000000.0, 'otherincome': 40000000.0, 'totalotherincome': 37000000.0, 'totalpretaxincome': 5568000000.0, 'incometaxexpense': 673000000.0, 'netincomecontinuing': 4895000000.0, 'netincome': 4895000000.0, 'netincometocommon': 4895000000.0, 'weightedavebasicsharesos': 635500000.0, 'basiceps': 7.7, 'weightedavedilutedsharesos': 635500000.0, 'dilutedeps': 7.61, 'weightedavebasicdilutedsharesos': 635500000.0, 'basicdilutedeps': 7.7, 'cashdividendspershare': 4.69}, {'ticker': 'BA', 'FY': 2015, 'statement': 'income_statement', 'operatingrevenue': 96114000000.0, 'totalrevenue': 96114000000.0, 'operatingcostofrevenue': 82088000000.0, 'totalcostofrevenue': 82088000000.0, 'totalgrossprofit': 14026000000.0, 'sgaexpense': 3525000000.0, 'rdexpense': 3331000000.0, 'otherspecialcharges': 1000000.0, 'totaloperatingexpenses': 6857000000.0, 'totaloperatingincome': 7169000000.0, 'totalinterestexpense': 275000000.0, 'totalinterestincome': 274000000.0, 
'otherincome': -13000000.0, 'totalotherincome': -14000000.0, 'totalpretaxincome': 7155000000.0, 'incometaxexpense': 1979000000.0, 'netincomecontinuing': 5176000000.0, 'netincome': 5176000000.0, 'netincometocommon': 5176000000.0, 'weightedavebasicsharesos': 686900000.0, 'basiceps': 7.52, 'weightedavedilutedsharesos': 686900000.0, 'dilutedeps': 7.44, 'weightedavebasicdilutedsharesos': 686900000.0, 'basicdilutedeps': 7.52, 'cashdividendspershare': 3.82}]

This is the code I use to convert it into a DataFrame:

import pandas as pd

df = pd.DataFrame(data_income_stmt)

The result is missing two columns: ticker and statement.

This is the output of running print(df.columns.values.tolist()):
['FY', 'amortizationexpense', 'basicdilutedeps', 'basiceps', 'cashdividendspershare', 'dilutedeps', 'incometaxexpense', 'marketingexpense', 'netincome', 'netincomecontinuing', 'netincometocommon', 'operatingcostofrevenue', 'operatingrevenue', 'othergains', 'otherincome', 'otherspecialcharges', 'rdexpense', 'sgaexpense', 'statement', 'ticker', 'totalcostofrevenue', 'totalgrossprofit', 'totalinterestexpense', 'totalinterestincome', 'totaloperatingexpenses', 'totaloperatingincome', 'totalotherincome', 'totalpretaxincome', 'totalrevenue', 'weightedavebasicdilutedsharesos', 'weightedavebasicsharesos', 'weightedavedilutedsharesos']

I am not sure why the columns are being dropped.

1 answer:

Answer 0 (score: 1)

No columns are missing.

Your DataFrame has 32 columns:

len(df.columns.values.tolist())

If you iterate over the list, collect all the keys, and compare them with the DataFrame's columns, they are the same.

keys = []
# Walk through every dictionary in the list and collect its keys.
for e, k in enumerate(data_income_stmt):
    keys.extend(k.keys())
    print('row', e, ' keys so far', len(set(keys)),
          'statement found in keys', 'statement' in k.keys(),
          'ticker found in keys', 'ticker' in k.keys())

# The set of collected keys should equal the set of DataFrame columns.
print('compare columns to keys', set(df.columns.values.tolist()) == set(keys))

# Check the two supposedly missing columns explicitly.
print('ticker found in keys', 'ticker' in keys)
print('ticker found in df', 'ticker' in df.columns)
print('statement found in keys', 'statement' in keys)
print('statement found in df', 'statement' in df.columns)

This prints:

row 0  keys so far 29 statement found in keys True ticker found in keys True
row 1  keys so far 29 statement found in keys True ticker found in keys True
row 2  keys so far 29 statement found in keys True ticker found in keys True
row 3  keys so far 30 statement found in keys True ticker found in keys True
row 4  keys so far 30 statement found in keys True ticker found in keys True
row 5  keys so far 30 statement found in keys True ticker found in keys True
row 6  keys so far 32 statement found in keys True ticker found in keys True
row 7  keys so far 32 statement found in keys True ticker found in keys True
row 8  keys so far 32 statement found in keys True ticker found in keys True
compare columns to keys True
ticker found in keys True
ticker found in df True
statement found in keys True
statement found in df True
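
The same comparison can be written more compactly with set operations. The following is only a sketch: `records` and `df_small` are hypothetical names, and the three shortened dictionaries stand in for the full data_income_stmt list.

```python
import pandas as pd

# Hypothetical, shortened records standing in for data_income_stmt.
records = [
    {"ticker": "ADBE", "FY": 2017, "statement": "income_statement",
     "amortizationexpense": 76562000.0},
    {"ticker": "AMZN", "FY": 2017, "statement": "income_statement",
     "othergains": -4000000.0},
    {"ticker": "BA", "FY": 2017, "statement": "income_statement",
     "cashdividendspershare": 5.97},
]

df_small = pd.DataFrame(records)

# Union of the keys of all dictionaries.
all_keys = set().union(*(record.keys() for record in records))

# Keys that did not become columns; an empty set means nothing was dropped.
missing = all_keys - set(df_small.columns)
print(len(all_keys), len(df_small.columns), missing)
```

An empty `missing` set is a quick confirmation that every key ended up as a column.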

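What may have confused you is that each dictionary has 29 keys rather than 32: the DataFrame's 32 columns are the union of the keys across all dictionaries. Either way, statement and ticker are among them.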
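
To illustrate the point about key counts, here is a minimal sketch on two hypothetical, shortened records (`records`, `df_small`, and `front` are names made up for this example). It shows that keys absent from a dictionary become NaN in that row. It also moves the identifying columns to the front, which may help because, depending on the pandas version, columns built from dictionaries can come out sorted alphabetically, as in the list printed in the question, where statement and ticker sit between sgaexpense and totalcostofrevenue.

```python
import pandas as pd

# Hypothetical, shortened records: each has 3 keys, but not the same 3.
records = [
    {"ticker": "ADBE", "FY": 2017, "marketingexpense": 2197592000.0},
    {"ticker": "BA", "FY": 2017, "cashdividendspershare": 5.97},
]

df_small = pd.DataFrame(records)

# The frame has 4 columns: the union of the keys of both dictionaries.
print(df_small.shape)

# A key absent from a dictionary shows up as NaN in that row.
print(df_small.isna().sum())

# Move the identifying columns to the front; keep the rest in their current order.
front = ["ticker", "FY"]
ordered = front + [c for c in df_small.columns if c not in front]
df_small = df_small[ordered]
print(df_small.columns.tolist())
```

Reordering this way only changes how the columns are displayed; the data itself is untouched.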