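I am creating a new column by splitting the values in an existing column.
category_list is an existing column that holds the one or more categories a company belongs to; multiple categories are separated by a pipe (|).
I am trying to determine each company's primary category by assuming that the first one (when there are several) is the primary category.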
master_frame_ven['primary_sector'] = master_frame_ven['category_list'].str.split(pat = '|')
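After running the code above, the split values are stored as lists in the primary_sector column.
To get them out of the lists and keep only the first value as a standalone string, I tried this: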
master_frame_ven['primary_sector'] = master_frame_ven['primary_sector'].apply(lambda x: x[0])
Running this raises an error:
> TypeError Traceback (most recent call last)
> <ipython-input-66-5dffe0256421> in <module>
> 1 master_frame_ven['primary_sector'] = master_frame_ven['category_list'].str.split(pat = '|')
> 2
> ----> 3 master_frame_ven['primary_sector'] = master_frame_ven['primary_sector'].apply(lambda x: x[0])
>
> TypeError: 'float' object is not subscriptable
This is the data frame I am currently working with.
> {'company_permalink': {0: '/organization/-fame', 1:
> '/organization/-qounter', 3: '/organization/-the-one-of-them-inc-',
> 4: '/organization/0-6-com', 5: '/organization/004-technologies'},
> 'funding_round_permalink': {0:
> '/funding-round/9a01d05418af9f794eebff7ace91f638', 1:
> '/funding-round/22dacff496eb7acb2b901dec1dfe5633', 3:
> '/funding-round/650b8f704416801069bb178a1418776b', 4:
> '/funding-round/5727accaeaa57461bd22a9bdd945382d', 5:
> '/funding-round/1278dd4e6a37fa4b7d7e06c21b3c1830'},
> 'funding_round_type': {0: 'venture', 1: 'venture', 3: 'venture',
> 4: 'venture', 5: 'venture'}, 'funding_round_code': {0: 'B', 1: 'A',
> 3: 'B', 4: 'A', 5: nan}, 'funded_at': {0: '05-01-2015', 1:
> '14-10-2014', 3: '30-01-2014', 4: '19-03-2008', 5:
> '24-07-2014'}, 'raised_amount_usd': {0: 10000000.0, 1: nan, 3:
> 3406878.0, 4: 2000000.0, 5: nan}, 'permalink': {0: '/organization/-fame', 1: '/organization/-qounter', 3:
> '/organization/-the-one-of-them-inc-', 4: '/organization/0-6-com',
> 5: '/organization/004-technologies'}, 'name': {0: '#fame', 1:
> ':Qounter', 3: '(THE) ONE of THEM,Inc.', 4: '0-6.com', 5: '004
> Technologies'}, 'homepage_url': {0: 'http://livfame.com', 1:
> 'http://www.qounter.com', 3: 'http://oneofthem.jp', 4:
> 'http://www.0-6.com', 5: 'http://004gmbh.de/en/004-interact'},
> 'category_list': {0: 'Media', 1: 'Application Platforms|Real
> Time|Social Network Media', 3: 'Apps|Games|Mobile', 4: 'Curated
> Web', 5: 'Software'}, 'status': {0: 'operating', 1: 'operating',
> 3: 'operating', 4: 'operating', 5: 'operating'}, 'country_code':
> {0: 'IND', 1: 'USA', 3: nan, 4: 'CHN', 5: 'USA'}, 'state_code': {0:
> '16', 1: 'DE', 3: nan, 4: '22', 5: 'IL'}, 'region': {0: 'Mumbai',
> 1: 'DE - Other', 3: nan, 4: 'Beijing', 5: 'Springfield,
> Illinois'}, 'city': {0: 'Mumbai', 1: 'Delaware City', 3: nan,
> 4: 'Beijing', 5: 'Champaign'}, 'founded_at': {0: nan, 1:
> '04-09-2014', 3: nan, 4: '01-01-2007', 5: '01-01-2010'},
> 'primary_sector': {0: 'Media', 1: 'Application Platforms', 3:
> 'Apps', 4: 'Curated Web', 5: 'Software'}}
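I don't know which part of it is a float, or why exactly I am getting this error. What should I do to extract only the primary category from each list and store it as a string, rather than keeping it as a list?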
Answer 0: (score: 0)
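A likely cause, though the five rows shown do not confirm it, is that some other rows of category_list are missing. pandas represents a missing value as NaN, which is a Python float, and .str.split leaves NaN untouched instead of turning it into a list, so x[0] fails on exactly those rows. The sketch below uses a small made-up frame (the NaN row is an assumption added for illustration, not taken from the question's data) and replaces the lambda with the .str[0] accessor, which indexes each list element-wise and passes NaN through.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the last row is an assumed missing category_list.
# np.nan is a float, which is what "'float' object is not subscriptable" points at.
master_frame_ven = pd.DataFrame({
    "category_list": [
        "Media",
        "Application Platforms|Real Time|Social Network Media",
        "Apps|Games|Mobile",
        np.nan,
    ]
})

# Boolean mask of the rows that would break `lambda x: x[0]`.
missing = master_frame_ven["category_list"].isna()

# Split on the literal pipe, then take the first element of each list.
# .str[0] propagates NaN for missing rows instead of raising TypeError.
master_frame_ven["primary_sector"] = (
    master_frame_ven["category_list"].str.split("|").str[0]
)

print(master_frame_ven)
```

Rows with a category get a plain string in primary_sector, and rows without one stay NaN. If a placeholder is preferable to NaN, chaining .fillna("Unknown") after .str[0] is one option; the label "Unknown" is only an example. Checking missing.sum() on the real data should show whether missing categories are in fact the source of the error.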