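Do you know how to solve this problem in Python? I would like a DataFrame in which the data is arranged in the correct columns.
Thanks!
Here is an example of a string from the DataFrame.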
'Huidigefuncties Michael Jordan 2015 - present Director Marketing & Indirect Channels, Ricoh Nederland ... Harvard'
Preferred result
type     from  to       function                                   organization
current  2015  present  Director Marketing & Indirect Channels     Ricoh Nederland
current  2010  present  Owner & Consultant                         Basketball Center
old      2012  2015     Director Marketing & Business Development  Ricoh
school   1988  1992     Marketing                                  Harvard
Current df
Name Data
Michael Jordan ' Huidigefuncties Michael Jordan 2015 - present Director Marketing & Indirect Channels, Ricoh Nederland 2010 - present Basketball Center, Center for Business-Expertise Loopbaan Michael Jordan 2012 - 2015 Director Marketing & Business Development, Ricoh Opleiding Michael Jordan 1988 - 1992 Marketing , Harvard '
Answer 0 (score: 0)
Well, here is the solution I came up with for this problem:
import pandas as pd

beautiful_data = 'Huidigefuncties Michael Jordan 2015 - present Director Marketing & Indirect Channels, Ricoh Nederland 2010 - present Basketball Center, Center for Business-Expertise Loopbaan Michael Jordan 2012 - 2015 Director Marketing & Business Development, Ricoh Opleiding Michael Jordan 1988 - 1992 Marketing , Harvard'
main_dict = {'type': [], 'from': [], 'to': [], 'function': [], 'organization': []}
data = beautiful_data.split(' ')
i = 0
# Positions of the three section headers in the token list.
huidi_index = data.index('Huidigefuncties')
loopbaan_index = data.index('Loopbaan')
ople_index = data.index('Opleiding')
# print(data)
while i < len(data):
    # Cut out the text of the section that starts at token i.
    if data[i] == 'Huidigefuncties':
        line = ' '.join(data[i + 1: loopbaan_index])
        i = loopbaan_index
        print(line)
        type_data = 'current'
    elif data[i] == 'Loopbaan':
        line = ' '.join(data[i + 1: ople_index])
        i = ople_index
        print(line)
        type_data = 'old'
    elif data[i] == 'Opleiding':
        line = ' '.join(data[i + 1:])
        i = len(data)
        print(line)
        type_data = 'school'
    else:
        # Not a section header (e.g. an empty token from a leading space):
        # skip it, otherwise `line` is undefined or gets processed twice.
        i += 1
        continue
    data_line = line.split('-')
    if len(data_line) == 2:
        # Exactly one "from - to" entry in this section.
        print(type_data)
        main_dict['type'].append(type_data)
        from_data = data_line[0].strip().split(' ')[-1]
        print(from_data)
        main_dict['from'].append(from_data)
        to_data = data_line[1].strip().split(' ')[0]
        print(to_data)
        main_dict['to'].append(to_data)
        # Drop the trailing comma; assumes the organization is a single word.
        function_data = ' '.join(data_line[1].strip().split(' ')[1:-1])[:-1].strip()
        print(function_data)
        main_dict['function'].append(function_data)
        organization_data = data_line[1].split(',')[-1].strip()
        print(organization_data)
        main_dict['organization'].append(organization_data)
    elif len(data_line) > 2:
        # Several entries (or extra hyphens): walk over adjacent pairs of pieces.
        j = 0
        while j < len(data_line):
            register_data = data_line[j:j + 2]
            if len(register_data) > 1:
                # Single-word pieces (e.g. "Expertise" from "Business-Expertise") are skipped.
                if len(register_data[0].split(' ')) > 1 and len(register_data[1].split(' ')) > 1:
                    if j == 0:
                        print(register_data)
                        print('----------')
                        print(type_data)
                        main_dict['type'].append(type_data)
                        from_data = register_data[0].strip().split(' ')[-1]
                        print(from_data)
                        main_dict['from'].append(from_data)
                        to_data = register_data[1].strip().split(' ')[0]
                        print(to_data)
                        main_dict['to'].append(to_data)
                        function_org = register_data[1].strip().split(',')
                        function_data = ' '.join(function_org[0].split(' ')[1:])
                        print(function_data)
                        main_dict['function'].append(function_data)
                        # The last token is the start year of the next entry, so drop it.
                        org_data = ' '.join(function_org[1].split(' ')[:-1]).strip()
                        print(org_data)
                        main_dict['organization'].append(org_data)
                        print('-----------')
                    else:
                        print('-----------')
                        print(register_data)
                        print(type_data)
                        main_dict['type'].append(type_data)
                        from_data = register_data[0].strip().split(' ')[-1]
                        print(from_data)
                        main_dict['from'].append(from_data)
                        to_data = register_data[1].strip().split(' ')[0]
                        print(to_data)
                        main_dict['to'].append(to_data)
                        function_org = register_data[1].strip().split(',')
                        function_data = ' '.join(function_org[0].split(' ')[1:])
                        print(function_data)
                        main_dict['function'].append(function_data)
                        org_data = ' '.join(function_org[1].split(' ')).strip()
                        print(org_data)
                        main_dict['organization'].append(org_data)
                        print('-----------')
            j += 1
df = pd.DataFrame(main_dict)
Tested.
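
As a shorter alternative to the token-index loop above, the same extraction can probably be done with two regular expressions: one that splits the text on the three section headers, and one that finds each "from - to" entry inside a section. This is only a minimal sketch, not the answer's method. The names `SECTION_TYPES`, `parse_profile` and `SAMPLE` are made up for illustration, and the sketch assumes that years have four digits, that an open-ended entry uses the literal word `present`, and that the first comma in an entry separates the function from the organization.

```python
import re

# Section headers found in the scraped text, mapped to the labels from the question.
SECTION_TYPES = {"Huidigefuncties": "current", "Loopbaan": "old", "Opleiding": "school"}

# Splitting on a capturing group keeps the headers in the result list.
SECTION_RE = re.compile(r"\b(" + "|".join(SECTION_TYPES) + r")\b")

# Assumption: a date range looks like "2015 - present" or "2012 - 2015".
DATE_RANGE = r"\d{4}\s*-\s*(?:\d{4}|present)"

# One entry: start year, end year, then everything up to the next date range
# or the end of the section.
ENTRY_RE = re.compile(
    r"\b(\d{4})\s*-\s*(\d{4}|present)\s+(.*?)(?=\s+" + DATE_RANGE + r"|\s*$)"
)


def parse_profile(text):
    """Return one dict per entry with type, from, to, function, organization."""
    rows = []
    parts = SECTION_RE.split(text)
    # parts looks like [prefix, header1, body1, header2, body2, ...]
    for header, body in zip(parts[1::2], parts[2::2]):
        for start, end, rest in ENTRY_RE.findall(body):
            # Assumption: the first comma separates function and organization.
            function, _, organization = rest.partition(",")
            rows.append({
                "type": SECTION_TYPES[header],
                "from": start,
                "to": end,
                "function": function.strip(),
                "organization": organization.strip(),
            })
    return rows


SAMPLE = (
    " Huidigefuncties Michael Jordan 2015 - present Director Marketing & "
    "Indirect Channels, Ricoh Nederland 2010 - present Basketball Center, "
    "Center for Business-Expertise Loopbaan Michael Jordan 2012 - 2015 "
    "Director Marketing & Business Development, Ricoh Opleiding Michael "
    "Jordan 1988 - 1992 Marketing , Harvard "
)

rows = parse_profile(SAMPLE)
for row in rows:
    print(row)
```

Passing the resulting list of dicts to `pd.DataFrame(rows)` should give a frame with the five columns from the question. Because entries are delimited by the date-range pattern rather than by every hyphen, a hyphen inside a name such as "Business-Expertise" does not split an entry.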