我再次希望找到解决我的编码噩梦的方法。我有一个字典term_dict
,其中术语列表为键,术语类别为值。以及包含ID和Notes列的数据框data
。任务是使用data.Notes
为每term_dict
条记录在data.ID
中找到匹配的字词。
term_dict{
Ibuprofen 800mg : Drug
Hip Replacement Surgery : Treatment
Tylenol AM : Drug
Mild Dislocation : Treatment
Advil : Drug
Fractured Tibia : Treatment
Quinone : Drug
Fever : Treatment
Penicillin 250mg : Drug
Histerectomy : Treatment
Surgical removal of bunion : Treatment
Therapy : Treatment
Bunion : Treatment
Hospita X : Location
mg : Dosage
stop : Exclusion
}
data:
ID Notes
604 Take 2 tablets of advil & 3 caps of pen
250mg twice daily
602 Stop pen but cont. with advil
as needed for the fracture
210 2 tabs of Tyl 3x daily for 5 days
607 nan
700 surgery scheduled for 01/01/2017
515 nan
019 Call my office if bunion pain persist
after 3 days
604 f/up appt. @Hospital X
到目前为止,这是我的代码:
lists = []
for s in data['Notes']:
cleanNotes = " " + " ".join(re.split(r'[^a-z 0-9]|[w/]',s.lower())) + " "
for k, v in term_dict.items():
k = " %s "%k
if k in cleanNotes and v != exclusion:
if k in cleanNotes and v == 'drug':
lists.append(k)
data['Drug'] = ':'.join(str(lists))
elif k in cleanNotes and v == 'location':
lists.append(k)
data['Location'] = ' '.join(str(lists))
elif k in cleanNotes and v == 'treatment':
lists.append(k)
data['Treatment'] = ':'.join(str(lists))
elif k in cleanNotes and v == 'dosage':
lists.append(k)
data['Dosage'] = ':'.join(str(lists))
else:
for s in data.Notes:
matches = list(datefinder.find_dates(s.lower()))
data['Date'] = ', '.join([str(dates) for dates in matches])
....而且我的输出不是预期的,因为代码只是使用来自数据帧最后一条记录的匹配来填充数据框的新列:
data:
ID Notes Drug Dosage Location Treatment Date
604 Take 2 tablets of advil & 3 caps of pen advil Hospital X
250mg twice daily
602 Stop pen but cont. with advil advil Hospital X
as needed for the fracture
210 2 tabs of Tyl 3x daily for 5 days advil
607 nan advil
700 surgery scheduled for 01/01/2017 advil
515 nan advil
019 Call my office if bunion pain persist advil
after 3 days
604 f/up appt. @Hospital X. cont w/advil advil Hospital X
***但预期输出:
data:
ID Notes Drug Dosage Location Treatment Date
604 Take 2 tablets of advil & 3 caps of pen advil:penicilin 0:250mg
250mg twice daily
602 Stop pen but cont. with advil advil fracture
as needed for the fracture
210 2 tabs of Tyl 3x daily for 5 days Tylenol
607 nan
700 surgery scheduled for 01/01/2017 surgery 01/01/2017
515 nan
019 Call my office if bunion pain persist bunion
after 3 days
604 f/up appt. @Hospital X. cont w/advil advil Hospital X
如果我能修复这个重复,我将不胜感激。谢谢!
答案 0 :(得分:0)
你错误的本质是这个。您将该列的每个元素分配给相同的值:
$model = new Translation();
if ($model->load(Yii::$app->request->post()) && $model->save()) {
$getfromtitle = Yii::$app->request->post('Translation')['translation_title'];
echo $getfromtitle;
echo "<br />";
$model->translation_drive_title = $getfromtitle;
echo $model->translation_drive_title;
echo "<br />";
echo "here";
die();
}
要解决此问题,您可以做的是预先创建空列,如下所示:
In [114]: import pandas as pd
In [115]: df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD'))
In [116]: df.head()
Out[116]:
A B C D
0 -0.896291 -0.277551 0.926559 0.522212
1 -0.265559 -1.300435 -0.079514 -1.083569
2 -0.534509 0.298264 -1.361829 0.750666
3 0.318937 -0.407164 0.080020 0.499435
4 -0.161574 -1.012471 0.631092 1.368540
In [117]: df['NewCol'] = 'something here'
In [119]: df.head()
Out[119]:
A B C D NewCol
0 -0.896291 -0.277551 0.926559 0.522212 something here
1 -0.265559 -1.300435 -0.079514 -1.083569 something here
2 -0.534509 0.298264 -1.361829 0.750666 something here
3 0.318937 -0.407164 0.080020 0.499435 something here
4 -0.161574 -1.012471 0.631092 1.368540 something here
循环注释时,使用枚举循环:
In [120]: df = pd.DataFrame(np.random.randn(50, 1), columns=['Notes'])
In [121]: df['Drug'] = ""
...: df['Location'] = ""
...: df['Treatment'] = ""
...: df['Dosage'] = ""
...:
In [122]: df.head()
Out[122]:
Notes Drug Location Treatment Dosage
0 0.325993
1 -0.561066
2 0.555040
3 0.001332
4 0.400009
然后,当需要时,只需适当地设置该单元格:
for i, s in enumerate(data['Notes']):