I have 2 CSV files. The first one looks like this:
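ID , Exersice
1 , 1.1
1 , 1.2
3 , 1.4
.
.
It contains only the students' IDs and the exercises they have done. The second file contains, for each ID, a grade per exercise:
ID , 1.1 , 1.2 , 1.3 ...
1 , 5 , 9 , 8 ...
3 , 4 , 10 , 6 ...
.
.
So how can I map the grades from the second file onto the rows of the first?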
Answer 0 (score: 1)
Answer 1 (score: 0)
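Use DataFrame.set_index and DataFrame.stack to create a Series with a MultiIndex, converting the column labels to floats (here with rename(columns=float)) so that they match the values in Exersice, and finally use DataFrame.join: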
s = df2.set_index('ID').rename(columns=float).stack().rename('grade')
df = df1.join(s, on=['ID','Exersice'])
print (df)
   ID  Exersice  grade
0   1       1.1    5.0
1   1       1.2    9.0
2   3       1.4    NaN
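To try the join approach without the CSV files, here is a minimal self-contained sketch. The two frames are hypothetical sample data modelled on the question (the real files may differ, for example in whitespace around the column names), and the string column labels in df2 are an assumption about how read_csv would load the header.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's two files.
df1 = pd.DataFrame({"ID": [1, 1, 3], "Exersice": [1.1, 1.2, 1.4]})
df2 = pd.DataFrame({"ID": [1, 3], "1.1": [5, 4], "1.2": [9, 10], "1.3": [8, 6]})

# Wide -> long: index by ID, turn the string column labels into floats,
# then stack so that each (ID, exercise) pair becomes one entry of a Series.
s = df2.set_index("ID").rename(columns=float).stack().rename("grade")

# Look up each (ID, Exersice) pair of df1 in the MultiIndex of s.
# Pairs with no grade in df2 (here ID 3, exercise 1.4) come back as NaN.
df = df1.join(s, on=["ID", "Exersice"])
print(df)
```

The rename(columns=float) step matters because the join compares labels by value: the string '1.1' would not match the float 1.1 stored in Exersice.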
Another, similar solution using DataFrame.melt and DataFrame.merge:
df3 = df2.melt('ID', var_name='Exersice', value_name='new')
df3['Exersice'] = df3['Exersice'].astype(float)
df = df1.merge(df3, on=['ID','Exersice'], how='left')
print (df)
   ID  Exersice  new
0   1       1.1  5.0
1   1       1.2  9.0
2   3       1.4  NaN
Answer 2 (score: 0)
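One approach is to create a grades column in the first DataFrame by looking up values from the second table.
Here the ID column of the second table is set as the index to simplify the lookup. Also, the column labels of the second table are strings, so each Exersice value from the first table is converted to a string before it is used as a column label.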
import pandas as pd
df_exercises = pd.read_csv("student_exercises.csv")
df_grades = pd.read_csv("student_grade.csv")
df_grades.set_index("ID", inplace=True)
df_exercises['grades'] = df_exercises.apply(lambda x: df_grades.loc[x.ID, str(x.Exersice)], axis=1)
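A direct .loc lookup like the one above raises a KeyError when an (ID, exercise) pair has no matching row or column in the grades table. The following is a minimal sketch of a more tolerant variant, not the answer author's code: lookup_grade is a hypothetical helper, and the frames are hypothetical sample data standing in for the two CSV files.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the grade column labels are strings,
# as they would be after pd.read_csv.
df_exercises = pd.DataFrame({"ID": [1, 1, 3], "Exersice": [1.1, 1.2, 1.4]})
df_grades = pd.DataFrame(
    {"ID": [1, 3], "1.1": [5, 4], "1.2": [9, 10], "1.3": [8, 6]}
).set_index("ID")

def lookup_grade(row):
    """Return the grade for one (ID, Exersice) row, or NaN if there is none."""
    # apply(axis=1) upcasts the row to float, so cast the ID back to int.
    student = int(row["ID"])
    exercise = str(row["Exersice"])
    if student in df_grades.index and exercise in df_grades.columns:
        return df_grades.at[student, exercise]
    return np.nan

df_exercises["grades"] = df_exercises.apply(lookup_grade, axis=1)
print(df_exercises)
```

Row-wise apply is easy to read but slow on large tables; for big files a join or merge on the reshaped grades table is usually the better choice.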