The following code works when I load the data from CSV files, but when I load it from text files it fails with this error:
TypeError: ("cannot do slice indexing with these indexers [4]", 'occurred at index 1')
Here is my code:
import pandas as pd
#Reading Input File and converting into dataframe
input_data=pd.read_csv("C:\\Users\\hp\\Desktop\\py\\Input_Data.txt",delimiter =',')
input_data_df=pd.DataFrame(input_data)
#Reading Reference File and converting into dataframe
reference_data=pd.read_csv("C:\\Users\\hp\\Desktop\\py\\Reference_Data.txt",delimiter =',')
reference_data_df=pd.DataFrame(reference_data)
#Merging files based on unique Columns
Input_Reference_merge= pd.merge(input_data_df, reference_data_df, on=['emp_id', 'emp_name'])
print(Input_Reference_merge)
# Get the index where jan starts
months_index_start = input_data_df.columns.get_loc("jan")
# Calculate the total salary for each row according to the months_worked column
Input_Reference_merge["total_sal"] = Input_Reference_merge.apply(lambda x : x[months_index_start : months_index_start + x["months_worked"]].sum(), axis = 1)
print(Input_Reference_merge)
Below is the dataset I am using.
Input data file:
   emp_id emp_name  months_worked  total_sal   jan  feb      mar  apr  may  \
0       1      aaa              4        NaN  2000    1      2.0    3  4.0
1       2      bbb              3        NaN     1    2      NaN    4  5.0
2       3      bbb              7        NaN     1    2  34343.0    4  NaN
3       4      bbb             12        NaN     1    2  33434.0    4  5.0

      jun  jul      aug   sep   oct      nov   dec
0  5555.0  NaN  74343.0     8     9     10.0   NaN
1   643.0  7.0      NaN  9343    10  13431.0  12.0
2  6343.0  7.0      NaN     9  1043     11.0  12.0
3     NaN  7.0      8.0     9     1      NaN  12.0
Reference data file:
   emp_id emp_name  months_worked
0       1      aaa              4
1       2      bbb              3
2       3      bbb              7
3       4      bbb             12
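
For reference, here is a minimal sketch of the per-row calculation I am trying to get, written with an explicitly positional slice (`.iloc`) instead of `x[a:b]`. It is not a confirmed fix for the error above. It assumes the month columns are contiguous from `jan` to `dec`, that `months_worked` is a whole number, and that the rows are the ones from the input data shown above. The merge step is left out, and `total_salary` is just a hypothetical helper name.

```python
import numpy as np
import pandas as pd

nan = np.nan
months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# Rows copied from the input data above (the empty total_sal column is omitted).
rows = [
    [1, "aaa", 4, 2000, 1, 2.0, 3, 4.0, 5555.0, nan, 74343.0, 8, 9, 10.0, nan],
    [2, "bbb", 3, 1, 2, nan, 4, 5.0, 643.0, 7.0, nan, 9343, 10, 13431.0, 12.0],
    [3, "bbb", 7, 1, 2, 34343.0, 4, nan, 6343.0, 7.0, nan, 9, 1043, 11.0, 12.0],
    [4, "bbb", 12, 1, 2, 33434.0, 4, 5.0, nan, 7.0, 8.0, 9, 1, nan, 12.0],
]
df = pd.DataFrame(rows, columns=["emp_id", "emp_name", "months_worked"] + months)

# Position of "jan" in the same frame that apply() iterates over.
start = df.columns.get_loc("jan")


def total_salary(row):
    # Assumption: months_worked is a whole number; cast in case it was
    # parsed as a float or a string.
    n = int(row["months_worked"])
    # .iloc slices by position; astype(float) avoids summing an object-dtype
    # row, and sum() skips NaN by default.
    return row.iloc[start:start + n].astype(float).sum()


df["total_sal"] = df.apply(total_salary, axis=1)
print(df[["emp_id", "months_worked", "total_sal"]])
```

If the real frame comes out of `pd.merge`, `start` would presumably need to be looked up on the merged frame's columns rather than on the input frame, since the column positions can differ after a merge.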