I have two pandas DataFrames.
DataFrame 1
Index_Col Col1 Col2 Col3 Col4 Col5
Row1 0.64 0.89 0.76 0.22 1.34
Row2 0.54 0.56 0.82 0.46 0.23
and so on.
DataFrame 2 holds a threshold for each column of DataFrame 1, given as a Min/Max range.
DataFrame 2
Column_Name Group Min Max
col1 G1 0.5 1
col2 G1 0.1 2
col3 G2 0.3 0.9
col4 G1 0.3 1
col5 G2 0.7 2
and so on
I am trying to compute value = ((value - Min)/(Max - Min))*100 for every value in each column of DataFrame 1. For example, the value for Row1 of Col1 would be ((0.64-0.5)/(1-0.5))*100.
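As a quick sanity check, the formula can be evaluated by hand for the Col1/Row1 cell. This is only a sketch of the arithmetic; the names `lo` and `hi` are placeholders for that column's Min and Max and do not appear in the data.

```python
# Worked check for Col1, Row1, using the numbers from the two tables above.
value = 0.64        # DataFrame 1, Row1 / Col1
lo, hi = 0.5, 1.0   # DataFrame 2, Min / Max for col1 (placeholder names)

scaled = (value - lo) / (hi - lo) * 100
print(round(scaled, 6))  # 28.0
```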
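I tried converting everything to lists and doing the calculation with multiple for loops, but I would like to know whether there is a simpler way.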
Answer 0 (score: 0)
import pandas as pd
import io
# Sample data
df1 = pd.read_table(io.StringIO("""
Index_Col Col1 Col2 Col3 Col4 Col5
Row1 0.64 0.89 0.76 0.22 1.34
Row2 0.54 0.56 0.82 0.46 0.23
"""), sep=r"\s+")  # delim_whitespace=True is deprecated in recent pandas
df2 = pd.read_table(io.StringIO("""
Column_Name Group Min Max
col1 G1 0.5 1
col2 G1 0.1 2
col3 G2 0.3 0.9
col4 G1 0.3 1
col5 G2 0.7 2
"""), sep=r"\s+")  # delim_whitespace=True is deprecated in recent pandas
# Melt the wide data frame so that each cell is a row
df1m = pd.melt(df1, id_vars=["Index_Col"], var_name="Col")
# Lowercase the column name to match with df2
df1m['Column_Name'] = df1m['Col'].str.lower()
# Join the melted dataframe with the thresholds in df2
df1mj = df1m.merge(df2, on="Column_Name")
# Calculate
df1mj['new_value'] = ((df1mj['value'] - df1mj['Min'])/(df1mj['Max'] - df1mj['Min']))*100
# Use pivot to reassemble the wide dataframe
result = df1mj.pivot(index="Index_Col", columns="Col", values="new_value")
Result:
Col Col1 Col2 Col3 Col4 Col5
Index_Col
Row1 28.0 41.578947 76.666667 -11.428571 49.230769
Row2 8.0 24.210526 86.666667 22.857143 -36.153846
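If the melt/merge/pivot round trip feels heavy, the same numbers can probably be obtained by lining the thresholds up with df1's columns and letting pandas broadcast. The following is a minimal sketch, not part of the answer above. It assumes that Index_Col is loaded as the index, that every column of df1 has exactly one row in df2, and that lowercasing df1's column names is enough to match Column_Name. The names `limits`, `keys`, `lo`, `hi` and `scaled` are made up for illustration.

```python
import io

import pandas as pd

# Same sample data as above, but with Index_Col loaded as the index
df1 = pd.read_csv(io.StringIO("""
Index_Col Col1 Col2 Col3 Col4 Col5
Row1 0.64 0.89 0.76 0.22 1.34
Row2 0.54 0.56 0.82 0.46 0.23
"""), sep=r"\s+", index_col="Index_Col")

df2 = pd.read_csv(io.StringIO("""
Column_Name Group Min Max
col1 G1 0.5 1
col2 G1 0.1 2
col3 G2 0.3 0.9
col4 G1 0.3 1
col5 G2 0.7 2
"""), sep=r"\s+")

# Index the thresholds by column name so they can be looked up per column
limits = df2.set_index("Column_Name")

# Fetch Min/Max in the same order as df1's columns (lowercased to match df2);
# .loc raises a KeyError if a column has no threshold row
keys = df1.columns.str.lower()
lo = limits.loc[keys, "Min"].to_numpy()
hi = limits.loc[keys, "Max"].to_numpy()

# Broadcasting applies each column's own Min/Max to every row of that column
scaled = (df1 - lo) / (hi - lo) * 100
print(scaled.round(6))
```

This keeps df1's original row and column order and avoids reshaping. The melt/merge version is easier to extend if the Group column is needed later.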