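I am looking for help finding good logic for adding a new column to a Pandas df by applying some conditions.
Column "O" (min_played) should be created based on the following conditions.
Example:
Player "Antonio M." has no value in column "N" (subs) => min_played = time (94.54 = 94.54)
Player "Bowen J." in column "J" (name) was replaced by "Anderson F."
"Bowen J." min_played = 89 (value taken from column "N" (subs))
"Anderson F." min_played = 94.54 (value taken from column "G" (time)) minus 89 (value taken from column "N" (subs)) => total minutes = 5.54, and this value should be written to row 13 of the "min_played" column.
Why row 13: because that is where his name is (row 13, column "J" (name)).
I have to repeat this process for every round (column "B", match type).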
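The arithmetic in the example can be written out as a small sketch. The variable names are only illustrative; the numbers are the ones quoted above.

```python
# Values quoted in the example above
full_time = 94.54    # column "G" (time)
sub_minute = 89.0    # column "N" (subs) for "Bowen J."

# The replaced starter keeps the minutes up to the substitution
bowen_min_played = sub_minute

# The substitute gets whatever was left of the match;
# rounding hides floating-point noise in the subtraction
anderson_min_played = round(full_time - sub_minute, 2)

print(bowen_min_played, anderson_min_played)
```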
import numpy as np

# Convert End Time to float
def convert_to_float(x):
    # Drop spaces, turn ':' into '.', then add up the parts around '+'
    cleaned = x.replace(' ', '').replace(':', '.')
    return sum(float(part) for part in cleaned.split('+'))

df['time'] = df['time'].apply(convert_to_float)

# Convert Sub-Out Time to Float
def min_played(x):
    try:
        # The first token of 'subs' is the minute, followed by an apostrophe
        minute = x.split(" ")[0].replace("'", "")
        return convert_to_float(minute)
    except (AttributeError, ValueError):
        # Missing (NaN) or unparsable 'subs' value
        return np.nan

df['min_played'] = df['subs'].apply(min_played)
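# Fill in the rows that the substitution minute does not cover
for indx, status in df['status'].items():
    # Starter with no 'subs' entry (NaN, i.e. not a string): played the whole match
    if status == 'line-up' and not isinstance(df.loc[indx, 'subs'], str):
        df.loc[indx, 'min_played'] = df.loc[indx, 'time']
    # Neither a starter nor a substitute: did not play
    if status not in ('line-up', 'sub'):
        df.loc[indx, 'min_played'] = 0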
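# Minutes left for the substitute: full time minus the starter's minutes
filtr = (df['status'] == 'line-up')
df.loc[filtr, 'sub_min_played'] = df.loc[filtr, 'time'] - df.loc[filtr, 'min_played']

# Rows that are neither starters nor substitutes get 0
filtr = (df['status'] != 'line-up') & (df['status'] != 'sub')
df.loc[filtr, 'sub_min_played'] = 0

# Strip the captain marker from player names
df['name'] = df['name'].apply(lambda x: x.replace(" (C)", ""))
df.to_csv('q.csv')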
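For comparison, the row-by-row loop above could also be expressed without a loop. The following is a minimal sketch, not a tested solution for the full data. The sample rows are made up, the "coach" status is a hypothetical placeholder for any status other than 'line-up' or 'sub', and the "89' Anderson F." layout of the 'subs' column is an assumption based on how the code above parses it.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; only the column names come from the code above
df = pd.DataFrame({
    "name": ["Antonio M.", "Bowen J.", "Anderson F.", "Coach X"],
    "status": ["line-up", "line-up", "sub", "coach"],
    "subs": [np.nan, "89' Anderson F.", np.nan, np.nan],
    "time": [94.54, 94.54, 94.54, 94.54],
})

# Pull the leading minute out of 'subs'; an optional "+N" part
# (stoppage time) is captured separately and added on
parts = df["subs"].str.extract(r"^(\d+)(?:\+(\d+))?")
sub_minute = parts.astype(float).sum(axis=1, min_count=1)

is_lineup = df["status"].eq("line-up")
is_other = ~df["status"].isin(["line-up", "sub"])

# Same rules as the loop: full time for starters who were not replaced,
# 0 for rows that are neither starters nor substitutes,
# otherwise the parsed substitution minute (NaN when there is none)
df["min_played"] = np.select(
    [is_lineup & sub_minute.isna(), is_other],
    [df["time"], 0.0],
    default=sub_minute,
)
print(df[["name", "status", "min_played"]])
```

Substitutes still end up with NaN here, just as in the loop version, because their minutes have to come from the row of the player they replaced.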
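Answer 0 (score: 0)
Depending on your use case, there may be a few edge cases in the full data that still need handling, but this should be a good starting point.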
import numpy as np
import pandas as pd

df = pd.read_csv('SOSample.csv')

def convert_to_float(x):
    # Drop spaces, turn ':' into '.', then add up the parts around '+'
    cleaned = x.replace(' ', '').replace(':', '.')
    return sum(float(part) for part in cleaned.split('+'))

df['time'] = df['time'].apply(convert_to_float)

def min_played(time, subs, status):
    if status == 'line-up':
        if isinstance(subs, str):
            # Everything before the apostrophe is the substitution minute
            t = subs.split("'")[0]
            # eval to handle cases like `(90+3)`
            # eval("90+3") = 93
            # Note: eval runs arbitrary code, so only use it on trusted data
            return eval(t)
        else:
            # Starter who was never replaced: played the full time
            return time
    return np.nan

def sub_min_played(time, status, min_played):
    # Minutes left over for the substitute; NaN when nobody came on
    if time != min_played:
        return time - min_played
    return np.nan

df['min_played'] = df.apply(lambda x: min_played(x.time, x.subs, x.status), axis=1)
df['sub_min_played'] = df.apply(lambda x: sub_min_played(x.time, x.status, x.min_played), axis=1)
df
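One step the code above leaves open is writing the leftover minutes into the substitute's own row (row 13 in the question). A possible sketch follows, under several assumptions that are not confirmed by the sample file: 'round' is a hypothetical name for column "B", the 'subs' value looks like "89' Anderson F." (minute, apostrophe, substitute's name), the frame has a default 0..n-1 index, and each substitute comes on at most once per round. The sample rows are made up.

```python
import numpy as np
import pandas as pd

# Made-up sample; 'round' is a hypothetical column name for column "B"
df = pd.DataFrame({
    "round": ["Round 1"] * 3,
    "name": ["Antonio M.", "Bowen J.", "Anderson F."],
    "status": ["line-up", "line-up", "sub"],
    "subs": [np.nan, "89' Anderson F.", np.nan],
    "time": [94.54, 94.54, 94.54],
    "min_played": [94.54, 89.0, np.nan],
})

# Starters who were replaced: take the substitute's name from 'subs'
# (assumed to follow the apostrophe) and compute the leftover minutes
replaced = df.dropna(subset=["subs"]).copy()
replaced["sub_name"] = replaced["subs"].str.split("'", n=1).str[1].str.strip()
replaced["sub_minutes"] = (replaced["time"] - replaced["min_played"]).round(2)

# Match on round AND name so the same player in another round is not mixed up
credit = replaced[["round", "sub_name", "sub_minutes"]].rename(
    columns={"sub_name": "name"}
)
merged = df.merge(credit, on=["round", "name"], how="left")

# Only fill rows that have no minutes yet (the substitutes)
df["min_played"] = df["min_played"].fillna(merged["sub_minutes"])
print(df[["round", "name", "min_played"]])
```

A left merge keeps the rows of `df` in their original order, which is why the result can be aligned back by position. If a substitute could appear more than once per round, the merge would duplicate rows and this alignment would no longer hold.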