Suppose I have a pandas DataFrame df:
userid    subcategory            timestamp                 smartexpenseid                                      companyid
20648196  SmartExpense Declined  2016-03-06T16:44:55.702Z  11771712||91164585||||                              9797
43124398  SmartExpense Declined  2016-03-06T17:09:06.033Z  11111111|249178181?CARRT?266298850196|93461910||||  63177
76764125  SmartExpense Declined  2016-03-06T19:44:19.078Z  137177|250155900?HOTEL?270593373724|92826286||||    199412
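I want to split the smartexpenseid column into separate columns in the same DataFrame. For example, 11111111|249178181?CARRT?266298850196|93461910|||| should map to "CctKey | TripId?SegType?SegId | EreceiptId | PctKey | MeKey | RcKey | CapKey".
Can someone suggest the best way to do this in Python?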
Answer 0 (score: 1)
Try this regular expression:
(?<CctKey>\d+)\|(?<TripId>\d*)\??(?<SegType>[^?]*)\??(?<SegId>\d*)\|(?<EreceiptId>\d+)\|(?<PctKey>[^|]*)\|(?<MeKey>[^|]*)\|(?<RcKey>[^|]*)\|(?<CapKey>[^|\n\s]*)
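Python's re module does not accept the (?<name>...) form; it writes named groups as (?P<name>...). Either rename the groups that way or, for plain Python syntax, remove all the ?<name> group names: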
(\d+)\|(\d*)\??([^?]*)\??(\d*)\|(\d+)\|([^|]*)\|([^|]*)\|([^|]*)\|([^|\n\s]*)
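To get the pieces into the same DataFrame, one option is pandas' Series.str.extract, which returns one column per capture group and uses the group names as column names. The following is a minimal sketch, assuming the sample rows from the question and the pattern above rewritten with Python's (?P<name>...) named groups; the variable names PATTERN, parts and result are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "userid": [20648196, 43124398, 76764125],
    "subcategory": ["SmartExpense Declined"] * 3,
    "timestamp": [
        "2016-03-06T16:44:55.702Z",
        "2016-03-06T17:09:06.033Z",
        "2016-03-06T19:44:19.078Z",
    ],
    "smartexpenseid": [
        "11771712||91164585||||",
        "11111111|249178181?CARRT?266298850196|93461910||||",
        "137177|250155900?HOTEL?270593373724|92826286||||",
    ],
    "companyid": [9797, 63177, 199412],
})

# The answer's pattern, with each group written in Python's (?P<name>...) form.
PATTERN = (
    r"(?P<CctKey>\d+)\|"
    r"(?P<TripId>\d*)\??(?P<SegType>[^?]*)\??(?P<SegId>\d*)\|"
    r"(?P<EreceiptId>\d+)\|"
    r"(?P<PctKey>[^|]*)\|"
    r"(?P<MeKey>[^|]*)\|"
    r"(?P<RcKey>[^|]*)\|"
    r"(?P<CapKey>[^|\n\s]*)"
)

# str.extract yields one column per named group, aligned on df's index.
parts = df["smartexpenseid"].str.extract(PATTERN)

# Join the new columns onto the original frame.
result = df.join(parts)

print(result[["userid", "CctKey", "TripId", "SegType", "SegId", "EreceiptId"]])
```

The extracted values are strings, so convert any columns you need as numbers afterwards. If every value is known to have exactly seven pipe-separated fields, splitting on "|" and then splitting the second field on "?" may be a simpler alternative to the regex.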