I have a dataset that looks like the sample shown further below (columns Cus_ID, Event and Day).
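I am trying to parse this data and split it into new rows wherever a specific delimiter occurs. The delimiter is '~'. In other words, the Event and Day columns should be split into new columns named EventSplit and Day split.
I have written the code below, but I don't know how to handle both columns in one pass. Can anybody help?
This is method no. 1 that I tried (the pre function shown after the sample data and expected output):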
Cus_ID Event Day
1 Event1~Event2~Event3~Event4 1~1~1~1
2 Event3~Event4~Event5~Event6 1~2~3~4
The output I'm trying to get would be:
Cus_ID | Event | Day | EventSplit|Day split
----------------------------------------------------------------------------
1 | Event1~Event2~Event3~Event4| 1~1~1~1 | Event1 |1
1 | Event1~Event2~Event3~Event4| 1~1~1~1 | Event2 |1
1 | Event1~Event2~Event3~Event4| 1~1~1~1 | Event3 |1
1 | Event1~Event2~Event3~Event4| 1~1~1~1 | Event4 |1
2 | Event3~Event4~Event5~Event6| 1~2~3~4 | Event3 |1
2 | Event3~Event4~Event5~Event6| 1~2~3~4 | Event4 |2
2 | Event3~Event4~Event5~Event6| 1~2~3~4 | Event5 |3
2 | Event3~Event4~Event5~Event6| 1~2~3~4 | Event6 |4
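import pandas as pd
import numpy as np

data = pd.read_csv("SeqData.csv")

def pre(data, c):
    # Split each '~'-delimited string in column c into a list of tokens.
    event_col = data[c].str.split('~')
    clst = event_col.values.tolist()
    # Number of tokens per row, used to repeat the original index.
    lens = [len(l) for l in clst]
    # One row per token, indexed by the row it came from.
    EventSplit = pd.DataFrame({c: np.concatenate(clst)}, index=data.index.repeat(lens))
    # Replace the original column with the split one.
    return data.drop(columns=c).join(EventSplit).reset_index(drop=True)

Data_df = pre(data, 'Event')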
Answer 0 (score: 0)
This is an unnesting problem, but slightly different, because you need to unnest two columns at the same time.
# df is the DataFrame from the question; df1 keeps the original '~'-delimited columns.
df1 = df.copy()
df.Event = df.Event.str.split('~')
df.Day = df.Day.str.split('~')
Tdf = pd.DataFrame({
    'Cus_ID': df['Cus_ID'].repeat(df.Event.str.len()),
    'EventSplit': np.concatenate(df.Event.tolist()),
    'DaySplit': np.concatenate(df.Day.tolist()),
}).merge(df1, on='Cus_ID')
Tdf
Out[661]:
Cus_ID DaySplit EventSplit Event Day
0 1 1 Event1 Event1~Event2~Event3~Event4 1~1~1~1
1 1 1 Event2 Event1~Event2~Event3~Event4 1~1~1~1
2 1 1 Event3 Event1~Event2~Event3~Event4 1~1~1~1
3 1 1 Event4 Event1~Event2~Event3~Event4 1~1~1~1
4 2 1 Event3 Event3~Event4~Event5~Event6 1~2~3~4
5 2 2 Event4 Event3~Event4~Event5~Event6 1~2~3~4
6 2 3 Event5 Event3~Event4~Event5~Event6 1~2~3~4
7 2 4 Event6 Event3~Event4~Event5~Event6 1~2~3~4
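If you are on a newer pandas, the same unnesting of two columns at once can probably be done with DataFrame.explode instead of np.concatenate plus merge. This is a different technique from the code above, shown only as a minimal sketch. It assumes pandas 1.3 or later (where explode accepts a list of columns) and that Event and Day hold the same number of '~'-separated items in every row. The helper name split_both and the DaySplit column name are made up for illustration, and the sample frame is typed in from the question rather than read from SeqData.csv.

```python
import pandas as pd

# Sample data typed in from the question (stands in for SeqData.csv).
df = pd.DataFrame({
    "Cus_ID": [1, 2],
    "Event": ["Event1~Event2~Event3~Event4", "Event3~Event4~Event5~Event6"],
    "Day": ["1~1~1~1", "1~2~3~4"],
})

def split_both(frame, cols=("Event", "Day"), sep="~"):
    """Hypothetical helper: add '<col>Split' columns and emit one row per token."""
    out = frame.copy()
    split_names = [f"{c}Split" for c in cols]
    for c, name in zip(cols, split_names):
        # astype(str) guards against a Day column that was parsed as numbers.
        out[name] = out[c].astype(str).str.split(sep)
    # Multi-column explode (pandas >= 1.3) raises ValueError if the lists
    # in a row do not all have the same length.
    return out.explode(split_names, ignore_index=True)

result = split_both(df)
print(result)
```

The original Event and Day columns are kept as they are, so no merge back on Cus_ID is needed, and the approach does not rely on Cus_ID being unique. The split values stay strings; convert DaySplit with astype(int) afterwards if you need numbers.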