Column1  Column2
Start1   633
End.     855
Start2   767
Start3   231
End.     545
Start4   111
Start5   243
End.     333
Expected output
Column1  Column2
Start1   633
End.     855
Start3   231
End.     545
Start5   243
End.     333
In Column1, the Start2 row is dropped because it has no matching End; the same applies to Start4.
Answer 0: (score: 0)
You can do this with cumsum and groupby.
Input dataframe:
import pandas as pd
df = pd.DataFrame({'Column1': ['Start1', 'End.', 'Start2', 'Start3', 'End.', 'Start4', 'Start5', 'End.'],
                   'Column2': [633, 855, 767, 231, 545, 111, 243, 333]})
df
Output:
Column1 Column2
0 Start1 633
1 End. 855
2 Start2 767
3 Start3 231
4 End. 545
5 Start4 111
6 Start5 243
7 End. 333
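This answer names cumsum and groupby as the technique. One way the two could be combined is sketched below, as an illustration rather than the answerer's exact code. It assumes every block opens with a value beginning with "Start" and that a Start with no End forms a block of size one. The names is_start, block_id, block_size and result are made up for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    'Column1': ['Start1', 'End.', 'Start2', 'Start3', 'End.', 'Start4', 'Start5', 'End.'],
    'Column2': [633, 855, 767, 231, 545, 111, 243, 333],
})

# True on every row that opens a new block (assumption: such rows begin with "Start")
is_start = df['Column1'].str.startswith('Start')

# Running count of Start rows: each Start and the rows after it share one block id
block_id = is_start.cumsum().rename('block')

# Number of rows in the block that each row belongs to
block_size = df.groupby(block_id)['Column1'].transform('size')

# A block of size 1 is a Start with no End after it, so keep only larger blocks
result = df[block_size > 1]
print(result)
```

The original index labels are kept, so the dropped rows (2 and 5) remain visible as gaps; call reset_index(drop=True) on result if a contiguous index is preferred.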
Answer 1: (score: 0)
This is longer than the previous answer, but I believe it is easier to understand:
In [1]:
import pandas as pd
## Create the Dataframe
cols = ['Column1', 'Column2']
data = [['Start1', 633],['End', 855],['Start2', 767],['Start3', 231],
['End', 545],['Start4', 111],['Start5', 243],['End', 333]]
df = pd.DataFrame(data=data, columns=cols)
df
Out [1]:
Column1 Column2
0 Start1 633
1 End 855
2 Start2 767
3 Start3 231
4 End 545
5 Start4 111
6 Start5 243
7 End 333
Here I loop over the rows and drop a row if it contains Start and the row after it also contains Start.
In [2]:
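dropped_idx = []
# Compare each row with the one after it (the last row has no successor)
for i in range(len(df) - 1):
    row = df.iloc[i, 0]
    next_row = df.iloc[i + 1, 0]
    # A Start immediately followed by another Start has no matching End
    if 'Start' in row and 'Start' in next_row:
        dropped_idx.append(df.index[i])
df.drop(index=dropped_idx, inplace=True)
df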
Out [2]:
Column1 Column2
0 Start1 633
1 End 855
3 Start3 231
4 End 545
6 Start5 243
7 End 333
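The same "Start followed by another Start" rule can also be written without an explicit loop by comparing each row with a shifted copy of the column. This is a sketch only, under the same assumption as the loop above (a row counts as a Start if its Column1 value contains "Start"). The names is_start, next_is_start and filtered are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [['Start1', 633], ['End', 855], ['Start2', 767], ['Start3', 231],
     ['End', 545], ['Start4', 111], ['Start5', 243], ['End', 333]],
    columns=['Column1', 'Column2'],
)

# True where the row itself is a Start
is_start = df['Column1'].str.contains('Start')

# shift(-1) lines each row up with the row after it;
# the last row has no successor, so it is filled with False
next_is_start = is_start.shift(-1, fill_value=False)

# Drop rows that are a Start immediately followed by another Start
filtered = df[~(is_start & next_is_start)]
print(filtered)
```

Unlike the loop, this returns a new frame and leaves df unchanged, which tends to be easier to reason about than inplace=True.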