I created a DataFrame in Python:
import pandas as pd
d = {'col1': ["day1", "7:00", "8:00","9:00", "10:00", "11:00",
"day2", "7:00", "8:00","9:00", "10:00", "11:00",
"day3", "7:00", "8:00","9:00", "10:00", "11:00"],
'col2': [0, 4.1, 3, 3.5, 45.1, 16.9,
0, 6.5, 4, 9.8, 33.9, 19.8,
0, 6.9, 2.5, 7, 81.1, 13.8]}
df = pd.DataFrame(data=d)
print(df)
col1 col2
0 day1 0.0
1 7:00 4.1
2 8:00 3.0
3 9:00 3.5
4 10:00 45.1
5 11:00 16.9
6 day2 0.0
7 7:00 6.5
8 8:00 4.0
9 9:00 9.8
10 10:00 33.9
11 11:00 19.8
12 day3 0.0
13 7:00 6.9
14 8:00 2.5
15 9:00 7.0
16 10:00 81.1
17 11:00 13.8
I want to replace every time value in col1 with the day it belongs to, for example:
col1 col2
0 day1 0.0
1 day1 4.1
2 day1 3.0
3 day1 3.5
4 day1 45.1
5 day1 16.9
6 day2 0.0
7 day2 6.5
8 day2 4.0
9 day2 9.8
10 day2 33.9
11 day2 19.8
12 day3 0.0
13 day3 6.9
14 day3 2.5
15 day3 7.0
16 day3 81.1
17 day3 13.8
This is just a sample dataset, so I'm hoping for a short solution that also works at scale, for example on a dataset covering 1000 days.
Answer 0 (score: 3)
Try discarding the timestamps and then forward-filling:
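import numpy as np

# Remove timestamps: every row that does not start with 'day' becomes NaN
discard_mask = ~df.col1.str.startswith('day')
df.loc[discard_mask, "col1"] = np.nan
# Forward fill: each NaN takes the most recent day label above it.
# ffill() returns a new DataFrame, so assign the result back.
df = df.ffill()
print(df)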
# col1 col2
# 0 day1 0.0
# 1 day1 4.1
# 2 day1 3.0
# 3 day1 3.5
# 4 day1 45.1
# 5 day1 16.9
# 6 day2 0.0
# 7 day2 6.5
# 8 day2 4.0
# 9 day2 9.8
# 10 day2 33.9
# 11 day2 19.8
# 12 day3 0.0
# 13 day3 6.9
# 14 day3 2.5
# 15 day3 7.0
# 16 day3 81.1
# 17 day3 13.8
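Since the question asks about roughly 1000 days, here is a minimal sketch of the same mask-and-forward-fill idea wrapped in a function and run on synthetic data. The function name `fill_day_labels`, its parameters, and the generated dataset are assumptions made for illustration and are not part of the original answer.

```python
import pandas as pd


def fill_day_labels(frame, column="col1", prefix="day"):
    """Return a copy in which every non-day entry in `column`
    is replaced by the most recent day label above it."""
    out = frame.copy()
    is_day = out[column].str.startswith(prefix)
    # Series.where keeps values where the mask is True and puts NaN elsewhere
    out[column] = out[column].where(is_day).ffill()
    return out


# Synthetic data (made up): 1000 day rows, each followed by five time rows
times = ["7:00", "8:00", "9:00", "10:00", "11:00"]
labels = []
for i in range(1, 1001):
    labels.append(f"day{i}")
    labels.extend(times)
big = pd.DataFrame({"col1": labels, "col2": range(len(labels))})

result = fill_day_labels(big)
print(result["col1"].nunique())  # number of distinct day labels
print(result.tail(3))
```

Both the mask and the forward fill are vectorized, so no Python-level loop over rows is needed. The function works on a copy, so the input frame is left unchanged.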
Answer 1 (score: 3)
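# Keep only purely alphanumeric entries ("day1"); times such as "7:00" contain ':'
# and become NaN, which ffill() then replaces with the day label above.
df.col1 = df.col1.where(df.col1.str.isalnum()).ffill()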
df
Out[242]:
col1 col2
0 day1 0.0
1 day1 4.1
2 day1 3.0
3 day1 3.5
4 day1 45.1
5 day1 16.9
6 day2 0.0
7 day2 6.5
8 day2 4.0
9 day2 9.8
10 day2 33.9
11 day2 19.8
12 day3 0.0
13 day3 6.9
14 day3 2.5
15 day3 7.0
16 day3 81.1
17 day3 13.8
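The `isalnum()` check works here because every time string contains a colon. If the day labels themselves contained a space or other punctuation, they would be masked out as well. A hedged variant, assuming the labels always look like "day" followed by digits, matches that pattern explicitly with `str.fullmatch` (available in pandas 1.1 and later):

```python
import pandas as pd

# Shortened version of the question's data
df = pd.DataFrame({
    "col1": ["day1", "7:00", "8:00", "day2", "7:00", "8:00"],
    "col2": [0, 4.1, 3, 0, 6.5, 4],
})

# Assumed label format: "day" followed by one or more digits
is_day = df["col1"].str.fullmatch(r"day\d+")
df["col1"] = df["col1"].where(is_day).ffill()
print(df["col1"].tolist())
# ['day1', 'day1', 'day1', 'day2', 'day2', 'day2']
```

The pattern would need adjusting if the real labels follow a different format.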