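I have a CSV that I am reading into a DataFrame. I then use a Series to modify a specific column of the CSV. This column contains a date and a time, and I basically want to remove the time from it. The column looks like this: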
0 7/28/2015 14:31
1 7/28/2015 8:13
2 7/28/2015 16:16
3 7/28/2015 16:18
4 7/27/2015 9:54
5 7/27/2015 9:52
I split the column:
s = df['Work Info Date'].str.split(' ')
0 [7/28/2015, 14:31]
1 [7/28/2015, 8:13]
2 [7/28/2015, 16:16]
3 [7/28/2015, 16:18]
4 [7/27/2015, 9:54]
5 [7/27/2015, 9:52]
When I try to use del to delete the time element, it deletes the row at that index instead:
del s[1]
0 [7/28/2015, 14:31]
2 [7/28/2015, 16:16]
3 [7/28/2015, 16:18]
4 [7/27/2015, 9:54]
5 [7/27/2015, 9:52]
My end goal is to remove the time from this column and join the result back to the spreadsheet:
0 7/28/2015
1 7/28/2015
2 7/28/2015
3 7/28/2015
4 7/27/2015
5 7/27/2015
The spreadsheet:
Incident ID,Submitter,Time Spent,Work Info Date
INC000004294045,Bob,,7/28/2015 14:31
INC000004301664,Janice,,7/28/2015 8:13
INC000004301813,Robert,,7/28/2015 16:16
INC000004301813,Alex,,7/28/2015 16:18
Code:
import pandas as pd
import numpy as np
df = pd.read_csv('output2.csv', encoding = 'utf-8')
s = df['Work Info Date'].str.split(' ')
s.name = 'Work Info Date'
del s[1]
s
#del df['Work Info Date']
#df.join(s)
#time_report = pd.pivot_table(df, index=["Submitter", "Work Info Date"], values=["Time Spent"], aggfunc = [np.sum], fill_value=0)
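Answer 0 (score: 1)
You can apply .str a second time to get vectorized access to the split lists and pick out the first element (the date) of each one: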
>>> df["Work Info Date"].str.split()
0 [7/28/2015, 14:31]
1 [7/28/2015, 8:13]
2 [7/28/2015, 16:16]
3 [7/28/2015, 16:18]
dtype: object
>>> df["Work Info Date"].str.split().str[0]
0 7/28/2015
1 7/28/2015
2 7/28/2015
3 7/28/2015
dtype: object
>>> df["Just_the_Date"] = df["Work Info Date"].str.split().str[0]
>>> df
       Incident ID  Submitter  Time Spent   Work Info Date  Just_the_Date
0  INC000004294045        Bob         NaN  7/28/2015 14:31      7/28/2015
1  INC000004301664     Janice         NaN   7/28/2015 8:13      7/28/2015
2  INC000004301813     Robert         NaN  7/28/2015 16:16      7/28/2015
3  INC000004301813       Alex         NaN  7/28/2015 16:18      7/28/2015
You may want to convert the dates to a real date column rather than leaving them as strings, but that is up to you.
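If you do want a real date column, the following is a minimal sketch of one way to do it, not the only one. It assumes every value matches the format '%m/%d/%Y %H:%M', and it rebuilds the sample spreadsheet from an inline string so that it runs on its own. pd.to_datetime parses the strings and .dt.normalize() resets the time part to midnight while keeping a datetime64 dtype.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so the sketch is self-contained.
csv_text = """Incident ID,Submitter,Time Spent,Work Info Date
INC000004294045,Bob,,7/28/2015 14:31
INC000004301664,Janice,,7/28/2015 8:13
INC000004301813,Robert,,7/28/2015 16:16
INC000004301813,Alex,,7/28/2015 16:18
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: every value follows month/day/year hour:minute.
parsed = pd.to_datetime(df["Work Info Date"], format="%m/%d/%Y %H:%M")

# Keep a datetime64 column, with the time of day reset to 00:00:00.
df["Just_the_Date"] = parsed.dt.normalize()

print(df[["Work Info Date", "Just_the_Date"]])
```

Keeping the column as datetime64 means later steps such as sorting or grouping by date work on real dates rather than on text. If plain datetime.date objects are preferred, parsed.dt.date is an alternative, at the cost of an object-dtype column.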
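Answer 1 (score: 0)
You can use Series.apply together with datetime.strptime() and datetime.strftime() to first parse each date-time string into a datetime object and then convert it back to a string in the desired format. Code (this needs import datetime) -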
df['Work Info Date'] = df['Work Info Date'].apply(lambda x: datetime.datetime.strptime(x,'%m/%d/%Y %H:%M').strftime('%m/%d/%Y'))
The advantage of this approach is that you can convert the dates to any format you want.
Example/demo -
In [3]: df = pd.read_csv('a.csv', encoding = 'utf-8')
In [4]: df
Out[4]:
       Incident ID  Submitter  Time Spent   Work Info Date
0  INC000004294045        Bob         NaN  7/28/2015 14:31
1  INC000004301664     Janice         NaN   7/28/2015 8:13
2  INC000004301813     Robert         NaN  7/28/2015 16:16
3  INC000004301813       Alex         NaN  7/28/2015 16:18
In [6]: import datetime
In [7]: df['Work Info Date'] = df['Work Info Date'].apply(lambda x: datetime.datetime.strptime(x,'%m/%d/%Y %H:%M').strftime('%m/%d/%Y'))
In [8]: df
Out[8]:
       Incident ID  Submitter  Time Spent  Work Info Date
0  INC000004294045        Bob         NaN      07/28/2015
1  INC000004301664     Janice         NaN      07/28/2015
2  INC000004301813     Robert         NaN      07/28/2015
3  INC000004301813       Alex         NaN      07/28/2015
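To illustrate the "any format" point, here is a small sketch that wraps the same strptime/strftime round trip in a helper and writes the dates as year-month-day instead. The helper name reformat_date and its in_fmt/out_fmt parameters are made up for this example, and the sample data is inlined so the snippet runs by itself. It assumes the column has no missing values, since strptime would raise on NaN.

```python
import datetime
import io

import pandas as pd

# Sample rows copied from the question, inlined so the sketch is self-contained.
csv_text = """Incident ID,Submitter,Time Spent,Work Info Date
INC000004294045,Bob,,7/28/2015 14:31
INC000004301664,Janice,,7/28/2015 8:13
INC000004301813,Robert,,7/28/2015 16:16
INC000004301813,Alex,,7/28/2015 16:18
"""


def reformat_date(text, in_fmt="%m/%d/%Y %H:%M", out_fmt="%Y-%m-%d"):
    """Hypothetical helper: parse text with in_fmt, return it formatted as out_fmt."""
    return datetime.datetime.strptime(text, in_fmt).strftime(out_fmt)


df = pd.read_csv(io.StringIO(csv_text))

# Apply the helper to every value; the time part is dropped by out_fmt.
df["Work Info Date"] = df["Work Info Date"].apply(reformat_date)

print(df["Work Info Date"].tolist())
# ['2015-07-28', '2015-07-28', '2015-07-28', '2015-07-28']
```

Changing out_fmt is all it takes to get another layout, for example '%d %b %Y'. The result is still a column of strings, not dates.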