Split the data in a CSV cell into 2 cells

Time: 2021-05-10 00:38:51

Tags: python

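I have a dataset in a CSV file where one column combines the date and time of each event, and I would like to split them into separate columns.

The column values have this format: 2016-12-31T21:57:40.910Z

So DateTTimeOZ

Is there a way to do this in Python?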

2 answers:

Answer 0 (score: 1)

If you are indeed using pandas, you can split the values into multiple columns like this:

import pandas as pd

df = pd.DataFrame({'datetime': ['2016-12-31T21:57:40.910Z', '2020-1-31T12:52:10.910Z']})

Input df

    datetime
0   2016-12-31T21:57:40.910Z
1   2020-1-31T12:52:10.910Z

Convert to datetime

df['datetime'] = pd.to_datetime(df['datetime'])

Split into separate columns

df['date'] = df['datetime'].dt.date
df['time'] = df['datetime'].dt.time
df['time_string'] = df['datetime'].dt.strftime('%H:%M:%S')

Output

datetime                          date        time             time_string
2016-12-31 21:57:40.910000+00:00  2016-12-31  21:57:40.910000  21:57:40
2020-01-31 12:52:10.910000+00:00  2020-01-31  12:52:10.910000  12:52:10
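
Since the question starts from a CSV file rather than an in-memory DataFrame, the same idea can be sketched end to end. This is only a sketch under assumptions: the column names `datetime` and `value` and the sample rows are made up for illustration, and `io.StringIO` stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical CSV content; the column names are assumptions, not from the question.
csv_text = (
    "datetime,value\n"
    "2016-12-31T21:57:40.910Z,EEEEEEE\n"
    "2018-01-29T21:57:40.910Z,FFFFFFF\n"
)

# Replace io.StringIO(csv_text) with the path to the real input file.
df = pd.read_csv(io.StringIO(csv_text))
parsed = pd.to_datetime(df["datetime"], utc=True)

# Put the two new string columns where the combined column used to be.
df.insert(0, "date", parsed.dt.strftime("%Y-%m-%d"))
df.insert(1, "time", parsed.dt.strftime("%H:%M:%S.%f"))
df = df.drop(columns="datetime")

# Pass a file path to to_csv() to write to disk instead of returning a string.
result = df.to_csv(index=False)
print(result)
```

Formatting both parts with `strftime` keeps them as plain strings, so they are written to the CSV exactly as formatted.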

If you are not using pandas, please tell us more so that we can help further.

Answer 1 (score: 0)

Using only the standard library, you can do the following:

import csv
import datetime as dt

import io
content = """DATETIME;VALUE
2016-12-31T21:57:40.910Z;EEEEEEE
2018-01-29T21:57:40.910Z;FFFFFFF"""

# Replace `io.StringIO(content)` with `open('input/file/path', 'r')`
# Replace `io.StringIO()` with `open('output/file/path', 'w', newline='')`
with io.StringIO(content) as f_in, io.StringIO() as f_out:
    reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out, delimiter=';', lineterminator='\n')

    # Create the new file header, splitting "DATETIME" into "DATE" and "TIME"
    # I don't know the name of your datetime column, so replace 'DATETIME' as needed
    header = next(reader)
    new_header = []
    for column_name in header:
        if column_name == 'DATETIME':
            new_header.append('DATE')
            new_header.append('TIME')
        else:
            new_header.append(column_name)

    writer.writerow(new_header)
    
    # Iterate over each row
    for row in reader:
        new_row = []
        for name, value in zip(header, row):
            if name == 'DATETIME':
                # Parse the datetime string ("2016-12-31T21:57:40.910Z") as a datetime object
                # Use `value`, not row[0], so this works wherever the column sits
                dto = dt.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
                # Split the datetime object into two strings: date and time
                new_row.append(dto.strftime("%Y-%m-%d"))
                new_row.append(dto.strftime("%H:%M:%S.%f"))

            else:
                new_row.append(value)

        writer.writerow(new_row)

    print(f_out.getvalue())
    # Output:
    # DATE;TIME;VALUE
    # 2016-12-31;21:57:40.910000;EEEEEEE
    # 2018-01-29;21:57:40.910000;FFFFFFF
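
If the values do not need validating as real dates, a lighter alternative is to split the string on the `T` separator instead of parsing it. The following is a minimal sketch of that idea, not part of the answer above; the helper name `split_iso_timestamp` is hypothetical, and it assumes every value looks like the example in the question.

```python
def split_iso_timestamp(value):
    """Split 'YYYY-MM-DDTHH:MM:SS.fffZ' into (date, time) strings."""
    date_part, sep, time_part = value.partition("T")
    if not sep:
        # No 'T' found: the value does not have the expected shape.
        raise ValueError(f"no 'T' separator in {value!r}")
    # Drop the trailing 'Z' (UTC marker) if it is present.
    return date_part, time_part.rstrip("Z")


print(split_iso_timestamp("2016-12-31T21:57:40.910Z"))
# ('2016-12-31', '21:57:40.910')
```

Unlike `strptime`, this keeps the fractional seconds exactly as written (`.910` rather than `.910000`), but it does not catch malformed dates such as a month of 13.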