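How to insert a space into the strings of a pandas column

Time: 2020-01-16 03:54:00

Tags: python pandas

The date/time information in this column is malformed (the date and the time run together with no separator):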

import pandas as pd
df = pd.DataFrame({
    'date': ['1/25/201612:00:00AM','2/25/201712:00:00AM','3/25/201812:00:00AM',
             '4/25/201912:00:00AM','5/25/201912:00:00AM','6/25/201912:00:00AM']})

I tried the function below, but it produces a column of NaN:

def insert_space(string, integer):
    return string[0:integer] + ' ' + string[integer:]
insert_space(df['date'], 9)

Examples of the desired output (any date format is fine!):

    date
0   1/25/2016 12:00:00AM
1   2/25/2017 12:00:00AM
2   3/25/2018 12:00:00AM
3   4/25/2019 12:00:00AM
4   5/25/2019 12:00:00AM
5   6/25/2019 12:00:00AM

    date
0   1/25/2016
1   2/25/2017 
2   3/25/2018 
3   4/25/2019 
4   5/25/2019 
5   6/25/2019

3 answers:

Answer 0 (score: 1)

Apply the function to each value of the column, like this:

# Series.apply has no axis argument; assign the result back to keep it.
df['date'] = df['date'].apply(lambda x: insert_space(x, 9))

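Note that if you are working with datetime objects, you need to modify the function accordingly. Datetime objects (for example datetime.time()) are not subscriptable, so running one through insert_space raises a TypeError. str(datetime.time()) returns a string.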
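To illustrate the point about datetime objects, here is a minimal sketch that uses only the standard library. It repeats insert_space from the question so that it runs on its own; the sample value `datetime.time(12, 0)` is an assumption chosen for illustration and does not come from the question's data.

```python
import datetime


def insert_space(string, integer):
    # Same helper as in the question: split at `integer` and join with a space.
    return string[0:integer] + ' ' + string[integer:]


t = datetime.time(12, 0)  # illustrative value, not from the question's data

try:
    insert_space(t, 2)
    raised = False
except TypeError:
    # datetime.time does not support slicing, so the helper fails here.
    raised = True

# Converting to a string first makes slicing possible.
as_text = insert_space(str(t), 2)
print(raised, repr(as_text))
```

For a pandas column of real datetimes, formatting with a method such as `Series.dt.strftime` is usually a better fit than string slicing.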

Answer 1 (score: 0)

Something like this:


df['date'] = pd.to_datetime(df['date'], format="%m/%d/%Y%I:%M:%S%p")

You can find an explanation of the format codes here: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
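As a rough sketch of how the parsed column could be turned back into either of the requested layouts, the example below applies the format string above and then formats with `Series.dt.strftime`. The column names `parsed`, `with_time`, and `date_only` are made up for illustration, and the month comes out zero-padded (`01/25/2016`), which differs slightly from the input.

```python
import pandas as pd

df = pd.DataFrame({'date': ['1/25/201612:00:00AM', '2/25/201712:00:00AM']})

# Parse the run-together text; %I and %p handle the 12-hour clock with AM/PM.
df['parsed'] = pd.to_datetime(df['date'], format="%m/%d/%Y%I:%M:%S%p")

# Format back to text in either of the two layouts from the question.
# The names with_time and date_only are hypothetical.
df['with_time'] = df['parsed'].dt.strftime("%m/%d/%Y %I:%M:%S%p")
df['date_only'] = df['parsed'].dt.strftime("%m/%d/%Y")
print(df[['with_time', 'date_only']])
```

Keeping the `parsed` column as a real datetime is often more useful than the reformatted text, since it supports sorting and date arithmetic directly.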

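Answer 2 (score: 0)

As it stands, your call to the function only returns a value, which is never assigned to anything and is therefore discarded immediately.

Here is a solution using a basic for loop (it can easily be converted into a list comprehension or wrapped in a function).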

import pandas as pd

# First format
df = pd.DataFrame({
    'date': ['1/25/201612:00:00AM','2/25/201712:00:00AM','3/25/201812:00:00AM',
             '4/25/201912:00:00AM','5/25/201912:00:00AM','6/25/201912:00:00AM']})

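# Slice from the right: the time part '12:00:00AM' is always 10 characters.
# .loc avoids chained assignment, which pandas may apply to a temporary copy.
for i in df.index:
    df.loc[i, 'date'] = df.loc[i, 'date'][:-10] + " " + df.loc[i, 'date'][-10:]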

print(df)
#                    date
# 0  1/25/2016 12:00:00AM
# 1  2/25/2017 12:00:00AM
# 2  3/25/2018 12:00:00AM
# 3  4/25/2019 12:00:00AM
# 4  5/25/2019 12:00:00AM
# 5  6/25/2019 12:00:00AM

# Second format
df = pd.DataFrame({
    'date': ['1/25/201612:00:00AM','2/25/201712:00:00AM','3/25/201812:00:00AM',
             '4/25/201912:00:00AM','5/25/201912:00:00AM','6/25/201912:00:00AM']})

# Drop the trailing 10-character time part and keep only the date.
for i in df.index:
    df.loc[i, 'date'] = df.loc[i, 'date'][:-10]

print(df)

#         date
# 0  1/25/2016
# 1  2/25/2017
# 2  3/25/2018
# 3  4/25/2019
# 4  5/25/2019
# 5  6/25/2019

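Update: here are list-comprehension versions of the two loops, which should be more efficient. They are alternatives; run one or the other on the original column, not both in sequence:

# First format
df['date'] = [v[:-10] + " " + v[-10:] for v in df['date']]
# Second format
df['date'] = [v[:-10] for v in df['date']]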
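The same slicing can also be written with pandas' vectorized `.str` accessor, which avoids an explicit Python loop. This is only a sketch under the same assumption as the answers above, namely that every value ends with a 10-character time such as `12:00:00AM`. The column names `spaced` and `date_only` are hypothetical, and the second sample value (a two-digit month) was added for illustration and is not in the question's data.

```python
import pandas as pd

# The second value is an illustrative two-digit-month case.
df = pd.DataFrame({'date': ['1/25/201612:00:00AM', '10/25/201712:00:00AM']})

# Slice from the right so the month width (1 or 2 digits) does not matter.
df['spaced'] = df['date'].str[:-10] + ' ' + df['date'].str[-10:]
df['date_only'] = df['date'].str[:-10]
print(df)
```

Slicing from the right is what makes this work for both one-digit and two-digit months, whereas a fixed position counted from the left (such as 9) only fits the one-digit case.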