Date format in pandas

Time: 2020-03-02 18:18:20

Tags: python python-3.x pandas datetime format

I am trying to change the date format to a "month year" format without changing the non-date values.

import matplotlib.pyplot as plt
import nltk # Natural Language ToolKit
nltk.download('stopwords')
from nltk.corpus import stopwords # to get rid of StopWords 

from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator # to create a Word Cloud
from PIL import Image # Pillow with WordCloud to image manipulation

text = 'New stop words are bad for this text.'

# Adding the stopwords
stop_words = stopwords.words('english')  # NLTK names this list 'english', not 'en'
new_stopwords = ['new', 'stop', 'words']
stop_words.extend(new_stopwords)
stop_words = set(stop_words)

# Getting rid of the stopwords (compare in lowercase, since the list is lowercase)
clean_text = [word for word in text.split() if word.lower() not in stop_words]

# Converting the list to string
text = ' '.join([str(elem) for elem in clean_text])

# Generating a wordcloud
wordcloud = WordCloud(background_color = "black").generate(text)

# Display the generated image:
plt.figure(figsize = (15, 10))
plt.imshow(wordcloud, interpolation = 'bilinear')
plt.axis("off")
plt.show()

input_df is:

[image]

The expected output is:

[image]

I tried the following code, which did not work:

import pandas as pd

input_df = pd.DataFrame({'Period': ['2017-11-01 00:00:00', '2019-02-01 00:00:00', 'Mar 2020', 'Pre-Nov 2017', '2019-10-01 00:00:00', 'Nov 17-Nov 18']})

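Please help.

1 Answer:

Answer 0: (score: 4)

You can use errors='coerce', which turns values that cannot be parsed into NaT, and then fillna to restore the original values: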

input_df['new_period'] = (pd.to_datetime(input_df['Period'], errors='coerce')
       .dt.strftime('%b %Y')
       .fillna(input_df['Period'])
    )

Output:

                Period     new_period
0  2017-11-01 00:00:00       Nov 2017
1  2019-02-01 00:00:00       Feb 2019
2             Mar 2020       Mar 2020
3         Pre-Nov 2017   Pre-Nov 2017
4  2019-10-01 00:00:00       Oct 2019
5        Nov 17-Nov 18  Nov 17-Nov 18
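
To see why this works, it can help to look at the intermediate result. The following is only a sketch: it rebuilds the sample data from the question and prints which rows pd.to_datetime could not parse. The column names parsed and not_parsed are made up for illustration. Whether 'Mar 2020' is parsed or coerced to NaT may depend on the pandas version, but either way the final value stays 'Mar 2020'.

```python
import pandas as pd

# Sample data rebuilt from the question.
input_df = pd.DataFrame({'Period': [
    '2017-11-01 00:00:00', '2019-02-01 00:00:00', 'Mar 2020',
    'Pre-Nov 2017', '2019-10-01 00:00:00', 'Nov 17-Nov 18']})

# errors='coerce' yields NaT instead of raising for unparseable strings.
parsed = pd.to_datetime(input_df['Period'], errors='coerce')

# True for rows that could not be parsed; fillna restores exactly these rows.
not_parsed = parsed.isna()

# 'parsed' and 'not_parsed' are illustrative column names, not part of the answer.
print(input_df.assign(parsed=parsed, not_parsed=not_parsed))
```

The rows flagged in not_parsed are the ones where strftime returns a missing value, which is why the original text can be filled back in afterwards.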

Update: a second, safer option:

import numpy as np

s = pd.to_datetime(input_df['Period'], errors='coerce')

input_df['new_period'] = np.where(s.isna(), input_df['Period'], 
                                  s.dt.strftime('%b %Y'))
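
If the same conversion is needed in more than one place, the idea could be wrapped in a small function. This is a minimal sketch and not part of the original answer: the helper name to_month_year and its fmt parameter are hypothetical, and it assumes the real dates always use the 'YYYY-MM-DD HH:MM:SS' layout seen in the question. Passing an explicit format means only strings in that layout are converted, and Series.where is used here in place of np.where.

```python
import pandas as pd


def to_month_year(period, fmt='%Y-%m-%d %H:%M:%S'):
    """Hypothetical helper: rewrite values matching fmt as 'Mon YYYY'."""
    # Strings that do not match fmt become NaT instead of raising.
    parsed = pd.to_datetime(period, format=fmt, errors='coerce')
    formatted = parsed.dt.strftime('%b %Y')
    # Keep the formatted value where parsing succeeded, else the original text.
    return formatted.where(parsed.notna(), period)


# Sample data rebuilt from the question.
input_df = pd.DataFrame({'Period': [
    '2017-11-01 00:00:00', '2019-02-01 00:00:00', 'Mar 2020',
    'Pre-Nov 2017', '2019-10-01 00:00:00', 'Nov 17-Nov 18']})

input_df['new_period'] = to_month_year(input_df['Period'])
print(input_df)
```

With the sample data, 'Mar 2020' does not match the assumed format, so it is left untouched, which happens to be the desired output already.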