Question

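I am doing the following to parse multiple dates held in different columns. However, the second column (Time) does not match this format string, so it raises an error. How can I achieve this?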

 dateparse = lambda x: pd.datetime.strptime(x, '%d/%m/%Y %H:%M:%S')

 for chunk in pd.read_csv(file, chunksize=500000, parse_dates=['date','time'], date_parser = dateparse, names = col_names, index_col = index_cols, header = 0, dtype = dtype):
        store.append('df',chunk)

Sample data:

 Date                     Time
19/10/2016 00:00:00      00:05:01

Answer 1

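If you use a standard format such as '19/10/2016 00:00:00', you do not need to specify the datetime format. Pandas will parse it automatically, so you do not need the date_parser parameter.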
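As a quick check of the point above, here is a minimal sketch that parses the sample row without any custom parser. The in-memory CSV and its comma-separated layout are assumptions made for illustration, and dayfirst=True is passed only to make the day/month order explicit.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV mirroring the sample row from the question.
csv_text = "Date,Time\n19/10/2016 00:00:00,00:05:01\n"

# No date_parser is needed; dayfirst=True makes the day/month order explicit.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], dayfirst=True)

# Date is now a datetime column; Time is still plain text at this point.
print(df.dtypes)
```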

Option 1: convert the Time column to the datetime64[ns] dtype:

for chunk in pd.read_csv(file, chunksize=500000, parse_dates=['Date'], names=col_names, index_col=index_cols, dtype = dtype):
    chunk['Time'] = chunk['Date'].dt.normalize() + pd.to_timedelta(chunk['Time'])
    store.append('df',chunk)
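To see what Option 1 does to a single row, here is a small sketch on a hand-built DataFrame. The one-row chunk is a hypothetical stand-in for a chunk yielded by read_csv.

```python
import pandas as pd

# Hypothetical one-row chunk standing in for a chunk yielded by read_csv.
chunk = pd.DataFrame({
    "Date": pd.to_datetime(["19/10/2016 00:00:00"], dayfirst=True),
    "Time": ["00:05:01"],
})

# Reset Date to midnight, then add Time as an offset from midnight.
chunk["Time"] = chunk["Date"].dt.normalize() + pd.to_timedelta(chunk["Time"])

print(chunk["Time"].iloc[0])  # 2016-10-19 00:05:01
```

The Time column then holds a full timestamp, so it can be compared and sorted together with Date.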

Option 2: convert the Time column to the timedelta64[ns] dtype:

for chunk in pd.read_csv(file, chunksize=500000, parse_dates=['Date'], names=col_names, index_col=index_cols, dtype = dtype):
    chunk['Time'] = pd.to_timedelta(chunk['Time'])
    store.append('df',chunk)
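For comparison, a short sketch of what the timedelta conversion in Option 2 produces. The two time strings are hypothetical values chosen for illustration.

```python
import pandas as pd

# Hypothetical time-of-day strings in HH:MM:SS form.
time_col = pd.Series(["00:05:01", "13:45:00"])

# Each string becomes a duration measured from midnight.
as_delta = pd.to_timedelta(time_col)

print(as_delta.dt.total_seconds().tolist())  # [301.0, 49500.0]
```

Keeping the column as a duration leaves the date untouched, and the duration can still be added to a date column later if a full timestamp is needed.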

PS: HDFStore supports both dtypes.

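Option 3: read the columns as they are and convert them afterwards, turning unparseable values into NaT: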

for chunk in pd.read_csv(file, chunksize=500000, names=col_names, index_col=index_cols, dtype = dtype):
    chunk['Date'] = pd.to_datetime(chunk['Date'], errors='coerce')
    chunk['Time'] = pd.to_timedelta(chunk['Time'], errors='coerce')
    store.append('df',chunk)
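The effect of errors='coerce' is easiest to see on data that contains a bad value. In the sketch below the second row is deliberately invalid; both rows are hypothetical, and dayfirst=True is added as an assumption to match the day-first sample dates.

```python
import pandas as pd

# Hypothetical chunk: the first row is valid, the second is deliberately broken.
chunk = pd.DataFrame({
    "Date": ["19/10/2016 00:00:00", "not a date"],
    "Time": ["00:05:01", "bad"],
})

# Values that cannot be parsed become NaT instead of raising an error.
chunk["Date"] = pd.to_datetime(chunk["Date"], errors="coerce", dayfirst=True)
chunk["Time"] = pd.to_timedelta(chunk["Time"], errors="coerce")

print(chunk.isna())
```

This keeps a long chunked import running when a few rows are malformed, at the cost of having to check for NaT afterwards.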

Answer 2

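You can tell Pandas to combine the date and time columns into a single column by passing a list of lists, rather than just a list, as described for parse_dates in the read_csv documentation: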

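parse_dates : boolean or list of ints or names or list of lists or dict, default False

boolean. If True -> try parsing the index.

list of ints or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.

list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.

dict, e.g. {'foo': [1, 3]} -> parse columns 1, 3 as date and call result 'foo'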

You will also want to specify the option that matches your date format (the sample dates are written day first).

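This means that the change to your code is in the parse_dates argument of the read_csv call, which becomes a list of lists naming the date and time columns.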
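As one possible way to end up with a single combined datetime column, the sketch below reads both columns as text and joins them before parsing. This is a swapped-in technique, not the list-of-lists form of parse_dates described above, which recent pandas releases have deprecated (worth checking against your installed version). The in-memory CSV and the Date_Time column name are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV mirroring the sample row from the question.
csv_text = "Date,Time\n19/10/2016 00:00:00,00:05:01\n"

# Read everything as text so nothing is parsed prematurely.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Keep only the date part of Date, then join it with Time using a space.
combined = df["Date"].str.split().str[0] + " " + df["Time"]

# An explicit day-first format avoids any day/month ambiguity.
df["Date_Time"] = pd.to_datetime(combined, format="%d/%m/%Y %H:%M:%S")

print(df["Date_Time"].iloc[0])  # 2016-10-19 00:05:01
```

In a chunked import, the same three steps would run inside the loop body before store.append.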

Pandas DataFrame / HDFStore: passing multiple date formats via CSV
