Binary variable in linear regression in R

Time: 2020-07-07 14:24:56

Tags: r regression

I have a binary variable (biological sex), and I am concerned about the sign (positive or negative) of its estimate in a linear regression. In my data, females are coded as 2 and males as 1. I am considering recoding it so that females are coded as 0 and males as 1.

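How do I interpret the sign of the estimate in each case? For example, if my outcome is height, then with females coded as 0 and males as 1 I would expect a positive estimate. But with females coded as 2 and males as 1, shouldn't I expect a negative estimate for height?

Thanks in advance for your help! Charlie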

2 answers:

Answer 0: (score: 1)

Code sex as a categorical variable (class factor). R will then label the coefficient with the sex it refers to.

set.seed(1234)
x = data.frame(sex = factor(sample(c("female", "male"), size = 20, replace = TRUE)), 
               var = rnorm(20))
lm(var ~ sex, x)

# Call:
# lm(formula = var ~ sex, data = x)

# Coefficients:
# (Intercept)      sexmale  
#    -0.31066      0.08228  

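This means that var is higher in males: the sexmale coefficient (0.08228) is the estimated difference in mean var between males and the reference level, female, while the intercept is the mean for females.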
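If you would rather read the coefficient as "female relative to male", you can change the reference level of the factor. The following is only a minimal sketch on simulated data; the names d, sex_m, fit_female_ref and fit_male_ref are made up for illustration and are not from the question. It uses relevel() from base R and shows that switching the reference level flips the sign of the estimate without changing its size.

```r
# Simulated data (an assumption for illustration, not the asker's data)
set.seed(42)
d <- data.frame(sex = factor(rep(c("female", "male"), each = 10)),
                var = rnorm(20))

# Default: levels are sorted alphabetically, so "female" is the reference
fit_female_ref <- lm(var ~ sex, data = d)

# Make "male" the reference level instead
d$sex_m <- relevel(d$sex, ref = "male")
fit_male_ref <- lm(var ~ sex_m, data = d)

# Same magnitude, opposite sign
coef(fit_female_ref)["sexmale"]     # mean(male) - mean(female)
coef(fit_male_ref)["sex_mfemale"]   # mean(female) - mean(male)
```

The coefficient name always tells you which level is being compared against the reference, so the sign can be read off without remembering how the variable was coded.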

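Answer 1: (score: 0)

I think your reasoning is correct. If you do not want to recode the variable, just use as.factor(sex) in the formula itself. Then R knows the variable is not numeric, and you do not have to worry about how it is coded.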
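To make the sign question concrete, here is a rough sketch on simulated heights, where males are taller by construction. The data frame d and the columns sex01 and sex12 are hypothetical names chosen for this example. It compares the 0/1 coding, the 1/2 coding used as a number, and the 1/2 coding wrapped in as.factor().

```r
# Simulated heights (an assumption for illustration): males ~12 cm taller
set.seed(1)
n <- 50
male <- rep(c(0, 1), each = n)
height <- 165 + 12 * male + rnorm(2 * n, sd = 7)

d <- data.frame(height = height,
                sex01 = male,                      # female = 0, male = 1
                sex12 = ifelse(male == 1, 1, 2))   # male = 1, female = 2

b01 <- coef(lm(height ~ sex01, data = d))["sex01"]
b12 <- coef(lm(height ~ sex12, data = d))["sex12"]
bf  <- coef(lm(height ~ as.factor(sex12), data = d))["as.factor(sex12)2"]

b01   # positive: male minus female
b12   # negative: a one-unit step from 1 (male) to 2 (female)
bf    # same as b12: level 2 (female) compared with reference level 1 (male)
```

With only two groups, all three fits describe the same difference in means; only the sign and the meaning of the intercept change. With the numeric 1/2 coding, the intercept is the extrapolated value at sex12 = 0, which corresponds to no real group, which is one reason to prefer a factor or a 0/1 coding.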

Let me know if this helps or if you have any other questions :)