I have a binary variable (biological sex), and I am worried about the sign (positive or negative) of the estimate in a linear regression. In my data, females are coded as 2 and males as 1. I am thinking of recoding it so that females are coded as 0 and males as 1.
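How do I interpret the sign of the estimate in each case? For example, if my outcome is height, I would expect a positive estimate when females are 0 and males are 1. But if females are 2 and males are 1, shouldn't I expect a negative estimate for height?
Thanks in advance for your help! Charlie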
Answer 0 (score: 1)
Code sex as a categorical variable (class factor). R will then label the coefficient with the sex it refers to.
set.seed(1234)
x = data.frame(sex = factor(sample(c("female", "male"), size = 20, replace = TRUE)),
var = rnorm(20))
lm(var ~ sex, x)
# Call:
# lm(formula = var ~ sex, data = x)
# Coefficients:
# (Intercept) sexmale
# -0.31066 0.08228
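The coefficient sexmale is the difference in the mean of var for males relative to the reference level, female. It is positive here, so in this sample var is higher on average in males.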
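To connect this to the two numeric codings in the question, here is a minimal sketch on simulated data. The heights, the group difference, and the variable names sex01 and sex12 are made up for illustration and do not come from the question's data. With a two-valued numeric predictor, the slope is the difference in group means between the higher-coded and the lower-coded group, so switching which sex gets the larger code flips the sign and leaves the magnitude unchanged.

```r
# Simulated heights (cm); all numbers are invented for illustration.
set.seed(42)
n <- 100
male <- rep(c(0, 1), each = n)             # 0 = female, 1 = male
height <- 163 + 13 * male + rnorm(2 * n, sd = 7)

sex01 <- male                              # female = 0, male = 1
sex12 <- ifelse(male == 1, 1, 2)           # male = 1, female = 2

b01 <- coef(lm(height ~ sex01))
b12 <- coef(lm(height ~ sex12))

b01["sex01"]   # mean(male) - mean(female): positive for height
b12["sex12"]   # mean(female) - mean(male): same magnitude, negative

# The intercepts differ: with 0/1 coding the intercept is the female mean;
# with 1/2 coding it is the fitted value at sex12 = 0, a code nobody has.
b01["(Intercept)"]
b12["(Intercept)"]
```

The 0/1 coding is usually easier to read because the intercept is then the mean of the group coded 0.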
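Answer 1 (score: 0)
I think your reasoning is correct. If you don't want to recode the variable, just use as.factor(sex) in the formula itself. Then R knows the values are not numeric, and you don't have to worry about how the variable is coded.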
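A short sketch of this approach, again on simulated data (the heights and the name sex_f are assumptions for illustration). Note that as.factor() does not remove the question of direction: R takes the lowest level as the reference, so with the 1/2 coding the coefficient describes level 2 (female) relative to level 1 (male). Giving the factor labels and an explicit level order makes the direction visible in the coefficient name.

```r
# Simulated heights (cm); all numbers are invented for illustration.
set.seed(42)
n <- 100
sex <- rep(c(1, 2), each = n)              # 1 = male, 2 = female, as in the question
height <- ifelse(sex == 1, 176, 163) + rnorm(2 * n, sd = 7)

# as.factor() inside the formula: reference level is 1 (male)
fit <- lm(height ~ as.factor(sex))
coef(fit)
# The slope is named "as.factor(sex)2": female relative to male,
# so it is expected to be negative for height.

# Labelled factor with female listed first, hence used as the reference level
sex_f <- factor(sex, levels = c(2, 1), labels = c("female", "male"))
fit_f <- lm(height ~ sex_f)
coef(fit_f)
# The slope is now named "sex_fmale": male relative to female.
```

Both fits describe the same difference between the groups; only the reference level, and therefore the sign and the name of the coefficient, changes.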
Let me know if this helps or if you have any other questions :)