# Regression model

## Binary logit/probit model

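\begin{aligned} P(y=1) &= P(\sum \beta x + \epsilon > 0) \\ &= P(\epsilon > -\sum \beta x) \\ &= 1 - F(-\sum \beta x) \\ &= F(\sum \beta x) \end{aligned}

The last step uses the symmetry of the logistic and normal distributions around zero: $1 - F(-t) = F(t)$.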

logit CDF: $F(x)=\frac{e^x}{1+e^x}$
probit CDF: $F(x)=\Phi(x)=\int_{-\infty}^{x} \frac{1}{\sqrt{2 \pi}} e^{-\frac{z^2}{2}}\,dz$
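As a minimal sketch of the two link functions, the snippet below evaluates $P(y=1)=F(\sum \beta x)$ under both CDFs using only the Python standard library. The coefficients and the observation are hypothetical values chosen for illustration, not estimates from any data.

```python
import math


def logit_cdf(t: float) -> float:
    """Logistic CDF F(t) = e^t / (1 + e^t), written to avoid overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def probit_cdf(t: float) -> float:
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Hypothetical coefficients (intercept, slope) and one observation (1, x).
beta = [-1.0, 0.8]
x = [1.0, 2.0]
index = sum(b * v for b, v in zip(beta, x))  # sum of beta * x

p_logit = logit_cdf(index)    # P(y = 1) under the logit model
p_probit = probit_cdf(index)  # P(y = 1) under the probit model
print(p_logit, p_probit)
```

Both functions satisfy $F(0)=0.5$ and $F(-t)=1-F(t)$, which is the symmetry used in the derivation of $P(y=1)$.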

### marginal effect

#### logit

odds is $\frac{P}{1-P}=e^{\sum \beta x}$
odds ratio for a one-unit increase in variable $a$ is $\frac{odds(a+1)}{odds(a)}=e^{\beta_a}$, where $\beta_a$ is the coefficient of $a$
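A short numerical check may make the odds ratio concrete. The sketch below assumes a logit model with one variable $a$ and hypothetical coefficients `beta0` and `beta_a`; it computes the odds at $a$ and $a+1$ and compares their ratio with the exponential of the coefficient.

```python
import math


def logit_prob(index: float) -> float:
    """P(y = 1) for a logit model with linear index `index`."""
    return 1.0 / (1.0 + math.exp(-index))


def odds(p: float) -> float:
    """Odds P / (1 - P) of a probability."""
    return p / (1.0 - p)


# Hypothetical intercept and coefficient of variable a.
beta0, beta_a = -0.5, 0.7
a = 1.0

odds_a = odds(logit_prob(beta0 + beta_a * a))
odds_a_plus_1 = odds(logit_prob(beta0 + beta_a * (a + 1.0)))

odds_ratio = odds_a_plus_1 / odds_a
print(odds_ratio, math.exp(beta_a))  # the two values agree up to rounding
```

The ratio does not depend on the starting value of $a$, which is why the odds ratio is a convenient summary for logit coefficients.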

#### probit

$\Phi^{-1}(p) = \sum \beta x$
$\Phi^{-1}(p_{a=1}) - \Phi^{-1}(p_{a=0}) = \beta_a$, where $\beta_a$ is the coefficient of variable $a$
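For the probit case the same idea can be checked on the $\Phi^{-1}$ scale. This is a sketch with hypothetical coefficients, using `statistics.NormalDist` from the standard library (Python 3.8 or later) for $\Phi$ and $\Phi^{-1}$.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Hypothetical intercept and coefficient of variable a.
beta0, beta_a = -0.5, 0.7


def probit_prob(a: float) -> float:
    """P(y = 1) = Phi(beta0 + beta_a * a)."""
    return std_normal.cdf(beta0 + beta_a * a)


z_at_1 = std_normal.inv_cdf(probit_prob(1.0))  # Phi^{-1}(p) at a = 1
z_at_0 = std_normal.inv_cdf(probit_prob(0.0))  # Phi^{-1}(p) at a = 0

diff = z_at_1 - z_at_0
print(diff)  # close to beta_a
```

Unlike the logit odds ratio, this effect is measured in units of the standard normal index, not on the probability scale.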

## Ordered logit/probit model

The dependent variable takes ordered values (e.g. worst, worse, normal, good, very good).

\begin{aligned} P(y=1) &= P(y^* \leq 0) = P(\varepsilon \leq -\sum \beta x) = F( -\sum \beta x) \\ P(y=2) &= P(0 < y^* \leq \mu_2) = P(-\sum \beta x < \varepsilon \leq \mu_2 - \sum \beta x) = F (\mu_2 - \sum \beta x) - F(-\sum \beta x) \end{aligned}

where $y^* = \sum \beta x + \varepsilon$ is the latent variable and $\mu_2$ is the threshold between categories 2 and 3.
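The category probabilities can be sketched in a few lines. The code below assumes the parameterization used above (first threshold fixed at 0, further thresholds $\mu_2 < \mu_3 < \ldots$) and a logistic $F$; the index value and thresholds are hypothetical.

```python
import math


def logistic_cdf(t: float) -> float:
    """Logistic CDF F(t) = 1 / (1 + e^{-t})."""
    return 1.0 / (1.0 + math.exp(-t))


def ordered_probs(index, cutpoints, cdf):
    """Return [P(y=1), ..., P(y=J)] for thresholds [0, mu_2, ..., mu_{J-1}].

    `index` is the linear index sum(beta * x); `cutpoints` must be increasing.
    """
    # Cumulative probabilities P(y <= i) = F(mu_i - index); the last one is 1.
    cumulative = [cdf(mu - index) for mu in cutpoints] + [1.0]
    # Each category probability is a difference of neighbouring cumulatives.
    return [cumulative[0]] + [
        cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))
    ]


index = 0.4                  # hypothetical value of sum(beta * x)
cutpoints = [0.0, 1.0, 2.5]  # hypothetical thresholds, giving 4 categories

probs = ordered_probs(index, cutpoints, logistic_cdf)
print(probs, sum(probs))
```

Passing a normal CDF instead of `logistic_cdf` gives the ordered probit version of the same calculation.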

### marginal effect

#### logit

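odds is $\frac{P(y=1)}{P(y \neq 1)}=e^{-\sum \beta x}$
odds ratio is $\frac{odds(a=1)}{odds(a=0)}=e^{-\beta_a}$ (the sign is negative because $P(y=1)=F(-\sum \beta x)$: a larger index moves $y$ away from the lowest category)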

#### probit

$\Phi^{-1}(P(y=1)) = z = -\sum \beta x$
$\Phi^{-1}(P(y=1))_{a=1} - \Phi^{-1}(P(y=1))_{a=0} = -\beta_a$

### boundary value

#### logit

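odds is $\frac{P(y\leq i)}{P(y > i)}=e^{\mu_i - \sum \beta x}$, so the threshold $\mu_i$ shifts the log-odds at boundary $i$
odds ratio is $\frac{odds(a=1)}{odds(a=0)}=e^{-\beta_a}$, the same at every boundary $i$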

#### probit

$\Phi^{-1}(P(y \leq i)) - \Phi^{-1}(P( y \leq i-1)) = \mu_i - \mu_{i-1}$ (equal to $\mu_2$ for $i=2$, since the first threshold is 0)
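The role of the thresholds can be checked numerically. This sketch assumes the ordered model with cumulative probabilities $P(y \leq i) = F(\mu_i - \sum \beta x)$ and first threshold 0; the index and thresholds are hypothetical. For the logit case the cumulative odds equal $e^{\mu_i - \sum \beta x}$, and for the probit case the gap between consecutive $\Phi^{-1}$ values equals the gap between consecutive thresholds.

```python
import math
from statistics import NormalDist

index = 0.4                  # hypothetical value of sum(beta * x)
cutpoints = [0.0, 1.0, 2.5]  # hypothetical thresholds mu_1 = 0, mu_2, mu_3


def logistic_cdf(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


# Logit: cumulative odds P(y <= i) / P(y > i) at each boundary.
cumulative_odds = []
for mu in cutpoints:
    p_le = logistic_cdf(mu - index)
    cumulative_odds.append(p_le / (1.0 - p_le))

# Probit: differences of Phi^{-1}(P(y <= i)) between neighbouring boundaries.
std_normal = NormalDist()
z_scores = [std_normal.inv_cdf(std_normal.cdf(mu - index)) for mu in cutpoints]
gaps = [z_scores[i] - z_scores[i - 1] for i in range(1, len(z_scores))]

print(cumulative_odds)
print(gaps)  # close to [1.0, 1.5], the spacing of the thresholds
```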

## Multinomial logit model

Used when the dependent variable has three or more unordered categories.

$\frac{p_j}{p_j + p_J} = F(\sum_{k} \beta_{jk} x_k)$

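where $p_J$ is the probability of the reference category $J$, $p_j$ is the probability of category $j$, and $\frac{p_j}{p_j + p_J}$ is the probability of $j$ given that the outcome is either $j$ or $J$.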

$\frac{p_j}{p_J}=e^{\sum \beta_{jk} x_k}$

$\sum_{j \neq J} \frac{p_j}{p_J} = \frac{1-p_J}{p_J}=\frac{1}{p_J}-1=\sum_{j \neq J} e^{\sum_k \beta_{jk}x_k}$

$p_J=\frac{1}{1 + \sum_{l \neq J} e^{\sum_k \beta_{lk} x_k}}$

$p_j=\frac{e^{\sum_k \beta_{jk} x_k}}{1 + \sum_{l \neq J} e^{\sum_k \beta_{lk} x_k}}$
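The closed-form probabilities can be sketched directly. The function name `multinomial_probs` and the coefficient values below are hypothetical; the reference category $J$ is placed last and has no coefficients of its own.

```python
import math


def multinomial_probs(betas, x):
    """Multinomial logit probabilities with the reference category last.

    `betas` holds one coefficient list per non-reference category j;
    `x` is the vector of independent variables.
    """
    # e^{sum_k beta_jk x_k} for every non-reference category
    exp_index = [
        math.exp(sum(b * v for b, v in zip(beta_j, x))) for beta_j in betas
    ]
    denom = 1.0 + sum(exp_index)  # the 1 is the reference category's term
    return [e / denom for e in exp_index] + [1.0 / denom]


# Hypothetical coefficients for two non-reference categories.
betas = [[0.2, -0.5], [-0.3, 0.4]]
x = [1.0, 2.0]

probs = multinomial_probs(betas, x)
print(probs, sum(probs))
print(probs[0] / probs[-1])  # p_j / p_J, equal to e^{sum_k beta_jk x_k}
```

For large indices a numerically safer variant would subtract the largest index before exponentiating; that refinement is omitted here to keep the sketch close to the formulas.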

## Nested logit/probit model

Used when the choices of the dependent variable have a hierarchical structure.

graph TD; A[buy car] --> B[used]; A --> B'[new]; C[no buy car] --> D[used]; C --> D'[new]

graph LR; A[F] --> A1[F_1]; A --> A2[1-F_1]; A1 --> B1[F_k12]; A1 --> B2[1-F_k12]; A2 --> B3[F_k22]; A2 --> B4[1-F_k22]
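Reading the second diagram as a tree of binary splits, the probability of each leaf is the product of the branch probabilities along its path. The sketch below assumes each split is a binary logit with a hypothetical index value; it is a simplified sequential illustration and does not include the inclusive-value terms of a full nested logit.

```python
import math


def logistic(t: float) -> float:
    """Binary logit probability for linear index t."""
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical index values for the three splits in the tree.
f_1 = logistic(0.3)     # top-level split: F_1 versus 1 - F_1
f_k12 = logistic(-0.2)  # split inside the first branch
f_k22 = logistic(0.5)   # split inside the second branch

# Leaf probability = product of the probabilities along the path.
leaf_probs = [
    f_1 * f_k12,
    f_1 * (1.0 - f_k12),
    (1.0 - f_1) * f_k22,
    (1.0 - f_1) * (1.0 - f_k22),
]
print(leaf_probs, sum(leaf_probs))
```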

### marginal effect

#### logit

odds is $\frac{F_{ki}}{1-F_{ki}}$
odds ratio is $\frac{odds(a=1)}{odds(a=0)}=e^{\beta_a}$

#### probit

$\Phi^{-1}(P(y = 1)) = \sum \beta x$
$\Phi^{-1}(P(y=1))_{a=1} - \Phi^{-1}(P(y=1))_{a=0} = \beta_a$

## Conditional model

Used when the values of the independent variables differ across the alternatives of the dependent variable.

ex: independent variables -> cost and time of each alternative (by region: Seoul, Gyunggi)
dependent variable -> bus, train, car

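$p(\text{bus})=\frac{e^{V_{\text{bus}}}}{e^{V_{\text{train}}}+e^{V_{\text{car}}}+e^{V_{\text{bus}}}}$, where $V_j$ is the linear index of alternative $j$ (e.g. $V_j = \sum_k \alpha_k z_{jk}$ for attributes $z_{jk}$ such as cost and time)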

$e^{\beta_a}$: when variable $a$ increases by 1, the factor by which the odds of the compared alternative (bus) relative to the reference alternative (car) are multiplied
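The bus/train/car example can be sketched as follows. The attribute values (cost, time) and the coefficients in `alpha` are hypothetical, and the index of each alternative is assumed to be a weighted sum of its own attributes.

```python
import math

# Hypothetical coefficients shared by all alternatives.
alpha = {"cost": -0.4, "time": -0.05}

# Hypothetical attributes of each alternative for one traveller.
attributes = {
    "bus": {"cost": 1.5, "time": 40.0},
    "train": {"cost": 2.0, "time": 30.0},
    "car": {"cost": 4.0, "time": 25.0},
}

# Linear index V_j = sum_k alpha_k * z_jk for each alternative.
index = {
    mode: sum(alpha[name] * value for name, value in attrs.items())
    for mode, attrs in attributes.items()
}

# Choice probabilities: e^{V_j} divided by the sum over all alternatives.
denom = sum(math.exp(v) for v in index.values())
probs = {mode: math.exp(v) / denom for mode, v in index.items()}

print(probs)
print(probs["bus"] / probs["car"])  # odds of bus relative to car
```

The odds of bus relative to car depend only on the difference of the two indices, so attributes of the train do not enter that ratio.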

### marginal effect

\begin{aligned}\frac{\partial P_{j}}{\partial z_{jk}}&=\frac{\partial}{\partial z_{jk}} \frac{e^{\sum_m \alpha_m z_{jm}}}{Q} \quad (Q=\sum_l e^{\sum_m \alpha_m z_{lm}})\\ &=\frac{\alpha_k e^{\sum_m \alpha_m z_{jm}} Q - \alpha_k e^{2 \sum_m \alpha_m z_{jm}} }{Q^2} \\ &=\frac{ \alpha_k e^{\sum_m \alpha_m z_{jm}}}{Q} \left(\frac{Q - e^{\sum_m \alpha_m z_{jm}}}{Q}\right) \\ &= \alpha_k P_j(1- P_j) \quad (P_j =\frac{ e^{\sum_m \alpha_m z_{jm}}}{Q} ) \end{aligned}
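The result $\alpha_k P_j (1 - P_j)$ can be checked against a finite-difference derivative. The sketch below uses hypothetical coefficients and attribute values; `choice_probs` is an illustrative helper, not a library function.

```python
import math


def choice_probs(z, alpha):
    """Conditional logit probabilities; z[j][k] is attribute k of alternative j."""
    v = [sum(a * z_jk for a, z_jk in zip(alpha, z_j)) for z_j in z]
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(u - v_max) for u in v]
    q = sum(exp_v)
    return [e / q for e in exp_v]


alpha = [-0.4, -0.05]                         # hypothetical coefficients
z = [[1.5, 40.0], [2.0, 30.0], [4.0, 25.0]]   # hypothetical attributes
j, k, h = 0, 0, 1e-6                          # alternative, attribute, step size

# Analytic marginal effect of z_jk on P_j.
p = choice_probs(z, alpha)
analytic = alpha[k] * p[j] * (1.0 - p[j])

# Central finite difference of P_j with respect to z_jk.
z_up = [row[:] for row in z]
z_down = [row[:] for row in z]
z_up[j][k] += h
z_down[j][k] -= h
numeric = (choice_probs(z_up, alpha)[j] - choice_probs(z_down, alpha)[j]) / (2.0 * h)

print(analytic, numeric)  # the two values agree up to discretization error
```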

## Ranked logit model

Used when the dependent variable is a ranking of the alternatives.

### Marginal effect

$e^{\beta_a} = \frac{\lambda}{\lambda_0} \quad \lambda \text{: hazard function}$

### probability

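$\displaystyle P(u_{r_1} > u_{r_2} > \ldots > u_{r_J})=\prod_{j=1}^{J-1} \frac{ e^{V_{r_j}}}{\sum_{k=j}^{J} e^{V_{r_k}}}$
where $r_j$ is the alternative ranked $j$-th, the sum runs over the alternatives not yet ranked, and $V_j = \beta_j x_i + \alpha z_j + \theta w_{ij}$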

### rank

$\exp(\sum_{i \neq \alpha} \beta_i x_i + \alpha \sum z_j)$ where $\alpha$ is means, $z_j$

### Example probability

the probability of the ranking a (1st), b (2nd), c (3rd), writing $V_a, V_b, V_c$ for the indices of the three alternatives:
$P(a \succ b \succ c) =\frac{\exp(V_a)}{\exp(V_a)+\exp(V_b)+\exp(V_c)} \cdot \frac{\exp(V_b)}{\exp(V_b)+\exp(V_c)}\cdot \frac{\exp(V_c)}{\exp(V_c)}$
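The three-alternative example can be sketched as a sequence of choices among the alternatives not yet ranked. The index values in `V` are hypothetical, and `ranking_prob` is an illustrative helper name.

```python
import math
from itertools import permutations


def ranking_prob(indices_in_rank_order):
    """Probability of a full ranking, given indices listed from 1st to last.

    At each step the top remaining alternative is chosen with a logit
    probability over the alternatives that are still unranked.
    """
    prob = 1.0
    for j in range(len(indices_in_rank_order) - 1):
        remaining = indices_in_rank_order[j:]
        denom = sum(math.exp(v) for v in remaining)
        prob *= math.exp(indices_in_rank_order[j]) / denom
    return prob


# Hypothetical indices of alternatives a, b and c.
V = {"a": 1.0, "b": 0.5, "c": 0.0}

p_abc = ranking_prob([V["a"], V["b"], V["c"]])
print(p_abc)

# The probabilities of all 3! = 6 possible rankings sum to 1.
total = sum(ranking_prob([V[name] for name in order]) for order in permutations(V))
print(total)
```

The last factor, for the final remaining alternative, is always 1, which is why the loop stops one step before the end of the ranking.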

1. cumulative distribution function

2. the derivative of the probability with respect to an independent variable