ABOUT ME

This is a space where I share my journey toward becoming a data scientist. If you have a similar goal, I hope what I post here helps you, even a little :)

  • Categorical response variable
    Inferential Statistics from Amsterdam 2021. 6. 22. 09:43

    In this post, I'm going to talk about categorical response variables.

     

    1. Categorical response variable : Until now, I have looked at regression models that describe or predict a quantitative response variable, but there are also regression models for ordinal and nominal response variables.

     1-1) Logistic function : The logistic function has a sigmoid shape, or S-curve. It produces estimated values that lie between zero and one, which means its output can be interpreted as a probability.

     1-2) The equation of the logistic function : Here is what the logistic function looks like, along with the equation that describes it.
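
    For a single predictor x, the logistic function is usually written as below. The symbols x and y are my own notation here, with y = 1 standing for the outcome of interest.

```latex
P(y = 1) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}
```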

    I wrote the equation at the population level. If you want to use it at the sample level, replace alpha and beta with a and b. I also drew two cases of the logistic function: the first applies when beta is positive (the curve rises as the predictor increases), and the second when beta is less than 0 (the curve falls).

    The steepness of the graph depends on the absolute value of beta. When the absolute value is small, the curve changes gradually; when it is large, the curve changes sharply around the inflection point.
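
    To see the effect of the sign and size of beta, here is a minimal Python sketch. The function name logistic, the x values, and the alpha and beta values are all made up for illustration. It uses 1 / (1 + e^-(alpha + beta x)), which is algebraically the same as e^(alpha + beta x) / (1 + e^(alpha + beta x)).

```python
import math


def logistic(x, alpha, beta):
    """Return P(y = 1) for predictor value x under the logistic model."""
    z = alpha + beta * x
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative predictor values (not from any real data set).
xs = [-4, -2, 0, 2, 4]

# Positive beta: the probability rises with x. Negative beta: it falls.
rising = [logistic(x, alpha=0.0, beta=1.0) for x in xs]
falling = [logistic(x, alpha=0.0, beta=-1.0) for x in xs]

# Larger absolute value of beta: steeper change around the midpoint.
gentle = [logistic(x, alpha=0.0, beta=0.5) for x in xs]
steep = [logistic(x, alpha=0.0, beta=3.0) for x in xs]

for name, values in [("rising", rising), ("falling", falling),
                     ("gentle", gentle), ("steep", steep)]:
    print(name, [round(v, 3) for v in values])
```

    Every printed value stays strictly between 0 and 1, which is why the output can be read as a probability.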

    And this equation gives the inflection point of the graph.
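
    Assuming the usual form of the logistic function with a single predictor x and a nonzero beta, the inflection point is where the probability equals 0.5, which happens when the exponent is zero:

```latex
P(y = 1) = 0.5
\;\Longleftrightarrow\; \alpha + \beta x = 0
\;\Longleftrightarrow\; x = -\frac{\alpha}{\beta}
```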

     1-3) How to turn the logistic curve into a straight line : The logistic function is an ideal function for describing a categorical response variable. It has some nice features, and by transforming it (taking the log of the odds) you can turn it into a straight line, which gives you an intercept and a regression coefficient just as in linear regression.
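
    The transformation can be sketched in a few steps. Here P is shorthand for P(y = 1) and x is a single predictor, both my own notation:

```latex
\begin{aligned}
P &= \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}} \\
1 - P &= \frac{1}{1 + e^{\alpha + \beta x}} \\
\frac{P}{1 - P} &= e^{\alpha + \beta x} \\
\ln\!\left(\frac{P}{1 - P}\right) &= \alpha + \beta x
\end{aligned}
```

    The last line is a straight line in x, with intercept alpha and slope beta.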

    ** The odds are the probability of an outcome (pass, or true) divided by the probability of that outcome not occurring (fail, or false).

    ** The estimated regression coefficient on the log-odds scale represents the linear change in the log odds for a one-unit change in the predictor.

    ** The regression coefficient expressed in terms of odds (the exponentiated log-odds coefficient) is also called an odds ratio.
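
    The relationship between odds, log odds, and the odds ratio can be checked numerically. This is only a sketch: the sample estimates a and b and the helper names odds and prob are hypothetical.

```python
import math

# Hypothetical sample estimates of the intercept and regression coefficient.
a, b = -3.0, 0.8


def odds(p):
    """Odds: probability of the outcome divided by probability of no outcome."""
    return p / (1.0 - p)


def prob(x):
    """Predicted probability from the logistic function at predictor value x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


# Compare the odds at x = 2 and x = 3 (a one-unit change in the predictor).
odds_at_2 = odds(prob(2))
odds_at_3 = odds(prob(3))

# The log odds change by b, so the odds are multiplied by e^b.
log_odds_change = math.log(odds_at_3) - math.log(odds_at_2)
odds_ratio = odds_at_3 / odds_at_2

print("change in log odds:", round(log_odds_change, 4))
print("odds ratio:", round(odds_ratio, 4), "e^b:", round(math.exp(b), 4))
```

    The change in log odds matches b, and the odds ratio matches e^b, whichever pair of x values one unit apart you choose.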

    ** The percentage of rejected videos that were correctly predicted to be rejected is called the specificity of the model.

    ** The percentage of selected videos that were correctly predicted to be selected is called the sensitivity of the model.
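
    Both measures can be computed by counting correct and incorrect predictions. The labels below are made up (1 = selected, 0 = rejected), and the function name sensitivity_specificity is my own.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for 0/1 labels of equal length."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # selected, predicted selected
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # selected, predicted rejected
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # rejected, predicted rejected
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # rejected, predicted selected
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Made-up example: 4 selected videos and 6 rejected videos.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(actual, predicted)
print("sensitivity:", round(sens, 3))
print("specificity:", round(spec, 3))
```

    In this example 3 of the 4 selected videos are predicted correctly, and 4 of the 6 rejected videos are predicted correctly.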
