Introduction to Statistics

dharmendra mishra
3 min read · Jun 3, 2019

Statistics can be defined as the collection and interpretation of data.

All around the world, we use statistics to measure variability. People have different heights, weights, hair colors, and food preferences. These things are all variables because they differ from one individual to another.

There are two kinds of statistics.

Inferential statistics: Inferential statistics deals with taking a sample and analyzing that sample to make judgments or claims about a population.

Descriptive statistics: Descriptive statistics refers to collecting data and summarizing it. When you hear a professor say something like “the average midterm score was 65%”, they are using descriptive statistics. We often use tools like histograms and other graphs to help us summarize and present the data.
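
To make the idea concrete, here is a minimal Python sketch of descriptive statistics. The scores are made up for illustration, and the helper names `describe` and `text_histogram` are my own, not part of any library. It computes a few summary numbers and prints a rough text histogram.

```python
import statistics
from collections import Counter

# Hypothetical midterm scores (percent); made up for illustration only.
midterm_scores = [55, 60, 65, 65, 70, 75]


def describe(scores):
    """Return a few common descriptive statistics for a list of numbers."""
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "min": min(scores),
        "max": max(scores),
    }


def text_histogram(scores, bin_width=10):
    """Build a simple text histogram: one '#' per score in each bin."""
    bins = Counter((s // bin_width) * bin_width for s in scores)
    return [
        f"{start}-{start + bin_width - 1}: {'#' * bins[start]}"
        for start in sorted(bins)
    ]


summary = describe(midterm_scores)
print(summary)
for line in text_histogram(midterm_scores):
    print(line)
```

With these particular scores the mean works out to 65, which matches the professor example above.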

In order to understand statistics, you’ll first have to know some basic definitions.

A population refers to the entire collection of “things” we want to study. I say “things” because a population can consist of almost anything: people, cats, vehicles, houses, and so on.

A sample refers to a small part of the population that is used for study, and the total number of things in a sample is called the sample size.
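
As a rough sketch of how a population, a sample, and a sample size relate, the Python snippet below draws a random sample from a simulated population. The heights are randomly generated for illustration (they are not real measurements), and the population size, sample size, and seed are arbitrary choices of mine.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 1,000 people, generated at random
# purely for illustration; these are not real measurements.
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(170, 10) for _ in range(1000)]

sample_size = 50
sample = rng.sample(population, sample_size)  # draw without replacement

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population size: {len(population)}")
print(f"sample size:     {len(sample)}")
print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}  (an estimate of the population mean)")
```

Using the sample mean to say something about the population mean is a small taste of inferential statistics. In practice we usually cannot measure the whole population, which is why we rely on the sample.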

What we examine in statistics is a variable. It is what we are studying, and it can be measured, counted, or categorized.

When we talked about how people can have different heights, weights, and hair colors, we were talking about variables. A variable represents a characteristic of what we are trying to study, and it can vary among different individuals.

When we measure a variable, the data can come in two different forms.

Quantitative Data: Data that is measured in numbers. It deals with numbers that it makes sense to perform arithmetic on, like calculating an average. Quantitative data comes from quantitative variables; examples include height, weight, and midterm score.

Categorical Data: Refers to values that place “things” into different groups or categories. Categorical data comes from categorical variables; examples include hair color, type of cat, and letter grade.

There are actually two types of categorical variables.

Categorical and ordinal: A variable is said to be categorical and ordinal if there is a logical ordering to its values. A good example of this would be letter grade: we can logically order the values of this categorical variable from high to low or from low to high.

Categorical and nominal: A variable is said to be categorical and nominal if there is no logical ordering to its values. An example of this would be hair color. Depending on the sample, we could have people with red hair, blond hair, brown hair, or even blue hair. Although we can arrange these values in alphabetical order, there is no logical ordering with respect to the values themselves.
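
The difference between ordinal and nominal categories shows up in what we can sensibly do with them. Here is a small Python sketch with made-up observations. The low-to-high grade order in `GRADE_ORDER` is an assumption I am supplying, since a list of strings knows nothing about that order on its own.

```python
from collections import Counter

# Hypothetical observations, made up for illustration.
letter_grades = ["B", "A", "C", "B", "F", "A", "D"]
hair_colors = ["brown", "red", "blond", "brown", "blue"]

# Ordinal: we can state a meaningful order for the categories.
GRADE_ORDER = ["F", "D", "C", "B", "A"]  # low to high (assumed scale)
grade_rank = {grade: rank for rank, grade in enumerate(GRADE_ORDER)}

grades_low_to_high = sorted(letter_grades, key=grade_rank.get)
highest_grade = max(letter_grades, key=grade_rank.get)

# Nominal: no meaningful order, so we only count how often each value occurs.
hair_color_counts = Counter(hair_colors)
most_common_color, most_common_count = hair_color_counts.most_common(1)[0]

print(grades_low_to_high)
print(highest_grade)
print(hair_color_counts)
print(most_common_color, most_common_count)
```

Asking for the “highest” grade makes sense, while asking for the “highest” hair color does not. For nominal data, counting how often each value occurs is about as far as we can go.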

There are also two types of quantitative variables.

Discrete Variables: Discrete variables refer to variables that can only take certain values. An example of this is the number of pets you own: you can own zero pets, one pet, two pets, or even three pets, but it’s impossible to own 2.7 pets.

Continuous Variables: In contrast, continuous variables refer to variables that can take on any numerical value within a range. An example of this would be weight: someone can weigh 105 pounds, 185 pounds, or even 170.683 pounds. We can measure this variable to as many decimal places as we want, which is why it is classified as a continuous variable.
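
Here is a small Python sketch of the same distinction, using the pet and weight examples from above. The helper `is_valid_pet_count` is a hypothetical name of mine, and treating “discrete” as “non-negative whole number” is an assumption that fits counts like this one rather than a general definition.

```python
# Values taken from the examples above.
pets_owned = [0, 1, 2, 3]              # discrete: counts, whole numbers only
weights_lb = [105.0, 185.0, 170.683]   # continuous: any value within a range


def is_valid_pet_count(value):
    """A pet count must be a non-negative whole number."""
    return float(value).is_integer() and value >= 0


print([is_valid_pet_count(v) for v in pets_owned])
print(is_valid_pet_count(2.7))

# A continuous measurement can be reported at whatever precision the
# measuring instrument allows.
weight = weights_lb[2]
for decimals in range(4):
    print(round(weight, decimals))
```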

So to recap:

A population refers to the entire collection of things we want to study. A sample refers to the small part of the population that we examine and extract information from. The total number of things in a sample is called the sample size. What we measure from each individual is the variable of interest. The way we measure this variable tells us whether it is quantitative or categorical. For example, if our variable of interest was midterm scores for statistics, we would have quantitative data if we measured each individual test score. If instead we decided to place people into categories based on letter grade, we would be working with categorical data.
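
The midterm example from the recap can be sketched in a few lines of Python. The scores and the grade cutoffs (90/80/70/60) are assumptions made up for illustration, and grading scales differ from course to course.

```python
import statistics
from collections import Counter

# Hypothetical midterm scores, made up for illustration.
scores = [92, 85, 77, 65, 58, 71, 88]


def to_letter_grade(score):
    """Map a numeric score to a letter grade using assumed cutoffs."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


# Quantitative view: arithmetic such as an average makes sense.
average_score = statistics.mean(scores)

# Categorical view: we place each individual into a group and count the groups.
grades = [to_letter_grade(s) for s in scores]
grade_counts = Counter(grades)

print(f"average score: {average_score:.1f}")
print(grades)
print(grade_counts)
```

The same individuals give quantitative data when we keep their scores and categorical data when we keep only their letter grades.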

dharmendra mishra

Data-driven analytics/engineering leader with 12+ years of experience at a digital advertising company. Skills: SQL, Python, Excel, and GCP (Google Cloud Platform).