Encoding Categorical Variables

Label Encoding and One Hot Encoding

Swati Sinha
WiCDS

Photo by Bailey Granneman on Unsplash

Exploratory data analysis (EDA) is an important step in any data science project. A common question during EDA is how to handle categorical variables. Most ML algorithms expect numerical input, so object or categorical variables must be converted into a numeric form that the algorithm can understand.

We will discuss two of the most widely used techniques for converting categorical variables:

1. Label Encoding

2. One-hot Encoding
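
As a quick preview of what the two techniques produce, here is a minimal sketch using pandas. The `Package` values are hypothetical and chosen only for illustration, and pandas is an assumption here; other libraries such as scikit-learn offer equivalent encoders.

```python
import pandas as pd

# Hypothetical sample values; not taken from the article's dataset.
package = pd.Series(["Gold", "Silver", "Gold", "Platinum"], name="Package")

# Label encoding: each distinct category is mapped to one integer.
# Categories are sorted alphabetically here, so Gold=0, Platinum=1, Silver=2.
label_encoded = package.astype("category").cat.codes

# One-hot encoding: one 0/1 indicator column per distinct category.
one_hot = pd.get_dummies(package, dtype=int)

print(label_encoded.tolist())
print(one_hot)
```

Label encoding keeps a single column but implies an ordering between categories, while one-hot encoding avoids that ordering at the cost of one extra column per category.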

Let’s explore both using a Passenger dataset.

This dataset has the columns Name, Gender, Age, Package, TicketCost and Destination.

The attributes Name, Gender, Package and Destination have the object data type, i.e. they are categorical.
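
One way to confirm which columns are categorical is to inspect the data types in pandas. The sketch below is only illustrative: the column names come from the description above, but every row value is made up, and the variable names `passengers` and `categorical_cols` are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the article.
passengers = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "Gender": ["F", "M", "M"],
    "Age": [34, 28, 45],
    "Package": ["Gold", "Silver", "Gold"],
    "TicketCost": [1200.0, 800.0, 1150.0],
    "Destination": ["Paris", "Rome", "Paris"],
})

# Show the data type of every column.
print(passengers.dtypes)

# Keep only the non-numeric columns, i.e. the candidates for encoding.
categorical_cols = passengers.select_dtypes(exclude="number").columns.tolist()
print(categorical_cols)
```

Excluding numeric types, rather than selecting the object type directly, is a deliberate choice in this sketch so that it also picks up text columns stored with a dedicated string data type.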

Label Encoding


Published in WiCDS

A collaborative community for Women in Data Science and Programming to learn and grow
