Before diving into complex tasks like natural language processing (NLP) or interacting with APIs, we need to understand the basics of how Python handles data. At the core of any programming language are variables and data types. Variables allow us to store data, and data types define the kind of data we are working with (like numbers, text, or booleans). In Python, we do not need to explicitly declare the type of a variable – Python figures it out based on what we assign to it.
Let’s take a look at the most important data types in Python, and how we can use them.
Variables
In Python, variables are used to store information that can be referenced or manipulated later. You create a variable by simply assigning a value to it with the = operator. Once defined, you can use the variable by referring to its name.
Example:
# Assigning values to variablesmy_number =10# an integermy_float =3.14# a floating-point numbermy_text ="Hello, GPT!"# a string of textis_valid =True# a boolean (True/False)# Printing the variablesprint(my_number) # Output: 10print(my_float) # Output: 3.14print(my_text) # Output: Hello, GPT!print(is_valid) # Output: True
10
3.14
Hello, GPT!
True
In this example:
my_number is storing an integer value (whole number).
my_float is storing a floating-point number (a number with decimal points).
my_text stores a string, which is simply a sequence of characters.
is_valid is a boolean, which can either be True or False.
Data Types
Python recognizes several different data types, but let’s focus on the ones you’ll use the most often for basic NLP tasks:
Integers (int): Whole numbers, like -1, 0, and 100.
Floats (float): Numbers with decimal points, like 3.14 or -0.001.
Strings (str): Sequences of characters, like "OpenAI" or "Hello World".
Booleans (bool): True or False values, which are often used in conditional statements.
Working with Numbers (Integers and Floats)
Numbers are straightforward to work with in Python. You can perform basic arithmetic like addition, subtraction, multiplication, and division. The type of number (integer or float) depends on whether it has decimal points.
Example:
# Basic arithmetic operationsa =10b =4sum_result = a + b # Additiondifference = a - b # Subtractionproduct = a * b # Multiplicationquotient = a / b # Division (always returns a float)print(sum_result) # Output: 14print(difference) # Output: 6print(product) # Output: 40print(quotient) # Output: 2.5
14
6
40
2.5
In this example, a and b are integers, and we perform basic arithmetic with them. Notice that even though a and b are integers, the result of division is a float (2.5).
Working with Strings
Strings are used to represent text in Python. Strings can be created using single (') or double (") quotes. Python provides a wide range of operations you can perform on strings, like concatenation (joining two strings) or accessing individual characters.
Example:
# Concatenating stringsfirst_name ="John"last_name ="Doe"full_name = first_name +" "+ last_name # Concatenating with a space in betweenprint(full_name) # Output: John Doe# String interpolation using f-stringsage =25print(f"My name is {full_name} and I am {age} years old.")# Output: My name is John Doe and I am 25 years old.
John Doe
My name is John Doe and I am 25 years old.
In the example above: - We concatenate the first_name and last_name strings to form the full_name. - We also use f-strings (a modern and convenient way to insert variables into a string) to print a message with the person’s name and age.
Booleans
Booleans represent truth values (True or False) and are often used to control the flow of programs. They commonly result from comparison operations (e.g., checking if one number is greater than another).
Example:
# Boolean valuesis_student =Trueis_teacher =False# Comparison operationsage =18is_adult = age >=18# True if age is greater than or equal to 18print(is_adult) # Output: True
True
Here, we create two boolean variables (is_student and is_teacher) and then use a comparison (>=) to check if age qualifies as an adult (the result is stored in is_adult).
Checking Data Types
Python allows you to check the type of a variable using the type() function. This is useful when you want to confirm what kind of data a variable is holding.
This will print the type of each variable, helping you understand what type of data each one holds.
Variables and data types are the building blocks of any program. In Python, variables can store different types of data, such as integers, floating-point numbers, strings, and booleans. We’ve seen how to create and manipulate these data types, and how to perform simple operations on them. These concepts will come in handy as we start working with more complex tasks like text processing.
Next, we’ll explore how to manipulate strings further, as handling text is a core part of natural language processing.