Python Lists and Dictionaries

Introduction

When working with data in Python, particularly with pandas, understanding two core data structures is essential: lists and dictionaries. These structures form the foundation of how data is organized, accessed, and manipulated in Python.

In this tutorial, we'll explore both data structures in depth, understand their properties, and see how they relate to pandas operations. By mastering lists and dictionaries, you'll develop a solid foundation for working with more complex data structures in pandas.

Python Lists

What is a List?

A list in Python is an ordered collection of items that can be of any data type. Lists are mutable (can be modified after creation) and are defined using square brackets [].

Creating Lists

python
# Creating a simple list
numbers = [1, 2, 3, 4, 5]
print(numbers)

# List with mixed data types
mixed_list = [1, "hello", 3.14, True]
print(mixed_list)

# Empty list
empty_list = []
print(empty_list)

Output:

[1, 2, 3, 4, 5]
[1, 'hello', 3.14, True]
[]
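
Because lists are mutable, it helps to see what "modified after creation" means in practice. The following is a minimal sketch; the names `scores`, `alias`, and `snapshot` are illustrative and not part of the original examples.

```python
# Lists are mutable: item assignment changes the list in place
scores = [70, 80, 90]
scores[1] = 85
print(scores)    # [70, 85, 90]

# Plain assignment does not copy; both names refer to the same list
alias = scores
alias.append(100)
print(scores)    # [70, 85, 90, 100]

# copy() creates a new, independent (shallow) list
snapshot = scores.copy()
snapshot.append(0)
print(scores)    # [70, 85, 90, 100]
print(snapshot)  # [70, 85, 90, 100, 0]
```

If you need to keep the original data untouched, work on a copy rather than on a second name for the same list.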

Accessing List Elements

Lists are indexed starting from 0 for the first element.

python
fruits = ["apple", "banana", "cherry", "date", "elderberry"]

# Accessing individual elements
print(fruits[0]) # First element
print(fruits[2]) # Third element

# Negative indexing (counting from the end)
print(fruits[-1]) # Last element
print(fruits[-2]) # Second-to-last element

# Slicing lists [start:end:step]
print(fruits[1:4]) # Elements from index 1 to 3
print(fruits[:3]) # Elements from beginning to index 2
print(fruits[2:]) # Elements from index 2 to the end
print(fruits[::2]) # Every second element

Output:

apple
cherry
elderberry
date
['banana', 'cherry', 'date']
['apple', 'banana', 'cherry']
['cherry', 'date', 'elderberry']
['apple', 'cherry', 'elderberry']
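
Indexing and slicing behave differently when you go past the end of a list. A short sketch of that difference, reusing the `fruits` list from above (the name `reversed_fruits` is illustrative):

```python
fruits = ["apple", "banana", "cherry", "date", "elderberry"]

# An index past the end raises IndexError
try:
    fruits[10]
except IndexError as error:
    print(f"IndexError: {error}")

# A slice past the end is simply truncated instead of raising
print(fruits[3:10])  # ['date', 'elderberry']
print(fruits[10:])   # []

# A negative step walks backwards; [::-1] returns a reversed copy
reversed_fruits = fruits[::-1]
print(reversed_fruits)
```

Slicing always returns a new list, so `fruits` itself is unchanged by any of these operations.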

List Methods and Operations

Python provides many built-in methods for working with lists:

python
fruits = ["apple", "banana", "cherry"]

# Adding elements
fruits.append("dragonfruit") # Add to the end
print(fruits)

fruits.insert(1, "blueberry") # Insert at specific position
print(fruits)

# Removing elements
fruits.remove("cherry") # Remove specific item
print(fruits)

popped_fruit = fruits.pop() # Remove last item and return it
print(f"Removed {popped_fruit}, remaining list: {fruits}")

# Other common operations
print(len(fruits)) # Length of the list
print("apple" in fruits) # Check if item exists
fruits.sort() # Sort the list in-place
print(fruits)

# Combining lists
more_fruits = ["fig", "grape"]
all_fruits = fruits + more_fruits
print(all_fruits)

Output:

['apple', 'banana', 'cherry', 'dragonfruit']
['apple', 'blueberry', 'banana', 'cherry', 'dragonfruit']
['apple', 'blueberry', 'banana', 'dragonfruit']
Removed dragonfruit, remaining list: ['apple', 'blueberry', 'banana']
3
True
['apple', 'banana', 'blueberry']
['apple', 'banana', 'blueberry', 'fig', 'grape']
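
One common stumbling block is the difference between the `sort()` method, which sorts in place, and the built-in `sorted()` function, which returns a new list. A small sketch with illustrative data:

```python
fruits = ["cherry", "fig", "apple"]

# sorted() returns a new list and leaves the original unchanged
sorted_copy = sorted(fruits)
print(sorted_copy)  # ['apple', 'cherry', 'fig']
print(fruits)       # ['cherry', 'fig', 'apple']

# sort() modifies the list in place and returns None
result = fruits.sort()
print(result)       # None
print(fruits)       # ['apple', 'cherry', 'fig']

# Both accept a key function, here sorting by string length
by_length = sorted(fruits, key=len)
print(by_length)    # ['fig', 'apple', 'cherry']
```

Writing `fruits = fruits.sort()` is therefore a bug: it replaces the list with `None`.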

List Comprehensions

List comprehensions provide a concise way to create lists:

python
# Create a list of squares
squares = [x**2 for x in range(1, 6)]
print(squares)

# List comprehension with condition
even_squares = [x**2 for x in range(1, 11) if x % 2 == 0]
print(even_squares)

Output:

[1, 4, 9, 16, 25]
[4, 16, 36, 64, 100]
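
To make the syntax less mysterious, here is a sketch that writes the conditional comprehension above as an ordinary loop and checks that both give the same result. The names `even_squares_loop`, `words`, and `upper_words` are illustrative.

```python
# The loop version: filter first, then transform and append
even_squares_loop = []
for x in range(1, 11):
    if x % 2 == 0:
        even_squares_loop.append(x**2)

# The equivalent list comprehension
even_squares = [x**2 for x in range(1, 11) if x % 2 == 0]
print(even_squares_loop == even_squares)  # True

# Comprehensions can also transform an existing list
words = ["apple", "banana", "cherry"]
upper_words = [word.upper() for word in words]
print(upper_words)  # ['APPLE', 'BANANA', 'CHERRY']
```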

Python Dictionaries

What is a Dictionary?

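A dictionary is a collection of data stored as key-value pairs. Since Python 3.7, dictionaries preserve the order in which keys were inserted, but values are looked up by key rather than by position. Dictionaries are mutable and are defined using curly braces {}.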

Creating Dictionaries

python
# Creating a simple dictionary
person = {
    "name": "John",
    "age": 30,
    "city": "New York"
}
print(person)

# Dictionary with mixed data types
mixed_dict = {
    "string_key": "value",
    42: "answer",
    "list_key": [1, 2, 3],
    "nested_dict": {"inner_key": "inner_value"}
}
print(mixed_dict)

# Empty dictionary
empty_dict = {}
print(empty_dict)

# Alternative way using dict()
person2 = dict(name="Alice", age=25, city="Boston")
print(person2)

Output:

{'name': 'John', 'age': 30, 'city': 'New York'}
{'string_key': 'value', 42: 'answer', 'list_key': [1, 2, 3], 'nested_dict': {'inner_key': 'inner_value'}}
{}
{'name': 'Alice', 'age': 25, 'city': 'Boston'}
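
Values can be of any type, but keys must be hashable, which in practice means immutable types such as strings, numbers, and tuples. A brief sketch of what that allows and forbids; the `coordinates` and `duplicate` dictionaries are made-up illustrations:

```python
# Tuples are hashable, so they can serve as keys
coordinates = {(40.7, -74.0): "New York", (42.4, -71.1): "Boston"}
print(coordinates[(40.7, -74.0)])  # New York

# Lists are mutable and therefore unhashable: using one as a key fails
try:
    invalid = {[1, 2]: "value"}
except TypeError as error:
    print(f"TypeError: {error}")

# Keys are unique: if a key is repeated, the last value wins
duplicate = {"a": 1, "a": 2}
print(duplicate)  # {'a': 2}
```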

Accessing Dictionary Elements

Dictionary elements are accessed using keys:

python
person = {
    "name": "John",
    "age": 30,
    "city": "New York"
}

# Accessing values using keys
print(person["name"])
print(person["age"])

# Using get() method (safer, returns None if key doesn't exist)
print(person.get("city"))
print(person.get("email")) # Key doesn't exist
print(person.get("email", "Not available")) # With default value

Output:

John
30
New York
None
Not available
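
The reason `get()` is described as safer is that square-bracket access raises a `KeyError` for a missing key. A short sketch of the alternatives; the nested `student` dictionary and the variable names are illustrative:

```python
person = {"name": "John", "age": 30, "city": "New York"}

# Square brackets raise KeyError when the key is missing
try:
    print(person["email"])
except KeyError as error:
    print(f"Missing key: {error}")

# Checking membership first is another option
has_email = "email" in person
print(has_email)  # False

# Nested dictionaries are accessed one level at a time
student = {"name": "Alice", "grades": {"Math": 95}}
math_grade = student["grades"]["Math"]
physics_grade = student.get("grades", {}).get("Physics", "n/a")
print(math_grade, physics_grade)  # 95 n/a
```

Use square brackets when a missing key is a genuine error, and `get()` when a missing key is expected and has a sensible default.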

Dictionary Methods and Operations

python
student = {
    "name": "Alice",
    "courses": ["Math", "Physics"],
    "grades": {
        "Math": 95,
        "Physics": 88
    }
}

# Adding or updating elements
student["age"] = 20
student["courses"].append("Chemistry")
print(student)

# Updating multiple key-value pairs
student.update({"age": 21, "semester": "Fall"})
print(student)

# Removing elements
removed_grades = student.pop("grades")
print(f"Removed: {removed_grades}")
print(student)

# Other common operations
print(len(student)) # Number of key-value pairs
print("name" in student) # Check if key exists
print(list(student.keys())) # Get all keys
print(list(student.values())) # Get all values
print(list(student.items())) # Get all key-value pairs

Output:

{'name': 'Alice', 'courses': ['Math', 'Physics', 'Chemistry'], 'grades': {'Math': 95, 'Physics': 88}, 'age': 20}
{'name': 'Alice', 'courses': ['Math', 'Physics', 'Chemistry'], 'grades': {'Math': 95, 'Physics': 88}, 'age': 21, 'semester': 'Fall'}
Removed: {'Math': 95, 'Physics': 88}
{'name': 'Alice', 'courses': ['Math', 'Physics', 'Chemistry'], 'age': 21, 'semester': 'Fall'}
4
True
['name', 'courses', 'age', 'semester']
['Alice', ['Math', 'Physics', 'Chemistry'], 21, 'Fall']
[('name', 'Alice'), ('courses', ['Math', 'Physics', 'Chemistry']), ('age', 21), ('semester', 'Fall')]
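
A few more operations come up often alongside the ones above: looping over `items()`, removing keys safely, and `setdefault()`. This is a small sketch on a simplified `student` dictionary, not an exhaustive list of dictionary methods.

```python
student = {"name": "Alice", "age": 21, "semester": "Fall"}

# Iterate over key-value pairs directly
for key, value in student.items():
    print(f"{key}: {value}")

# pop() with a default avoids KeyError for a missing key
removed = student.pop("email", None)
print(removed)  # None

# del removes a key without returning its value
del student["semester"]

# setdefault() returns the existing value, or inserts and returns the default
courses = student.setdefault("courses", [])
courses.append("Math")
print(student)  # {'name': 'Alice', 'age': 21, 'courses': ['Math']}
```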

Dictionary Comprehensions

Similar to list comprehensions, dictionary comprehensions provide a concise way to create dictionaries:

python
# Create a dictionary of squares
squares_dict = {x: x**2 for x in range(1, 6)}
print(squares_dict)

# Dictionary comprehension with condition
even_squares_dict = {x: x**2 for x in range(1, 11) if x % 2 == 0}
print(even_squares_dict)

Output:

{1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
{2: 4, 4: 16, 6: 36, 8: 64, 10: 100}
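
Dictionary comprehensions are also handy for building a dictionary from existing data. The sketch below pairs two lists with `zip()`, then inverts and filters the result; all names and values are illustrative.

```python
names = ["Alice", "Bob", "Charlie"]
ages = [25, 30, 35]

# Pair up two lists into a dictionary
age_by_name = {name: age for name, age in zip(names, ages)}
print(age_by_name)  # {'Alice': 25, 'Bob': 30, 'Charlie': 35}

# Swap keys and values (only safe when the values are unique)
name_by_age = {age: name for name, age in age_by_name.items()}
print(name_by_age)  # {25: 'Alice', 30: 'Bob', 35: 'Charlie'}

# Filter an existing dictionary
over_28 = {name: age for name, age in age_by_name.items() if age > 28}
print(over_28)  # {'Bob': 30, 'Charlie': 35}
```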

Connection to Pandas

Understanding lists and dictionaries is crucial when working with pandas because:

  1. Series creation: A pandas Series can be created from a Python list
  2. DataFrame creation: A pandas DataFrame can be created from:
    • A dictionary of lists (each list becomes a column)
    • A list of dictionaries (each dictionary becomes a row)

Let's see examples of each:

python
import pandas as pd

# Creating a Series from a list
numbers = [10, 20, 30, 40, 50]
series = pd.Series(numbers)
print("Series from list:")
print(series)
print()

# Creating a DataFrame from a dictionary of lists
data_dict = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Boston"]
}
df1 = pd.DataFrame(data_dict)
print("DataFrame from dictionary of lists:")
print(df1)
print()

# Creating a DataFrame from a list of dictionaries
data_list = [
    {"Name": "Alice", "Age": 25, "City": "New York"},
    {"Name": "Bob", "Age": 30, "City": "San Francisco"},
    {"Name": "Charlie", "Age": 35, "City": "Boston"}
]
df2 = pd.DataFrame(data_list)
print("DataFrame from list of dictionaries:")
print(df2)

Output:

Series from list:
0    10
1    20
2    30
3    40
4    50
dtype: int64

DataFrame from dictionary of lists:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35         Boston

DataFrame from list of dictionaries:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35         Boston
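
The two DataFrames are identical because a dictionary of lists (column-oriented) and a list of dictionaries (row-oriented) are two layouts of the same table. The following plain-Python sketch converts between them without pandas; the names `rows` and `columns` are illustrative.

```python
data_dict = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "San Francisco", "Boston"],
}

# Dictionary of lists -> list of dictionaries (one dictionary per row)
rows = [dict(zip(data_dict.keys(), values)) for values in zip(*data_dict.values())]
print(rows[0])  # {'Name': 'Alice', 'Age': 25, 'City': 'New York'}

# List of dictionaries -> dictionary of lists (one list per column)
columns = {key: [row[key] for row in rows] for key in rows[0]}
print(columns == data_dict)  # True
```

Which layout is more convenient usually depends on how the data arrives: records read one at a time tend to be row-oriented, while data collected per field tends to be column-oriented.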

Real-World Example: Data Analysis

Let's see how lists and dictionaries can be used in a simple data analysis task:

python
# Sales data for a small business
sales_data = [
    {"date": "2023-01-01", "product": "Widget", "units": 100, "price": 10.0},
    {"date": "2023-01-01", "product": "Gadget", "units": 50, "price": 20.0},
    {"date": "2023-01-02", "product": "Widget", "units": 120, "price": 10.0},
    {"date": "2023-01-02", "product": "Gadget", "units": 45, "price": 20.0},
    {"date": "2023-01-03", "product": "Widget", "units": 90, "price": 10.0},
    {"date": "2023-01-03", "product": "Gadget", "units": 60, "price": 20.0},
]

# Calculate total revenue per day
daily_revenue = {}

for sale in sales_data:
    date = sale["date"]
    revenue = sale["units"] * sale["price"]

    # Add to the running total, or start one for a new date
    if date in daily_revenue:
        daily_revenue[date] += revenue
    else:
        daily_revenue[date] = revenue

print("Daily Revenue:")
for date, revenue in daily_revenue.items():
    print(f"{date}: ${revenue:.2f}")

# Calculate total units sold per product
product_units = {}

for sale in sales_data:
    product = sale["product"]
    units = sale["units"]

    # Add to the running total, or start one for a new product
    if product in product_units:
        product_units[product] += units
    else:
        product_units[product] = units

print("\nTotal Units Sold:")
for product, units in product_units.items():
    print(f"{product}: {units} units")

# Convert to pandas for further analysis
import pandas as pd

# Create DataFrame from sales data
sales_df = pd.DataFrame(sales_data)

# Calculate revenue for each record
sales_df["revenue"] = sales_df["units"] * sales_df["price"]

print("\nSales DataFrame:")
print(sales_df)

print("\nSummary by Product:")
# Select the numeric columns before summing so the date strings are not aggregated
print(sales_df.groupby("product")[["units", "revenue"]].sum())

Output:

Daily Revenue:
2023-01-01: $2000.00
2023-01-02: $2100.00
2023-01-03: $2100.00

Total Units Sold:
Widget: 310 units
Gadget: 155 units

Sales DataFrame:
         date  product  units  price  revenue
0  2023-01-01   Widget    100   10.0   1000.0
1  2023-01-01   Gadget     50   20.0   1000.0
2  2023-01-02   Widget    120   10.0   1200.0
3  2023-01-02   Gadget     45   20.0    900.0
4  2023-01-03   Widget     90   10.0    900.0
5  2023-01-03   Gadget     60   20.0   1200.0

Summary by Product:
         units  revenue
product
Gadget     155   3100.0
Widget     310   3100.0
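
The "if key in dictionary, else" pattern used in the loops above is common enough that the standard library offers a shortcut. As one possible alternative (not the approach used in this tutorial's code), `collections.defaultdict` supplies a starting value for missing keys automatically. The sketch repeats the same sales records so that it runs on its own.

```python
from collections import defaultdict

sales_data = [
    {"date": "2023-01-01", "product": "Widget", "units": 100, "price": 10.0},
    {"date": "2023-01-01", "product": "Gadget", "units": 50, "price": 20.0},
    {"date": "2023-01-02", "product": "Widget", "units": 120, "price": 10.0},
    {"date": "2023-01-02", "product": "Gadget", "units": 45, "price": 20.0},
    {"date": "2023-01-03", "product": "Widget", "units": 90, "price": 10.0},
    {"date": "2023-01-03", "product": "Gadget", "units": 60, "price": 20.0},
]

# Missing keys start at 0.0 (float) or 0 (int), so no membership check is needed
daily_revenue = defaultdict(float)
product_units = defaultdict(int)

for sale in sales_data:
    daily_revenue[sale["date"]] += sale["units"] * sale["price"]
    product_units[sale["product"]] += sale["units"]

print(dict(daily_revenue))
print(dict(product_units))  # {'Widget': 310, 'Gadget': 155}
```

The totals match the ones computed with plain dictionaries; the trade-off is that reading a missing key from a `defaultdict` silently creates it, which is not always what you want.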

Summary

In this tutorial, we covered:

  • Lists: ordered, mutable collections that can contain elements of different data types
  • Dictionaries: mutable collections of key-value pairs that preserve insertion order (Python 3.7+) and allow fast lookup by key
  • List and Dictionary Operations: creating, accessing, modifying, and using built-in methods
  • List and Dictionary Comprehensions: concise ways to create these data structures
  • Connection to Pandas: how lists and dictionaries form the foundation for pandas Series and DataFrames
  • Real-World Example: using lists and dictionaries for basic data analysis tasks

Both lists and dictionaries are fundamental Python data structures that you'll use extensively when working with pandas. Lists provide ordered sequences of elements, while dictionaries offer fast lookup by key. Understanding these core structures will make working with pandas more intuitive and effective.

Exercises

  1. Create a list of 10 numbers and use a list comprehension to generate a new list containing only the even numbers.
  2. Create a dictionary that maps country names to their capitals. Write code to print all countries and their capitals.
  3. Given a list of dictionaries representing students (with keys 'name', 'age', 'grade'), write code to find the average grade.
  4. Convert a dictionary of student grades ({"Alice": 95, "Bob": 82, "Charlie": 88}) to a pandas Series.
  5. Create a pandas DataFrame from a dictionary of lists where each list represents student exam scores for different subjects.

Happy coding, and enjoy exploring Python's powerful data structures!