Is Python good for beginners in 2026?

Python remains a top recommendation for beginners: it has clean syntax, extensive documentation, and an interactive REPL, and it is used professionally in AI, DevOps, web development, and data science.

What can I build with Python?

Web APIs (FastAPI, Flask, Django), automation scripts, data analysis (pandas), machine learning models, DevOps tooling (boto3, Docker SDK), web scraping, and cloud automation.

Python vs JavaScript — which should I learn first?

Choose Python if you are heading toward data, AI, or DevOps, and JavaScript if you are heading toward web development. Python's syntax is generally easier for beginners to read. Both are excellent choices with strong job markets in 2026.

How long does it take to learn Python?

Basic Python (variables, loops, functions, files) takes 4-6 weeks with daily practice. Building real projects confidently takes 3-6 months. Specialising in one area (DevOps, data, web) takes 6-12 months.

What Python version should I use?

Use a currently supported Python 3 release (3.11 or 3.12 when this was written). Python 2 reached end-of-life in January 2020. Use pyenv to manage multiple Python versions, and create a virtual environment for each project.

Python Full Tutorial -- Part 6: Data Structures

By Suraj Ahir 2025-12-21 11 min read

Knowing Python data structures superficially -- that lists exist, that dicts exist -- is enough to write beginner code. Using them well -- choosing a set for fast lookups, using Counter to count occurrences, knowing when to use deque instead of list -- is what professional Python looks like. This part covers the patterns you see in real codebases.

Advanced List Techniques

Sorting, filtering, and manipulating lists

people = [
    {"name": "Suraj", "age": 25, "city": "Mumbai"},
    {"name": "Priya", "age": 28, "city": "Bangalore"},
    {"name": "Raj",   "age": 22, "city": "Delhi"}
]

# Sort by age
by_age = sorted(people, key=lambda p: p["age"])
print([p["name"] for p in by_age])  # ['Raj', 'Suraj', 'Priya']

# Sort by multiple fields
by_city_age = sorted(people, key=lambda p: (p["city"], p["age"]))

# Filter
mumbai_people = [p for p in people if p["city"] == "Mumbai"]

# Remove duplicates preserving order
# seen.add(x) returns None, so the condition is True only the first time x appears
nums = [3, 1, 2, 1, 3, 4]
seen = set()
unique = [x for x in nums if not (x in seen or seen.add(x))]
print(unique)  # [3, 1, 2, 4]
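
The sorts above are all ascending. As a minimal sketch of two common variations, the block below sorts in descending order with reverse=True and uses operator.itemgetter as an alternative to a lambda. The people list is repeated so the block runs on its own.

```python
from operator import itemgetter

people = [
    {"name": "Suraj", "age": 25, "city": "Mumbai"},
    {"name": "Priya", "age": 28, "city": "Bangalore"},
    {"name": "Raj",   "age": 22, "city": "Delhi"},
]

# Oldest first: reverse=True flips the ascending order
oldest_first = sorted(people, key=itemgetter("age"), reverse=True)
print([p["name"] for p in oldest_first])  # ['Priya', 'Suraj', 'Raj']

# itemgetter with several keys builds the same tuple a lambda would
by_city_age = sorted(people, key=itemgetter("city", "age"))
print([p["city"] for p in by_city_age])  # ['Bangalore', 'Delhi', 'Mumbai']
```

Whether to use itemgetter or a lambda is mostly a matter of taste; both produce the same ordering here.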

Dictionary Techniques

Advanced dictionary patterns

from collections import defaultdict, Counter

# Count word frequencies
words = "the cat sat on the mat the cat".split()
counter = Counter(words)
print(counter.most_common(3))  # [('the', 3), ('cat', 2), ('sat', 1)]

# defaultdict -- no KeyError for missing keys
scores = defaultdict(list)
scores["math"].append(90)
scores["math"].append(85)
scores["science"].append(92)

# Merge dicts (Python 3.9+)
defaults = {"timeout": 30, "debug": False, "retries": 3}
overrides = {"timeout": 60, "debug": True}
config = defaults | overrides
print(config)  # {'timeout': 60, 'debug': True, 'retries': 3}

Sets for Fast Lookups

Set operations

required = {"Python", "Docker", "Kubernetes"}
have     = {"Python", "Docker", "Linux"}

missing  = required - have   # {'Kubernetes'}
common   = required & have   # {'Python', 'Docker'} (set order may vary)
all_s    = required | have   # All unique skills from both sets

# O(1) membership testing -- use set not list
valid_domains = {"gmail.com", "yahoo.com", "outlook.com"}
if "gmail.com" in valid_domains:
    print("Valid domain")
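
Beyond difference, intersection, and union, sets also answer "is everything here also there?" directly. The sketch below reuses the required and have sets from above; the extra literals such as {"Go", "Rust"} are made up for illustration.

```python
required = {"Python", "Docker", "Kubernetes"}
have = {"Python", "Docker", "Linux"}

# True only if every required skill is present (same as required.issubset(have))
print(required <= have)    # False
print({"Python"} <= have)  # True

# Items that appear in exactly one of the two sets
only_one_side = required ^ have
print(sorted(only_one_side))  # ['Kubernetes', 'Linux']

# isdisjoint: True when the sets share nothing
print(required.isdisjoint({"Go", "Rust"}))  # True
```

sorted() is used only to make the printed order predictable, since sets are unordered.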

collections Module

Specialised data structures

from collections import deque, namedtuple

# deque: efficient queue (O(1) both ends)
queue = deque()
queue.append("first")
queue.append("second")
print(queue.popleft())  # first

# Stack with deque
stack = deque()
stack.append("page1")
stack.append("page2")
print(stack.pop())  # page2 (LIFO)

# namedtuple: tuple with named fields
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)  # 3 4
print(p)         # Point(x=3, y=4)
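
Two further features of these types come up often in practice: a bounded deque for "keep the last N items" and the helper methods on namedtuple. This is a small sketch; the page names are hypothetical sample data.

```python
from collections import deque, namedtuple

# deque with maxlen keeps only the most recent items
recent = deque(maxlen=3)
for page in ["home", "docs", "blog", "about"]:
    recent.append(page)  # "home" is dropped when "about" arrives
print(list(recent))  # ['docs', 'blog', 'about']

# namedtuples are immutable: _replace returns a new instance
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
moved = p._replace(x=10)
print(moved)        # Point(x=10, y=4)
print(p._asdict())  # {'x': 3, 'y': 4} (a plain dict on Python 3.8+)
```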

Frequently Asked Questions

When to use list vs set?

List when order matters or you need index access. Set when you need unique values or fast O(1) membership testing. Converting a list to set removes duplicates in O(n).
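
One caveat with set-based deduplication is that a set does not promise any particular order. As a rough sketch, the block below contrasts set() with dict.fromkeys(), which keeps first-seen order because dict keys preserve insertion order (guaranteed since Python 3.7).

```python
nums = [3, 1, 2, 1, 3, 4]

# set() removes duplicates but does not promise any particular order
as_set = set(nums)
print(sorted(as_set))  # [1, 2, 3, 4]

# dict keys are unique and keep insertion order
ordered_unique = list(dict.fromkeys(nums))
print(ordered_unique)  # [3, 1, 2, 4]
```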

How to sort a list of dicts?

Use sorted(people, key=lambda p: p["age"]). For descending order, add reverse=True. To sort by multiple keys, return a tuple: key=lambda p: (p["city"], p["age"]).

What is Counter?

Counter from collections automatically counts occurrences of each item. Counter(["a","b","a","c","a"]) gives Counter({'a': 3, 'b': 1, 'c': 1}). Use .most_common(n) to get the top n items.
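
Counter objects also support arithmetic and return 0 for keys they have never seen. The sketch below combines two counters; the error names and counts are invented sample data.

```python
from collections import Counter

errors_monday = Counter({"timeout": 4, "auth": 1})
errors_tuesday = Counter(["timeout", "disk", "disk"])

# Adding counters sums the counts per key
total = errors_monday + errors_tuesday
print(total["timeout"])  # 5
print(total["disk"])     # 2

# Missing keys return 0 instead of raising KeyError
print(total["network"])  # 0

# update() adds counts in place
total.update(["auth", "auth"])
print(total.most_common(1))  # [('timeout', 5)]
```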

How do I merge two dicts?

Python 3.9+: merged = dict1 | dict2. Python 3.5+: {**dict1, **dict2}. Right-side values win for duplicate keys.
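
A short sketch of both merge forms, plus the in-place |= variant, assuming Python 3.9 or newer. The dict names and values are illustrative.

```python
defaults = {"timeout": 30, "retries": 3}
overrides = {"timeout": 60}

# Both forms build a new dict; right-hand values win
merged_pipe = defaults | overrides         # Python 3.9+
merged_unpack = {**defaults, **overrides}  # Python 3.5+
print(merged_pipe)                   # {'timeout': 60, 'retries': 3}
print(merged_pipe == merged_unpack)  # True

# The inputs are left untouched
print(defaults["timeout"])  # 30

# |= updates the left-hand dict in place (Python 3.9+)
settings = dict(defaults)
settings |= overrides
print(settings["timeout"])  # 60
```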

What is defaultdict?

defaultdict from collections provides a default value for missing keys instead of raising KeyError. defaultdict(list) creates an empty list for any new key, making it perfect for grouping data.
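
The grouping use case can be sketched as follows. The (city, name) records are hypothetical sample data, and the last lines show a caveat worth knowing: reading a missing key also creates it.

```python
from collections import defaultdict

# Hypothetical sample data: (city, name) pairs
records = [
    ("Mumbai", "Suraj"),
    ("Delhi", "Raj"),
    ("Mumbai", "Anita"),
]

by_city = defaultdict(list)
for city, name in records:
    by_city[city].append(name)  # new keys start as an empty list

print(dict(by_city))  # {'Mumbai': ['Suraj', 'Anita'], 'Delhi': ['Raj']}

# Caveat: merely reading a missing key also creates it
_ = by_city["Pune"]
print("Pune" in by_city)  # True
```

When a lookup should not create keys, by_city.get("Pune") behaves like a normal dict and returns None.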

In Part 7, we cover functions -- the most critical skill for writing reusable, maintainable Python.

Key takeaways

Always use `with open(...) as f:` — context managers close the file even when exceptions happen. Manual `f.close()` is a bug waiting to fire.
`pathlib.Path` is better than `os.path`. New code, use pathlib. The `os.path` style is legacy at this point.
Catch specific exceptions, not bare `except:`. Bare except swallows KeyboardInterrupt and hides bugs.
Errors should be raised, not silently logged-and-continued. Let things crash early and clearly; that's how you ship reliable software.

heapq for Priority Queues

Efficient min-heap operations and top-N selection

import heapq

# Find top N items efficiently
numbers = [34, 12, 67, 5, 89, 23, 45, 78, 2, 56]
top_3 = heapq.nlargest(3, numbers)   # [89, 78, 67]
bot_3 = heapq.nsmallest(3, numbers)  # [2, 5, 12]

# Priority queue for task scheduling
tasks = []
heapq.heappush(tasks, (3, "low priority task"))
heapq.heappush(tasks, (1, "CRITICAL task"))
heapq.heappush(tasks, (2, "medium task"))

while tasks:
    priority, task = heapq.heappop(tasks)
    print(f"Executing (priority {priority}): {task}")
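
Two details are worth knowing when using heapq this way: tuples with equal priorities fall back to comparing the next element, and heapq only provides a min-heap. The sketch below adds a sequence number as a tie-breaker and shows negation as one way to pop the largest value first. The add_task helper and the task names are hypothetical.

```python
import heapq
from itertools import count

tasks = []
order = count()  # increasing sequence number used as a tie-breaker

def add_task(priority, name):
    # Tuples compare left to right, so equal priorities fall back to insertion order
    heapq.heappush(tasks, (priority, next(order), name))

add_task(2, "rotate logs")
add_task(1, "restart service")
add_task(2, "archive backups")

executed = []
while tasks:
    priority, _, name = heapq.heappop(tasks)
    executed.append(name)
print(executed)  # ['restart service', 'rotate logs', 'archive backups']

# heapq is a min-heap; negating numeric keys is one way to pop the largest first
scores = [-s for s in [34, 89, 12]]
heapq.heapify(scores)
print(-heapq.heappop(scores))  # 89
```

The sequence number also keeps the heap from ever comparing the task payloads themselves, which matters once the payload is something that cannot be ordered, such as a dict.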

ChainMap for Configuration Merging

Overlay configs without mutating dicts

from collections import ChainMap

defaults = {"timeout": 30, "retries": 3, "debug": False, "region": "us-east-1"}
env_config = {"timeout": 60, "debug": True}
cli_args = {"region": "ap-south-1"}

# cli_args overrides env_config overrides defaults
config = ChainMap(cli_args, env_config, defaults)
print(config["timeout"])  # 60 (from env_config)
print(config["region"])   # ap-south-1 (from cli_args)
print(config["retries"])  # 3 (from defaults)
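
ChainMap stores references to the underlying dicts instead of copying them, which has a few consequences. The sketch below illustrates them with a smaller, illustrative set of layers.

```python
from collections import ChainMap

defaults = {"timeout": 30, "retries": 3}
env_config = {"timeout": 60}
config = ChainMap(env_config, defaults)

# ChainMap holds references, so later changes to a layer are visible
defaults["retries"] = 5
print(config["retries"])  # 5

# Writes go to the first mapping only
config["debug"] = True
print(env_config)  # {'timeout': 60, 'debug': True}

# new_child() adds a temporary top layer without touching the others
scoped = config.new_child({"timeout": 5})
print(scoped["timeout"])  # 5
print(config["timeout"])  # 60

# Flatten to a plain dict when a snapshot is needed
snapshot = dict(scoped)
print(snapshot)
```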

Efficient Data Processing with itertools

Built-in tools for iteration

from itertools import groupby, chain, islice

# Group data by a field
students = [
    {"name": "Suraj", "grade": "A"},
    {"name": "Raj", "grade": "B"},
    {"name": "Priya", "grade": "A"},
    {"name": "Anita", "grade": "B"},
]

# Sort first, then group
students.sort(key=lambda s: s["grade"])
for grade, group in groupby(students, key=lambda s: s["grade"]):
    names = [s["name"] for s in group]
    print(f"Grade {grade}: {', '.join(names)}")

# Flatten nested lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]

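# Process in batches of N
# (Python 3.12+ ships itertools.batched, which yields tuples; this helper yields lists)
def batched(iterable, n):
    it = iter(iterable)
    # islice takes up to n items; an empty list ends the loop
    while batch := list(islice(it, n)):
        yield batch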

for batch in batched(range(1, 25), 5):
    print(f"Processing: {batch}")
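
The "sort first, then group" step matters because groupby only merges adjacent items with equal keys. A minimal sketch of what happens without the sort, using a made-up list of grades:

```python
from itertools import groupby

grades = ["A", "B", "A", "B"]

# Without sorting, each run of equal keys becomes its own group
unsorted_groups = [(key, len(list(group))) for key, group in groupby(grades)]
print(unsorted_groups)  # [('A', 1), ('B', 1), ('A', 1), ('B', 1)]

# After sorting, equal keys are adjacent and group as expected
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(grades))]
print(sorted_groups)  # [('A', 2), ('B', 2)]
```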

Sorting Complex Data Structures

Advanced sorting patterns

from dataclasses import dataclass
import heapq

@dataclass
class Student:
    name: str
    score: float
    grade: str

students = [
    Student("Suraj", 95.5, "A"),
    Student("Raj", 87.0, "B"),
    Student("Priya", 95.5, "A"),
    Student("Anita", 72.0, "C"),
]

# Sort by score descending, then name ascending
sorted_s = sorted(students, key=lambda s: (-s.score, s.name))
for s in sorted_s:
    print(f"{s.name}: {s.score} ({s.grade})")

# Top 3 by score
top3 = heapq.nlargest(3, students, key=lambda s: s.score)
print([s.name for s in top3])

# Group by grade (sorting first keeps the dict keys in grade order)
by_grade = {}
for s in sorted(students, key=lambda x: x.grade):
    by_grade.setdefault(s.grade, []).append(s.name)
print(by_grade)
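
Negating the key, as in (-s.score, s.name), only works for numbers. When a string field needs descending order, one option is to rely on the stability of Python's sort and sort in two passes, secondary key first. A sketch under that assumption, repeating the Student data so it runs on its own:

```python
from dataclasses import dataclass
from operator import attrgetter

@dataclass
class Student:
    name: str
    score: float
    grade: str

students = [
    Student("Suraj", 95.5, "A"),
    Student("Raj", 87.0, "B"),
    Student("Priya", 95.5, "A"),
    Student("Anita", 72.0, "C"),
]

# Goal: grade descending (C first), then name ascending.
# Sort by the secondary key first, then by the primary key;
# the stable sort keeps the name order within each grade.
ranked = sorted(students, key=attrgetter("name"))
ranked = sorted(ranked, key=attrgetter("grade"), reverse=True)
print([(s.grade, s.name) for s in ranked])
# [('C', 'Anita'), ('B', 'Raj'), ('A', 'Priya'), ('A', 'Suraj')]
```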

Memory-Efficient Data Processing

Generators vs lists for large data

import sys

# List: builds and stores every value up front
squares_list = [x**2 for x in range(1_000_000)]
# Roughly 8 MB on 64-bit CPython (the list's pointer array only, not the int objects)
print(f"List size: {sys.getsizeof(squares_list):,} bytes")

# Generator: lazy, produces one value at a time
squares_gen = (x**2 for x in range(1_000_000))
# A few hundred bytes at most; the exact figure varies by Python version
print(f"Generator size: {sys.getsizeof(squares_gen):,} bytes")

# Same values, but the generator never holds them all in memory at once
total_gen = sum(squares_gen)      # Sum without storing all values
# Note: a generator can be consumed only once
# Use generators for: large files, database results, API pagination
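
As a rough sketch of the "large files" case, the block below parses records lazily with a generator function. The read_records helper and the name,score line format are assumptions made for illustration, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io

def read_records(lines):
    """Yield one parsed record at a time instead of building a list."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            name, score = line.split(",")
            yield name, int(score)

# io.StringIO stands in for a large file opened with open(...)
fake_file = io.StringIO("suraj,95\n\nraj,87\npriya,91\n")

# Chained generators: nothing is read until max() pulls values through
high_scores = (score for _, score in read_records(fake_file) if score >= 90)
print(max(high_scores))  # 95
```

Because file objects are themselves lazy iterators over lines, the same function works unchanged on a real file handle.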