Foundation Guide

NumPy Fundamentals

📄 45 Pages 🎯 Stage: Master the Basics 📦 Output: foundation.html

What this page does

Introduces NumPy and what you will learn in this guide.

Where this fits

This is the starting point. You should know basic Python (variables, lists, loops) before starting.

Explanation

NumPy (Numerical Python) is the foundation of scientific computing in Python. By the end of this guide, you will know how to:

  • Create arrays: Build 1D, 2D, and 3D arrays from scratch
  • Access elements: Index and slice arrays efficiently
  • Perform operations: Math on entire arrays at once
  • Reshape data: Transform array dimensions
  • Aggregate values: Sum, mean, min, max across arrays
  • Filter data: Boolean indexing and conditions

NumPy arrays are faster and more memory-efficient than Python lists for numerical work. Most data science and machine learning libraries build on NumPy.

Why this matters

NumPy is not optional. Pandas, Matplotlib, Scikit-learn, and TensorFlow all build on or interoperate with NumPy arrays. Learning NumPy is learning the language of scientific Python.

✓ Checkpoint

⚠ If something breaks here

Nothing to break yet. Move to Page 2.

What this page does

Confirms your environment is ready for NumPy.

Where this fits

Before installing NumPy, verify Python and pip work.

Code (this page)

python3 --version
pip3 --version

Explanation

Open Terminal and run these commands.

You should see:

Python 3.10.x (or higher)
pip 24.x.x (or similar)

If you don't have Python set up, complete the [Project Setup Guide](../project-setup/foundation.html) first.

Why this matters

NumPy needs a reasonably recent Python 3. NumPy 1.26, for example, requires Python 3.9 or higher. Verifying your setup prevents installation issues.

✓ Checkpoint

⚠ If something breaks here

  • command not found: Complete the Project Setup guide first
  • Old Python version: On macOS with Homebrew, run brew install python to update

What this page does

Installs the NumPy library.

Where this fits

Python is ready. Now add NumPy.

Code (this page)

pip3 install numpy

Explanation

Run pip3 install numpy in Terminal.

You will see:

Collecting numpy
Downloading numpy-1.26.x...
Successfully installed numpy-1.26.x

NumPy is now available to any script you run with this Python installation.

Why this matters

This is a one-time setup. Once installed, NumPy is ready to import in any project.

✓ Checkpoint

⚠ If something breaks here

  • Permission denied: Use pip3 install numpy --user
  • Already installed: That's fine, continue

What this page does

Confirms NumPy is working correctly.

Where this fits

NumPy is installed. Now verify it imports.

Code (this page)

python3 -c "import numpy; print(numpy.__version__)"

Explanation

This command starts Python, imports NumPy, and prints its version.

You should see:

1.26.4

(Your version may differ. A fresh install today is likely to report a 2.x release.)

Why this matters

If NumPy imports without errors, your installation is correct.

✓ Checkpoint

⚠ If something breaks here

  • ModuleNotFoundError: NumPy didn't install. Re-run Page 3
  • Version mismatch: Any 1.20+ version works for this guide

What this page does

Sets up a Python file for practicing NumPy.

Where this fits

NumPy is installed. Now create a workspace.

Code (this page)

mkdir ~/projects/numpy-practice
cd ~/projects/numpy-practice
touch numpy_basics.py

Explanation

These commands:

  • Create a folder called numpy-practice
  • Enter that folder
  • Create an empty Python file

From now on, you will write code in numpy_basics.py and run it with python3 numpy_basics.py.

Why this matters

Having a dedicated file lets you build up code incrementally and re-run to test changes.

✓ Checkpoint

⚠ If something breaks here

  • mkdir: File exists: The folder exists. Just cd into it
  • Permission denied: Check you're in a writable location

What this page does

Shows the standard way to import NumPy.

Where this fits

Your file is ready. Now add the import.

Code (this page)

import numpy as np

print("NumPy imported successfully")

Explanation

Open numpy_basics.py in VS Code and add this code.

import numpy as np is the standard convention. The official documentation and nearly all NumPy code use np as the alias, so you will see np. everywhere.

Run it:

python3 numpy_basics.py

Output:

NumPy imported successfully

Why this matters

This import line will start every NumPy script you ever write. The np alias saves typing and matches all documentation.

✓ Checkpoint

⚠ If something breaks here

  • ModuleNotFoundError: NumPy not installed. Return to Page 3
  • Typo: Check spelling of numpy (lowercase)

What this page does

Creates a NumPy array from a Python list.

Where this fits

NumPy is imported. Now create data.

Code (this page)

import numpy as np

numbers = np.array([1, 2, 3, 4, 5])
print(numbers)

Explanation

np.array() converts a Python list into a NumPy array.

Run the file:

[1 2 3 4 5]

Notice: no commas between elements. That's how NumPy displays arrays.

Why this matters

This is the most common way to create arrays. Any Python list becomes an array with np.array().

✓ Checkpoint

⚠ If something breaks here

  • Square brackets matter: np.array([1,2,3]) not np.array(1,2,3)
  • Forgot np.: You need the alias prefix

What this page does

Shows the difference between NumPy arrays and Python lists.

Where this fits

You created an array. Now understand why it's different.

Code (this page)

import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array([1, 2, 3, 4, 5])

print("List type:", type(py_list))
print("Array type:", type(np_array))

# Multiply by 2
print("List * 2:", py_list * 2)
print("Array * 2:", np_array * 2)

Explanation

Run the file:

List type: <class 'list'>
Array type: <class 'numpy.ndarray'>
List * 2: [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
Array * 2: [ 2  4  6  8 10]

  • List * 2 duplicates the list
  • Array * 2 multiplies each element

This is the power of NumPy: operations apply to every element automatically.

Why this matters

NumPy arrays support vectorized operations: math on entire arrays without loops. This is faster and cleaner than looping through lists.
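
To make the comparison concrete, here is a minimal sketch that doubles the same values twice: once with a Python loop over a list, and once with a single vectorized expression. The variable names are illustrative only.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Loop version: visit each element of a plain Python list
doubled_loop = []
for v in values:
    doubled_loop.append(v * 2)

# Vectorized version: one expression covers the whole array
doubled_array = np.array(values) * 2

print("Loop result:", doubled_loop)
print("Vectorized result:", doubled_array)
```

Both approaches produce the same numbers. The vectorized line is shorter, and on large arrays it is typically much faster.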

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Re-read if the difference isn't clear.

What this page does

Shows how to check an array's data type.

Where this fits

Arrays have a consistent data type. Learn to inspect it.

Code (this page)

import numpy as np

integers = np.array([1, 2, 3])
floats = np.array([1.5, 2.5, 3.5])
mixed = np.array([1, 2.5, 3])

print("Integers dtype:", integers.dtype)
print("Floats dtype:", floats.dtype)
print("Mixed dtype:", mixed.dtype)

Explanation

Run the file:

Integers dtype: int64
Floats dtype: float64
Mixed dtype: float64

Every element in a NumPy array has the same type. When you mix integers and floats, NumPy converts everything to floats (the more flexible type).

Why this matters

Knowing the dtype helps you understand memory usage and avoid unexpected type conversions.
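
To see how dtype relates to memory, the sketch below stores the same three values as int8 and as int64 and reads the itemsize and nbytes attributes. The array names are made up for illustration.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
large = np.array([1, 2, 3], dtype=np.int64)

# itemsize is bytes per element; nbytes is the total for the whole array
print("int8 :", small.itemsize, "byte(s) per element,", small.nbytes, "bytes total")
print("int64:", large.itemsize, "byte(s) per element,", large.nbytes, "bytes total")
```

The values are identical, but the int64 array takes eight times the space. Smaller types only work when every value fits in their range.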

✓ Checkpoint

⚠ If something breaks here

Nothing should break. This is informational.

What this page does

Shows how to explicitly set an array's data type.

Where this fits

NumPy infers types automatically. Sometimes you need control.

Code (this page)

import numpy as np

# Force float type
as_float = np.array([1, 2, 3], dtype=float)
print("As float:", as_float)

# Force integer type
as_int = np.array([1.9, 2.9, 3.9], dtype=int)
print("As int:", as_int)

Explanation

Run the file:

As float: [1. 2. 3.]
As int: [1 2 3]

The dtype parameter forces the type:

  • Integers become floats, printed with a trailing decimal point (1.)
  • Floats become integers by truncation: the fractional part is dropped, not rounded
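
If you want rounding rather than truncation, one option is to round first and convert afterwards. This sketch uses np.round and the astype method, neither of which appears elsewhere in this guide. The prices array is a made-up example.

```python
import numpy as np

prices = np.array([1.9, 2.9, 3.9])

truncated = prices.astype(int)            # drops the fractional part
rounded = np.round(prices).astype(int)    # rounds first, then converts

print("Truncated:", truncated)
print("Rounded:", rounded)
```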

Why this matters

Sometimes you need specific types for compatibility with other libraries or to reduce memory usage.

✓ Checkpoint

⚠ If something breaks here

  • Invalid dtype: Use int, float, bool, or NumPy types like np.int32

What this page does

Shows how to check an array's dimensions.

Where this fits

You can create arrays. Now understand their structure.

Code (this page)

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print("Shape:", arr.shape)
print("Size:", arr.size)
print("Dimensions:", arr.ndim)

Explanation

Run the file:

Shape: (5,)
Size: 5
Dimensions: 1

  • shape: Tuple showing size of each dimension. (5,) means 5 elements in 1 dimension
  • size: Total number of elements
  • ndim: Number of dimensions (1D, 2D, 3D, etc.)

Why this matters

Shape is critical when combining arrays or feeding data to machine learning models. Mismatched shapes cause errors.

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Reading these attributes never changes the array.

What this page does

Creates a two-dimensional array (matrix).

Where this fits

You made 1D arrays. Now add a dimension.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print(matrix)
print("Shape:", matrix.shape)

Explanation

Run the file:

[[1 2 3]
 [4 5 6]]
Shape: (2, 3)

A 2D array is built from a list of lists. This one has:

  • 2 rows
  • 3 columns
  • Shape (2, 3): always (rows, columns)

Why this matters

Much real data is 2D: spreadsheets, images (height × width), datasets (samples × features).

✓ Checkpoint

⚠ If something breaks here

  • Uneven rows: Each inner list must have the same length

What this page does

Creates a three-dimensional array.

Where this fits

You made 2D arrays. Now understand 3D.

Code (this page)

import numpy as np

cube = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]]
])

print(cube)
print("Shape:", cube.shape)

Explanation

Run the file:

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Shape: (2, 2, 2)

This is a 2×2×2 cube:

  • 2 "layers"
  • Each layer has 2 rows
  • Each row has 2 elements

Why this matters

3D arrays are common for color images (height × width × RGB channels) and time series data.

✓ Checkpoint

⚠ If something breaks here

  • Hard to visualize: Think of it as a stack of 2D arrays

What this page does

Creates arrays filled with zeros.

Where this fits

Creating arrays from lists is manual. NumPy has shortcuts.

Code (this page)

import numpy as np

zeros_1d = np.zeros(5)
zeros_2d = np.zeros((3, 4))

print("1D zeros:", zeros_1d)
print("2D zeros:")
print(zeros_2d)

Explanation

Run the file:

1D zeros: [0. 0. 0. 0. 0.]
2D zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

np.zeros() creates arrays filled with 0.0:

  • Pass an integer for 1D: np.zeros(5)
  • Pass a tuple for 2D+: np.zeros((3, 4))

Why this matters

Initialize arrays before filling them. Common pattern: create zeros, then assign values in a loop.
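
The zeros-then-fill pattern can be sketched as follows. The squares array and the loop body are illustrative assumptions, not part of the guide's practice file.

```python
import numpy as np

squares = np.zeros(5)          # placeholder array of 0.0 values

for i in range(5):
    squares[i] = i ** 2        # overwrite each slot in turn

print("Squares:", squares)
```

Because np.zeros() creates floats by default, the stored results are floats too.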

✓ Checkpoint

⚠ If something breaks here

  • Missing parentheses: 2D needs double parens np.zeros((3, 4)) not np.zeros(3, 4)

What this page does

Creates arrays filled with ones.

Where this fits

Like zeros, but with ones.

Code (this page)

import numpy as np

ones_1d = np.ones(4)
ones_2d = np.ones((2, 3))

print("1D ones:", ones_1d)
print("2D ones:")
print(ones_2d)

Explanation

Run the file:

1D ones: [1. 1. 1. 1.]
2D ones:
[[1. 1. 1.]
 [1. 1. 1.]]

Same syntax as zeros. Useful for:

  • Initializing weights
  • Creating masks
  • Placeholder data

Why this matters

Combined with multiplication, ones let you create any constant array: np.ones(5) * 7 gives [7. 7. 7. 7. 7.]
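
NumPy also provides np.full(), which builds a constant array directly. This guide does not cover it elsewhere, so treat the sketch below as an optional alternative to the multiplication trick.

```python
import numpy as np

scaled = np.ones(5) * 7        # float array of sevens via multiplication
filled = np.full(5, 7.0)       # same values, built in one step

print(scaled)
print(filled)
print("Equal:", np.array_equal(scaled, filled))
```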

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Same pattern as zeros.

What this page does

Creates arrays with sequential numbers.

Where this fits

You made constant arrays. Now make sequences.

Code (this page)

import numpy as np

simple = np.arange(10)
with_start = np.arange(5, 10)
with_step = np.arange(0, 20, 2)

print("0 to 9:", simple)
print("5 to 9:", with_start)
print("Even 0-18:", with_step)

Explanation

Run the file:

0 to 9: [0 1 2 3 4 5 6 7 8 9]
5 to 9: [5 6 7 8 9]
Even 0-18: [ 0  2  4  6  8 10 12 14 16 18]

np.arange(start, stop, step):

  • arange(10): 0 to 9 (stop is exclusive)
  • arange(5, 10): 5 to 9
  • arange(0, 20, 2): 0 to 18 by 2s

Like Python's range() but returns an array.

Why this matters

Generate index arrays, test data, or any arithmetic sequence quickly.

✓ Checkpoint

⚠ If something breaks here

  • Unexpected result: Remember stop value is NOT included

What this page does

Creates arrays with evenly spaced numbers.

Where this fits

arange uses step size. linspace uses number of points.

Code (this page)

import numpy as np

five_points = np.linspace(0, 10, 5)
eleven_points = np.linspace(0, 1, 11)

print("5 points from 0-10:", five_points)
print("11 points from 0-1:", eleven_points)

Explanation

Run the file:

5 points from 0-10: [ 0.   2.5  5.   7.5 10. ]
11 points from 0-1: [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]

np.linspace(start, stop, num):

  • Includes both start AND stop (unlike arange)
  • Returns num evenly spaced points across the range

Why this matters

Essential for plotting. "Give me 100 points from 0 to 2π" is np.linspace(0, 2*np.pi, 100).
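
As a rough sketch of how those points get used, the example below evaluates np.sin() over the 100 points. The names x and y are illustrative, and the plotting step itself is left out.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)   # 100 evenly spaced points, endpoints included
y = np.sin(x)                        # sine evaluated at every point

print("Points:", x.size)
print("First x:", x[0], "Last x:", x[-1])
print("Largest y:", y.max())
```

The x and y arrays have the same length, which is what a plotting library expects.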

✓ Checkpoint

⚠ If something breaks here

  • Unexpected spacing: linspace includes both endpoints by default

What this page does

Creates arrays with random numbers.

Where this fits

Sequences are predictable. Sometimes you need randomness.

Code (this page)

import numpy as np

uniform = np.random.rand(5)
integers = np.random.randint(1, 100, 5)
normal = np.random.randn(5)

print("Uniform 0-1:", uniform)
print("Integers 1-99:", integers)
print("Normal dist:", normal)

Explanation

Run the file (your numbers will differ):

Uniform 0-1: [0.374 0.951 0.732 0.598 0.156]
Integers 1-99: [42 87 23 64 11]
Normal dist: [-0.234  1.523 -0.891  0.432  0.067]

  • rand(n): Uniform distribution between 0 and 1
  • randint(low, high, size): Random integers (high exclusive)
  • randn(n): Normal distribution (mean=0, std=1)

Why this matters

Testing, simulations, machine learning initialization: random data is everywhere.

✓ Checkpoint

⚠ If something breaks here

  • Different numbers each run: That's expected (it's random)
  • Reproducible results: Use np.random.seed(42) before generating
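
A minimal sketch of seeding, assuming the seed value 42 from the tip above: setting the same seed before each draw reproduces the same numbers.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)             # resetting the seed restarts the sequence
second = np.random.rand(3)

print(first)
print(second)
print("Identical:", np.array_equal(first, second))
```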

What this page does

Verifies you can create arrays multiple ways.

Where this fits

You have completed the array creation section.

Explanation

You should now be able to:

| Method | Creates |
| --- | --- |
| np.array([...]) | Array from list |
| np.zeros((r,c)) | Array of zeros |
| np.ones((r,c)) | Array of ones |
| np.arange(start, stop, step) | Sequential integers |
| np.linspace(start, stop, num) | Evenly spaced floats |
| np.random.rand(n) | Random floats 0-1 |
| np.random.randint(lo, hi, n) | Random integers |

Why this matters

Array creation is step one of every NumPy workflow. These seven methods cover most everyday use cases.

✓ Checkpoint

⚠ If something breaks here

Review pages 7-18 for any method you're unsure about.

What this page does

Shows how to access individual elements.

Where this fits

You can create arrays. Now access their data.

Code (this page)

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print("First element:", arr[0])
print("Third element:", arr[2])
print("Last element:", arr[-1])
print("Second to last:", arr[-2])

Explanation

Run the file:

First element: 10
Third element: 30
Last element: 50
Second to last: 40

Indexing works like Python lists:

  • [0] is first element
  • [-1] is last element
  • Indices start at 0

Why this matters

Accessing specific elements is fundamental. Nearly every NumPy program uses this syntax.

✓ Checkpoint

⚠ If something breaks here

  • IndexError: Index is out of bounds. Check array size with len(arr)
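
Here is one way to guard against an out-of-bounds index, shown as a sketch. The index value 7 is chosen deliberately to trigger the error, and the caught flag exists only to make the outcome visible.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
index = 7                      # deliberately past the end

# Option 1: check the length before indexing
if index < len(arr):
    print("Value:", arr[index])
else:
    print("Index", index, "is out of bounds for length", len(arr))

# Option 2: attempt the access and handle the error
caught = False
try:
    arr[index]
except IndexError as err:
    caught = True
    print("IndexError:", err)
```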

What this page does

Shows how to access ranges of elements.

Where this fits

You accessed single elements. Now access multiple.

Code (this page)

import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

print("First three:", arr[0:3])
print("Index 2 to 5:", arr[2:6])
print("From index 4:", arr[4:])
print("Up to index 4:", arr[:4])

Explanation

Run the file:

First three: [10 20 30]
Index 2 to 5: [30 40 50 60]
From index 4: [50 60 70]
Up to index 4: [10 20 30 40]

Slice syntax: arr[start:stop]

  • Start is included
  • Stop is excluded
  • Omit start = from beginning
  • Omit stop = to end

Why this matters

Slicing extracts subsets without loops. Essential for data manipulation.

✓ Checkpoint

⚠ If something breaks here

  • Wrong elements: Remember stop is exclusive

What this page does

Shows how to skip elements when slicing.

Where this fits

Basic slices are contiguous. Add step for patterns.

Code (this page)

import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

print("Every other:", arr[::2])
print("Every third:", arr[::3])
print("Reversed:", arr[::-1])
print("Odd indices:", arr[1::2])

Explanation

Run the file:

Every other: [0 2 4 6 8]
Every third: [0 3 6 9]
Reversed: [9 8 7 6 5 4 3 2 1 0]
Odd indices: [1 3 5 7 9]

Full syntax: arr[start:stop:step]

  • ::2: every 2nd element
  • ::-1: reverse the array
  • 1::2: start at 1, take every 2nd

Why this matters

Extract patterns without loops. Reverse arrays instantly. Common in signal processing and image manipulation.

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Experiment with different steps.

What this page does

Shows how to access elements in a matrix.

Where this fits

You indexed 1D arrays. Now handle 2D.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print("Element at row 0, col 0:", matrix[0, 0])
print("Element at row 1, col 2:", matrix[1, 2])
print("Element at row 2, col 1:", matrix[2, 1])

Explanation

Run the file:

Element at row 0, col 0: 1
Element at row 1, col 2: 6
Element at row 2, col 1: 8

2D indexing: matrix[row, column]

  • Row 0 is the first row [1, 2, 3]
  • Column 2 is the third column
  • matrix[1, 2] = row 1, column 2 = 6

Why this matters

Accessing specific cells is how you read and modify matrix data.

✓ Checkpoint

⚠ If something breaks here

  • IndexError: Row or column index too large

What this page does

Shows how to extract entire rows from a matrix.

Where this fits

You accessed single cells. Now extract rows.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print("Row 0:", matrix[0])
print("Row 1:", matrix[1])
print("Rows 0-1:", matrix[0:2])

Explanation

Run the file:

Row 0: [1 2 3]
Row 1: [4 5 6]
Rows 0-1: [[1 2 3]
 [4 5 6]]

  • matrix[0]: entire first row
  • matrix[0:2]: rows 0 and 1 (2D result)

Think of rows as the first dimension.

Why this matters

Extracting rows is how you select samples from a dataset.

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Experiment with different row indices.

What this page does

Shows how to extract entire columns from a matrix.

Where this fits

You extracted rows. Now extract columns.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print("Column 0:", matrix[:, 0])
print("Column 2:", matrix[:, 2])
print("Columns 0-1:", matrix[:, 0:2])

Explanation

Run the file:

Column 0: [1 4 7]
Column 2: [3 6 9]
Columns 0-1: [[1 2]
 [4 5]
 [7 8]]

The syntax matrix[:, n]:

  • : means "all rows"
  • n is the column index

[:, 0] reads as "all rows, column 0."

Why this matters

Extracting columns is how you select features from a dataset.

✓ Checkpoint

⚠ If something breaks here

  • Forgot the colon: matrix[0] is a row. matrix[:, 0] is a column

What this page does

Verifies you can access array elements and slices.

Where this fits

You have completed the indexing and slicing section.

Explanation

You should now be able to:

| Operation | Syntax |
| --- | --- |
| Single element (1D) | arr[i] |
| Slice (1D) | arr[start:stop:step] |
| Single element (2D) | matrix[row, col] |
| Entire row | matrix[row] |
| Entire column | matrix[:, col] |
| Submatrix | matrix[r1:r2, c1:c2] |
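
The submatrix form combines a row slice and a column slice. It was not demonstrated on the earlier pages, so here is a small sketch using the same 3×3 matrix. The slice bounds are arbitrary.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# Rows 0-1 and columns 1-2 (stop values are exclusive)
sub = matrix[0:2, 1:3]
print(sub)
print("Shape:", sub.shape)
```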

Why this matters

Indexing is how you read data. You'll use these patterns constantly.

✓ Checkpoint

⚠ If something breaks here

Review pages 20-25 for any concept you're unsure about.

What this page does

Shows how to add arrays together.

Where this fits

You can access data. Now perform operations.

Code (this page)

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print("a + b:", a + b)
print("a + 5:", a + 5)

Explanation

Run the file:

a + b: [11 22 33 44]
a + 5: [6 7 8 9]

  • Array + Array: adds corresponding elements
  • Array + Scalar: adds the scalar to every element

No loops needed. NumPy handles it automatically.

Why this matters

Vectorized operations are the core of NumPy. They're faster than loops and easier to read.

✓ Checkpoint

⚠ If something breaks here

  • Shape mismatch: Arrays must be the same shape (for now)
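
A sketch of what a shape mismatch looks like in practice. The array c is a hypothetical three-element array added for illustration, and the mismatch flag only records that the error occurred.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([10, 20, 30])     # one element short

mismatch = False
try:
    print(a + c)
except ValueError as err:
    mismatch = True
    print("ValueError:", err)
```

Comparing a.shape and c.shape before adding is a quick way to find the cause.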

What this page does

Shows subtraction, multiplication, and division.

Where this fits

You did addition. Now do the rest.

Code (this page)

import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([2, 4, 5, 8])

print("a - b:", a - b)
print("a * b:", a * b)
print("a / b:", a / b)
print("a ** 2:", a ** 2)

Explanation

Run the file:

a - b: [ 8 16 25 32]
a * b: [ 20  80 150 320]
a / b: [5. 5. 6. 5.]
a ** 2: [ 100  400  900 1600]

All basic math operators work element-wise:

  • - subtraction
  • * multiplication
  • / division
  • ** power

Why this matters

This is how you process data at scale. One line replaces a loop over thousands of elements.

✓ Checkpoint

⚠ If something breaks here

  • Division by zero: NumPy returns inf or nan and prints a RuntimeWarning instead of raising an error
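
The sketch below shows the three possible results of dividing by zero. It uses np.errstate to silence the warning for this block and np.isfinite to flag the problem values. Both are real NumPy functions that this guide does not otherwise cover.

```python
import numpy as np

a = np.array([1.0, 0.0, -1.0])
b = np.array([0.0, 0.0, 0.0])

# errstate suppresses the RuntimeWarning inside this block only
with np.errstate(divide="ignore", invalid="ignore"):
    result = a / b

print(result)
print("Finite values:", np.isfinite(result))
```

A positive number divided by zero gives inf, zero divided by zero gives nan, and a negative number divided by zero gives -inf.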

What this page does

Shows built-in math functions.

Where this fits

Basic operators are limited. NumPy has advanced math.

Code (this page)

import numpy as np

arr = np.array([1, 4, 9, 16, 25])

print("Square root:", np.sqrt(arr))
print("Exponential:", np.exp(np.array([1, 2, 3])))
print("Logarithm:", np.log(np.array([1, 10, 100])))

Explanation

Run the file (values shown rounded to three decimals):

Square root: [1. 2. 3. 4. 5.]
Exponential: [ 2.718  7.389 20.086]
Logarithm: [0.    2.303 4.605]

Common functions:

  • np.sqrt(): square root
  • np.exp(): e^x
  • np.log(): natural log
  • np.sin(), np.cos(): trigonometry
  • np.abs(): absolute value

Why this matters

Scientific computing needs these functions. NumPy applies them to entire arrays at once.

✓ Checkpoint

⚠ If something breaks here

  • sqrt of negative: Returns nan (not a number) with a RuntimeWarning
  • log of zero or negative: Returns -inf (for zero) or nan (for negatives) with a RuntimeWarning

What this page does

Shows how to compare arrays.

Where this fits

Math produces numbers. Comparisons produce booleans.

Code (this page)

import numpy as np

arr = np.array([1, 5, 10, 15, 20])

print("Greater than 8:", arr > 8)
print("Equal to 10:", arr == 10)
print("Less than or equal 10:", arr <= 10)

Explanation

Run the file:

Greater than 8: [False False  True  True  True]
Equal to 10: [False False  True False False]
Less than or equal 10: [ True  True  True False False]

Comparisons return boolean arrays:

  • >, <, >=, <=, ==, !=
  • Each element is compared independently

Why this matters

Boolean arrays are used for filtering. "Give me all values greater than 10" starts with a comparison.

✓ Checkpoint

⚠ If something breaks here

Nothing should break. These always work.

What this page does

Shows how to filter arrays using conditions.

Where this fits

You created boolean arrays. Now use them to filter.

Code (this page)

import numpy as np

arr = np.array([5, 12, 8, 21, 3, 15, 7])

# Get elements greater than 10
mask = arr > 10
print("Mask:", mask)
print("Filtered:", arr[mask])

# Or in one line
print("Direct:", arr[arr > 10])

Explanation

Run the file:

Mask: [False  True False  True False  True False]
Filtered: [12 21 15]
Direct: [12 21 15]

Boolean indexing: arr[boolean_array]

  • Returns only elements where the boolean is True
  • The mask must be the same length as the array

Why this matters

Filtering is essential for data analysis. "Show me all sales above $1000" is boolean indexing.

✓ Checkpoint

⚠ If something breaks here

  • Wrong length: Boolean array must match original array length

What this page does

Shows how to combine boolean arrays with AND, OR, and NOT.

Where this fits

Single conditions are limiting. Combine them for power.

Code (this page)

import numpy as np

arr = np.array([5, 12, 8, 21, 3, 15, 7])

# AND: both conditions must be true
print("Between 5 and 15:", arr[(arr >= 5) & (arr <= 15)])

# OR: either condition can be true
print("Less than 5 OR greater than 15:", arr[(arr < 5) | (arr > 15)])

# NOT: invert the condition
print("NOT greater than 10:", arr[~(arr > 10)])

Explanation

Run the file:

Between 5 and 15: [ 5 12  8 15  7]
Less than 5 OR greater than 15: [21  3]
NOT greater than 10: [5 8 3 7]

Logical operators for arrays:

  • &: AND (both true)
  • |: OR (either true)
  • ~: NOT (invert)

Important: Use parentheses around each condition!

Why this matters

Real filters are complex. "Sales between $100-$500 in January" needs AND.

✓ Checkpoint

⚠ If something breaks here

  • Missing parentheses: arr > 5 & arr < 10 fails. Use (arr > 5) & (arr < 10)

What this page does

Shows how to change array values using conditions.

Where this fits

You filtered arrays. Now modify them conditionally.

Code (this page)

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Original:", arr)

# Set all values greater than 5 to 0
arr[arr > 5] = 0
print("After setting >5 to 0:", arr)

# Reset and try another operation
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
arr[arr % 2 == 0] = -1  # Replace evens with -1
print("Evens replaced:", arr)

Explanation

Run the file:

Original: [ 1  2  3  4  5  6  7  8  9 10]
After setting >5 to 0: [1 2 3 4 5 0 0 0 0 0]
Evens replaced: [ 1 -1  3 -1  5 -1  7 -1  9 -1]

arr[condition] = value modifies elements in place:

  • Only elements where condition is True are changed
  • Original array is modified (not a copy)

Why this matters

Data cleaning often requires conditional modification. "Replace all negative values with 0" is one line.
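
The "replace all negative values with 0" case can be sketched like this. The readings array holds made-up values.

```python
import numpy as np

readings = np.array([3, -1, 4, -5, 9, -2])

readings[readings < 0] = 0     # overwrite every negative value in place
print("Cleaned:", readings)
```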

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Be careful: changes are permanent.

What this page does

Shows how to aggregate array values.

Where this fits

You filtered data. Now summarize it.

Code (this page)

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Also sum:", arr.sum())
print("Also mean:", arr.mean())

Explanation

Run the file:

Sum: 150
Mean: 30.0
Also sum: 150
Also mean: 30.0

Two ways to call aggregation:

  • np.sum(arr): function style
  • arr.sum(): method style

Both work identically. Choose based on readability.

Why this matters

Aggregation answers questions: "What's the total sales?" "What's the average score?"

✓ Checkpoint

⚠ If something breaks here

Nothing should break. These always work on numeric arrays.

What this page does

Shows additional aggregation functions.

Where this fits

Sum and mean are common. There's more.

Code (this page)

import numpy as np

arr = np.array([45, 12, 78, 34, 56, 89, 23])

print("Minimum:", arr.min())
print("Maximum:", arr.max())
print("Index of min:", arr.argmin())
print("Index of max:", arr.argmax())
print("Standard deviation:", arr.std())

Explanation

Run the file:

Minimum: 12
Maximum: 89
Index of min: 1
Index of max: 5
Standard deviation: 26.10...

Key aggregations:

  • min(), max(): smallest/largest value
  • argmin(), argmax(): INDEX of smallest/largest
  • std(): standard deviation
  • var(): variance
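
var() is listed above but not shown in the code. As a quick sketch on the same array, the variance equals the standard deviation squared. np.isclose is used for the comparison because floating-point results can differ in the last digits.

```python
import numpy as np

arr = np.array([45, 12, 78, 34, 56, 89, 23])

variance = arr.var()
std = arr.std()

print("Variance:", variance)
print("Std squared:", std ** 2)
print("Match:", np.isclose(variance, std ** 2))
```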

Why this matters

Finding extremes and spread tells you about your data distribution.

✓ Checkpoint

⚠ If something breaks here

Nothing should break. These always work on numeric arrays.

What this page does

Shows how to aggregate rows or columns separately.

Where this fits

1D aggregation sums everything. 2D needs more control.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print("Matrix:\n", matrix)
print("Sum all:", matrix.sum())
print("Sum each column:", matrix.sum(axis=0))
print("Sum each row:", matrix.sum(axis=1))

Explanation

Run the file:

Matrix:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Sum all: 45
Sum each column: [12 15 18]
Sum each row: [ 6 15 24]

The axis parameter:

  • No axis: aggregate everything
  • axis=0: aggregate down columns (collapse rows)
  • axis=1: aggregate across rows (collapse columns)

Why this matters

Datasets have rows (samples) and columns (features). You often need totals or means per column.
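
The same axis parameter works for mean(). The sketch below uses a made-up dataset with three samples and two features.

```python
import numpy as np

# Rows are samples, columns are features (illustrative values)
data = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0]
])

col_means = data.mean(axis=0)   # one mean per feature
row_means = data.mean(axis=1)   # one mean per sample

print("Mean per column:", col_means)
print("Mean per row:", row_means)
```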

✓ Checkpoint

⚠ If something breaks here

  • Confusing axis: axis=0 runs down the rows (one result per column); axis=1 runs across the columns (one result per row). The axis you name is the one that collapses.

What this page does

Shows how to change an array's dimensions.

Where this fits

You created and aggregated arrays. Now transform their shape.

Code (this page)

import numpy as np

arr = np.arange(12)
print("Original:", arr)
print("Shape:", arr.shape)

reshaped = arr.reshape(3, 4)
print("\nReshaped to 3x4:")
print(reshaped)

reshaped2 = arr.reshape(4, 3)
print("\nReshaped to 4x3:")
print(reshaped2)

Explanation

Run the file:

Original: [ 0  1  2  3  4  5  6  7  8  9 10 11]
Shape: (12,)

Reshaped to 3x4:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Reshaped to 4x3:
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

reshape(rows, cols) transforms dimensions:

  • Total elements must stay the same (12 = 3×4 = 4×3)
  • Data fills row by row

Why this matters

Machine learning often requires specific input shapes. Reshape transforms your data to fit.

✓ Checkpoint

⚠ If something breaks here

  • cannot reshape: The product of new dimensions must equal original size
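
A sketch of the failure and a fix. The target shape (5, 3) is picked deliberately so the element counts disagree, and the failed flag only records that the error occurred.

```python
import numpy as np

arr = np.arange(12)

failed = False
try:
    arr.reshape(5, 3)          # 5 * 3 = 15, but arr has 12 elements
except ValueError as err:
    failed = True
    print("ValueError:", err)

# Any pair of dimensions whose product is 12 works
print("Valid alternative:", arr.reshape(2, 6).shape)
```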

What this page does

Shows automatic dimension calculation.

Where this fits

You manually specified both dimensions. Let NumPy infer one.

Code (this page)

import numpy as np

arr = np.arange(24)

# Let NumPy figure out the number of columns
reshaped = arr.reshape(6, -1)
print("6 rows, auto columns:")
print(reshaped)
print("Shape:", reshaped.shape)

# Let NumPy figure out the number of rows
reshaped2 = arr.reshape(-1, 8)
print("\nAuto rows, 8 columns:")
print(reshaped2)
print("Shape:", reshaped2.shape)

Explanation

Run the file:

6 rows, auto columns:
[[ 0  1  2  3]
 [ 4  5  6  7]
 ...
Shape: (6, 4)

Auto rows, 8 columns:
[[ 0  1  2  3  4  5  6  7]
 ...
Shape: (3, 8)

-1 means "calculate this dimension automatically":

  • reshape(6, -1) with 24 elements → (6, 4) because 24÷6=4
  • reshape(-1, 8) with 24 elements → (3, 8) because 24÷8=3

Why this matters

When you know one dimension but not the other, use -1 and let NumPy do the math.

✓ Checkpoint

⚠ If something breaks here

  • Multiple -1: Only one dimension can be inferred

What this page does

Shows how to convert any array to 1D.

Where this fits

Reshape adds dimensions. Flatten removes them.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print("Original shape:", matrix.shape)
print("Flattened:", matrix.flatten())
print("Raveled:", matrix.ravel())
print("Reshaped to 1D:", matrix.reshape(-1))

Explanation

Run the file:

Original shape: (2, 3)
Flattened: [1 2 3 4 5 6]
Raveled: [1 2 3 4 5 6]
Reshaped to 1D: [1 2 3 4 5 6]

Three ways to flatten:

  • flatten(): always returns a copy
  • ravel(): returns a view when possible (faster, but changes to a view affect the original)
  • reshape(-1): behaves like ravel
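
The copy-versus-view difference is easiest to see by editing the results. This sketch uses np.shares_memory, a real NumPy function not covered elsewhere in this guide, to confirm which result is tied to the original.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = matrix.flatten()
flat_view = matrix.ravel()

flat_copy[0] = 100             # changes only the copy
print("After editing the copy:", matrix[0, 0])

flat_view[0] = 999             # writes through to matrix
print("After editing the view:", matrix[0, 0])
print("Shares memory:", np.shares_memory(matrix, flat_view))
```

Use flatten() when you need an independent array, and ravel() when you only need to read the values.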

Why this matters

Sometimes you need all elements in a single row. Flattening is common for passing data to functions.

✓ Checkpoint

⚠ If something breaks here

Nothing should break. All three methods work.

What this page does

Shows how to swap rows and columns.

Where this fits

Reshape changes dimensions. Transpose swaps them.

Code (this page)

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print("Original (2x3):")
print(matrix)

print("\nTransposed (3x2):")
print(matrix.T)
print("Also transposed:")
print(np.transpose(matrix))

Explanation

Run the file:

Original (2x3):
[[1 2 3]
 [4 5 6]]

Transposed (3x2):
[[1 4]
 [2 5]
 [3 6]]
Also transposed:
[[1 4]
 [2 5]
 [3 6]]

Transpose flips the matrix:

  • Rows become columns
  • Columns become rows
  • (2, 3) becomes (3, 2)

Why this matters

Linear algebra operations often require transposition. Matrix multiplication, least squares, and more use .T.
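
As a small sketch of why transposition matters, the example below multiplies a matrix by its own transpose. It uses the @ operator for matrix multiplication, which this guide does not cover elsewhere.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

# (2, 3) @ (3, 2) gives (2, 2); the inner dimensions must match
product = matrix @ matrix.T
print(product)
print("Shape:", product.shape)
```

Without .T, the shapes would be (2, 3) and (2, 3), and the multiplication would fail.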

✓ Checkpoint

⚠ If something breaks here

Nothing should break. Transpose works on arrays of any dimension, but on a 1D array .T returns the array unchanged.

What this page does

Shows how to combine arrays.

Where this fits

You reshaped arrays. Now join them.

Code (this page)

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Concatenate 1D:", np.concatenate([a, b]))

# 2D arrays
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

print("\nStack vertically (more rows):")
print(np.concatenate([x, y], axis=0))

print("\nStack horizontally (more columns):")
print(np.concatenate([x, y], axis=1))

Explanation

Run the file:

Concatenate 1D: [1 2 3 4 5 6]

Stack vertically (more rows):
[[1 2]
 [3 4]
 [5 6]
 [7 8]]

Stack horizontally (more columns):
[[1 2 5 6]
 [3 4 7 8]]

np.concatenate([arrays], axis):

  • axis=0: stack vertically (add rows)
  • axis=1: stack horizontally (add columns)

Why this matters

Building datasets often means combining arrays from different sources.

✓ Checkpoint

⚠ If something breaks here

  • Shape mismatch: Arrays must have the same size on every axis except the join axis
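A short sketch of the shape rule for np.concatenate: sizes must agree on every axis other than the one being joined. The array z below is a hypothetical extra row added for illustration:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])  # shape (2, 2)
z = np.array([[9, 9]])          # shape (1, 2), hypothetical extra row

# Joining on axis 0: column counts match (2 and 2), so this works
taller = np.concatenate([x, z], axis=0)
print(taller.shape)  # (3, 2)

# Joining on axis 1: row counts differ (2 vs 1), so NumPy raises ValueError
try:
    np.concatenate([x, z], axis=1)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```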

What this page does

Shows alternative ways to combine arrays.

Where this fits

Concatenate joins. Stack adds dimensions.

Code (this page)

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("vstack (vertical):")
print(np.vstack([a, b]))

print("\nhstack (horizontal):")
print(np.hstack([a, b]))

print("\nstack (new dimension):")
print(np.stack([a, b]))
print("Shape:", np.stack([a, b]).shape)

Explanation

Run the file:

vstack (vertical):
[[1 2 3]
 [4 5 6]]

hstack (horizontal):
[1 2 3 4 5 6]

stack (new dimension):
[[1 2 3]
 [4 5 6]]
Shape: (2, 3)

Convenience functions:

  • vstack: vertical stack (add rows)
  • hstack: horizontal stack (add columns or extend 1D)
  • stack: create a NEW dimension
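To see what "a NEW dimension" means in practice, the sketch below passes the axis argument of np.stack, which the page's code leaves at its default of 0. The names rows and cols are illustrative:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.stack([a, b], axis=0)  # default: each input becomes a row
cols = np.stack([a, b], axis=1)  # each input becomes a column

print(rows.shape)  # (2, 3)
print(cols.shape)  # (3, 2)
print(cols)
# [[1 4]
#  [2 5]
#  [3 6]]

# np.concatenate keeps the number of dimensions, np.stack adds one
print(np.concatenate([a, b]).ndim)  # 1
print(np.stack([a, b]).ndim)        # 2
```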

Why this matters

vstack and hstack are more readable than concatenate with axis.

✓ Checkpoint

⚠ If something breaks here

  • Shape mismatch: stack needs identical shapes; vstack and hstack need matching sizes on every axis except the one being joined

What this page does

Explains when changes affect the original array.

Where this fits

You've modified arrays. Understand when copies are made.

Code (this page)

import numpy as np

original = np.array([1, 2, 3, 4, 5])

# Slicing creates a VIEW
view = original[1:4]
view[0] = 99
print("After modifying view:")
print("Original:", original)
print("View:", view)

# copy() creates a COPY
original = np.array([1, 2, 3, 4, 5])
copy = original[1:4].copy()
copy[0] = 99
print("\nAfter modifying copy:")
print("Original:", original)
print("Copy:", copy)

Explanation

Run the file:

After modifying view:
Original: [ 1 99  3  4  5]
View: [99  3  4]

After modifying copy:
Original: [1 2 3 4 5]
Copy: [99  3  4]

Critical concept:

  • View: Points to original data. Changes affect both.
  • Copy: Independent data. Changes are isolated.

Slices are views by default. Use .copy() when you need independence.
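When it is unclear whether a variable is a view or a copy, you can ask NumPy. This sketch assumes np.shares_memory (not introduced in this guide) and also shows that boolean indexing, unlike basic slicing, returns a copy:

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

sliced = original[1:4]           # basic slice: a view
copied = original[1:4].copy()    # explicit copy
masked = original[original > 2]  # boolean indexing: a copy

slice_shares = np.shares_memory(original, sliced)
copy_shares = np.shares_memory(original, copied)
mask_shares = np.shares_memory(original, masked)
print(slice_shares, copy_shares, mask_shares)  # True False False
```

A quick check like this before modifying an array in place can save a long debugging session.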

Why this matters

This is a common source of bugs. Knowing when you have a view prevents accidental data corruption.

✓ Checkpoint

⚠ If something breaks here

Nothing breaks, but unexpected changes happen if you forget about views.

What this page does

Shows conditional element selection.

Where this fits

Boolean indexing filters. Where transforms.

Code (this page)

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Replace based on condition
result = np.where(arr > 5, arr, 0)
print("Keep if >5, else 0:", result)

# Different values for true/false
result2 = np.where(arr % 2 == 0, "even", "odd")
print("Even or odd:", result2)

Explanation

Run the file:

Keep if >5, else 0: [ 0  0  0  0  0  6  7  8  9 10]
Even or odd: ['odd' 'even' 'odd' 'even' 'odd' 'even' 'odd' 'even' 'odd' 'even']

np.where(condition, if_true, if_false):

  • Evaluates condition for each element
  • Returns if_true where True, if_false where False
  • Like a vectorized if-else
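The contrast with boolean indexing can be sketched as follows, reusing the same arr. The single-argument form of np.where at the end is an extra that this page does not cover; it returns the positions where the condition holds rather than the values:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

filtered = arr[arr > 5]               # drops non-matching elements
replaced = np.where(arr > 5, arr, 0)  # keeps all 10 positions

print(filtered.shape)  # (5,)
print(replaced.shape)  # (10,)

# With only a condition, np.where returns a tuple of index arrays
(indices,) = np.where(arr > 5)
print(indices)  # [5 6 7 8 9]
```

Use filtering when you want fewer elements, and np.where with three arguments when the result must line up position by position with the input.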

Why this matters

np.where is cleaner than boolean indexing when you need both branches (the true and the false case) and want the result to keep the original shape.

✓ Checkpoint

⚠ If something breaks here

  • Shape mismatch: if_true and if_false must broadcast to the condition shape

What this page does

Confirms you have mastered all skills in this guide.

Where this fits

This is the end. Verify everything works together.

Explanation

Complete this final test in your numpy_basics.py:

import numpy as np

# Create data
data = np.random.randint(0, 100, 20)
print("Data:", data)

# Reshape to 4x5
matrix = data.reshape(4, 5)
print("\nAs 4x5 matrix:")
print(matrix)

# Column means
print("\nColumn means:", matrix.mean(axis=0))

# Filter values above average
avg = data.mean()
print(f"\nAverage: {avg:.2f}")
print("Above average:", data[data > avg])

# Replace below average with 0
cleaned = np.where(data >= avg, data, 0)
print("Cleaned:", cleaned)

print("\n✓ NumPy Fundamentals Complete!")

Run it. If you understand every line, you've completed the guide.
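Because the data is random, your printed numbers will differ on every run. One way to confirm the logic anyway is to assert properties that hold for any draw. The following is a sketch under that assumption, with variable names such as col_means and above chosen for illustration:

```python
import numpy as np

data = np.random.randint(0, 100, 20)  # 20 integers from 0 to 99
matrix = data.reshape(4, 5)
avg = data.mean()

col_means = matrix.mean(axis=0)
above = data[data > avg]
cleaned = np.where(data >= avg, data, 0)

# These checks hold no matter which random values were drawn
assert col_means.shape == (5,)            # one mean per column
assert np.isclose(col_means.mean(), avg)  # equal-sized columns average to the overall mean
assert np.all(above > avg)                # the filter kept only above-average values
assert cleaned.shape == data.shape        # np.where keeps the shape
assert np.all((cleaned == 0) | (cleaned >= avg))
print("All checks passed")
```

If any assertion fails after you change the code, the failing line points to the skill worth reviewing.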

Why this matters

You now have the foundation for data science in Python. NumPy skills unlock Pandas, Matplotlib, Scikit-learn, and beyond.

✓ Checkpoint

⚠ If something breaks here

  • Review the specific section where the error occurred
  • Each skill builds on the previous one

Contents