What this page does
Introduces NumPy and what you will learn in this guide.
Where this fits
This is the starting point. You should know basic Python (variables, lists, loops) before starting.
Explanation
NumPy (Numerical Python) is the foundation of scientific computing in Python. By the end of this guide, you will know how to:
- Create arrays: build 1D, 2D, and 3D arrays from scratch
- Access elements: index and slice arrays efficiently
- Perform operations: math on entire arrays at once
- Reshape data: transform array dimensions
- Aggregate values: sum, mean, min, max across arrays
- Filter data: boolean indexing and conditions
For numerical work, NumPy arrays are typically faster and more memory-efficient than Python lists. Most of the data science and machine learning ecosystem builds on NumPy.
Why this matters
NumPy is not optional. Pandas, Matplotlib, Scikit-learn, and TensorFlow all build on or interoperate with NumPy arrays. Learning NumPy is learning the language of scientific Python.
If something breaks here
Nothing to break yet. Move to Page 2.
What this page does
Confirms your environment is ready for NumPy.
Where this fits
Before installing NumPy, verify Python and pip work.
Code (this page)
python3 --version
pip3 --version
Explanation
Open Terminal and run these commands.
You should see:
Python 3.10.x (or higher)
pip 24.x.x (or similar)
If you don't have Python set up, complete the [Project Setup Guide](../project-setup/foundation.html) first.
Why this matters
NumPy 1.26, the version installed in this guide, requires Python 3.9 or higher. Verifying your setup prevents installation issues.
If something breaks here
- command not found: Complete the Project Setup guide first
- Old Python version: Run brew install python to update
What this page does
Installs the NumPy library.
Where this fits
Python is ready. Now add NumPy.
Code (this page)
pip3 install numpy
Explanation
Run pip3 install numpy in Terminal.
You will see:
Collecting numpy
Downloading numpy-1.26.x...
Successfully installed numpy-1.26.x
NumPy is now available for any Python script on your system.
Why this matters
This is a one-time setup. Once installed, NumPy is ready to import in any project.
If something breaks here
- Permission denied: Use pip3 install numpy --user
- Already installed: That's fine, continue
What this page does
Confirms NumPy is working correctly.
Where this fits
NumPy is installed. Now verify it imports.
Code (this page)
python3 -c "import numpy; print(numpy.__version__)"
Explanation
This command starts Python, imports NumPy, and prints its version.
You should see:
1.26.4
(Your version may differ; newer installs may report a 2.x version.)
Why this matters
If NumPy imports without errors, your installation is correct.
If something breaks here
- ModuleNotFoundError: NumPy didn't install. Re-run Page 3
- Version mismatch: Any 1.20+ version works for this guide
What this page does
Sets up a Python file for practicing NumPy.
Where this fits
NumPy is installed. Now create a workspace.
Code (this page)
mkdir ~/projects/numpy-practice
cd ~/projects/numpy-practice
touch numpy_basics.py
Explanation
These commands:
- Create a folder called numpy-practice
- Enter that folder
- Create an empty Python file
From now on, you will write code in numpy_basics.py and run it with python3 numpy_basics.py.
Why this matters
Having a dedicated file lets you build up code incrementally and re-run to test changes.
If something breaks here
- mkdir: File exists: The folder already exists. Just cd into it
- Permission denied: Check you're in a writable location
What this page does
Shows the standard way to import NumPy.
Where this fits
Your file is ready. Now add the import.
Code (this page)
import numpy as np

print("NumPy imported successfully")
Explanation
Open numpy_basics.py in VS Code and add this code.
import numpy as np is the universal convention. Everyone uses np as the alias. You will see np. everywhere in NumPy code.
Run it:
python3 numpy_basics.py
Output:
NumPy imported successfully
Why this matters
This import line will start every NumPy script you ever write. The np alias saves typing and matches all documentation.
If something breaks here
- ModuleNotFoundError: NumPy is not installed. Return to Page 3
- Typo: Check the spelling of numpy (lowercase)
What this page does
Creates a NumPy array from a Python list.
Where this fits
NumPy is imported. Now create data.
Code (this page)
import numpy as np

numbers = np.array([1, 2, 3, 4, 5])
print(numbers)
Explanation
np.array() converts a Python list into a NumPy array.
Run the file:
[1 2 3 4 5]
Notice: no commas between elements. That's how NumPy displays arrays.
Why this matters
This is the most common way to create arrays. Any Python list becomes an array with np.array().
If something breaks here
- Square brackets matter: np.array([1,2,3]) not np.array(1,2,3)
- Forgot np.: You need the alias prefix
What this page does
Shows the difference between NumPy arrays and Python lists.
Where this fits
You created an array. Now understand why it's different.
Code (this page)
import numpy as np

py_list = [1, 2, 3, 4, 5]
np_array = np.array([1, 2, 3, 4, 5])
print("List type:", type(py_list))
print("Array type:", type(np_array))
# Multiply by 2
print("List * 2:", py_list * 2)
print("Array * 2:", np_array * 2)
Explanation
Run the file:
List type: <class 'list'>
Array type: <class 'numpy.ndarray'>
List * 2: [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
Array * 2: [ 2 4 6 8 10]
- List * 2 duplicates the list
- Array * 2 multiplies each element
This is the power of NumPy: operations apply to every element automatically.
Why this matters
NumPy arrays support vectorized operations: math on entire arrays without explicit Python loops. This is faster and cleaner than looping through lists.
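As a rough sketch of what "without loops" means, the snippet below computes the same doubled values twice: once with an explicit Python loop over a list, and once with a single vectorized expression. The variable names are illustrative and not part of the guide's numpy_basics.py.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# Loop version: build a new list one element at a time
doubled_loop = []
for v in values:
    doubled_loop.append(v * 2)

# Vectorized version: one expression applied to every element
doubled_vec = np.array(values) * 2

print("Loop result:", doubled_loop)
print("Vectorized result:", doubled_vec)
print("Same values:", doubled_loop == doubled_vec.tolist())
```

Both approaches produce the same numbers; the vectorized form simply moves the loop out of your Python code and into NumPy.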
If something breaks here
Nothing should break. Re-read if the difference isn't clear.
What this page does
Shows how to check an array's data type.
Where this fits
Arrays have a consistent data type. Learn to inspect it.
Code (this page)
import numpy as np

integers = np.array([1, 2, 3])
floats = np.array([1.5, 2.5, 3.5])
mixed = np.array([1, 2.5, 3])
print("Integers dtype:", integers.dtype)
print("Floats dtype:", floats.dtype)
print("Mixed dtype:", mixed.dtype)
Explanation
Run the file:
Integers dtype: int64
Floats dtype: float64
Mixed dtype: float64
Every element in a NumPy array has the same type. When you mix integers and floats, NumPy converts everything to floats (the more flexible type).
Why this matters
Knowing the dtype helps you understand memory usage and avoid unexpected type conversions.
If something breaks here
Nothing should break. This is informational.
What this page does
Shows how to explicitly set an array's data type.
Where this fits
NumPy infers types automatically. Sometimes you need control.
Code (this page)
import numpy as np

# Force float type
as_float = np.array([1, 2, 3], dtype=float)
print("As float:", as_float)
# Force integer type
as_int = np.array([1.9, 2.9, 3.9], dtype=int)
print("As int:", as_int)
Explanation
Run the file:
As float: [1. 2. 3.]
As int: [1 2 3]
The dtype parameter forces the type:
- Integers become floats, shown with a trailing decimal point (1.)
- Floats are truncated toward zero to become integers (not rounded!)
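If rounding is what you want rather than truncation, one option is to round first and then convert. The sketch below uses np.rint, which rounds to the nearest integer, followed by astype(int). The variable names are illustrative.

```python
import numpy as np

values = np.array([1.9, 2.9, 3.9])

truncated = values.astype(int)         # drops the fractional part
rounded = np.rint(values).astype(int)  # rounds to nearest, then converts

print("Truncated:", truncated)  # [1 2 3]
print("Rounded:", rounded)      # [2 3 4]
```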
Why this matters
Sometimes you need specific types for compatibility with other libraries or to reduce memory usage.
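To make the memory point concrete, the following sketch stores the same values as 64-bit and 32-bit integers and compares them using the itemsize (bytes per element) and nbytes (total bytes) attributes. The array names are hypothetical.

```python
import numpy as np

# Same values, two explicit integer widths
wide = np.array([1, 2, 3], dtype=np.int64)
narrow = np.array([1, 2, 3], dtype=np.int32)

print("int64 bytes per element:", wide.itemsize)    # 8
print("int32 bytes per element:", narrow.itemsize)  # 4
print("int64 total bytes:", wide.nbytes)            # 24
print("int32 total bytes:", narrow.nbytes)          # 12
```

The saving is small for three elements but scales linearly with array size.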
If something breaks here
- Invalid dtype: Use int, float, bool, or NumPy types like np.int32
What this page does
Shows how to check an array's dimensions.
Where this fits
You can create arrays. Now understand their structure.
Code (this page)
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print("Shape:", arr.shape)
print("Size:", arr.size)
print("Dimensions:", arr.ndim)
Explanation
Run the file:
Shape: (5,)
Size: 5
Dimensions: 1
- shape: Tuple showing the size of each dimension. (5,) means 5 elements in 1 dimension
- size: Total number of elements
- ndim: Number of dimensions (1D, 2D, 3D, etc.)
Why this matters
Shape is critical when combining arrays or feeding data to machine learning models. Mismatched shapes cause errors.
If something breaks here
Nothing should break. These are read-only properties.
What this page does
Creates a two-dimensional array (matrix).
Where this fits
You made 1D arrays. Now add a dimension.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6]
])
print(matrix)
print("Shape:", matrix.shape)
Explanation
Run the file:
[[1 2 3]
[4 5 6]]
Shape: (2, 3)
A 2D array is a list of lists. This one has:
- 2 rows
- 3 columns
- Shape (2, 3): always (rows, columns)
Why this matters
Most real data is 2D: spreadsheets, images (height × width), datasets (samples × features).
If something breaks here
- Uneven rows: Each inner list must have the same length
What this page does
Creates a three-dimensional array.
Where this fits
You made 2D arrays. Now understand 3D.
Code (this page)
import numpy as np

cube = np.array([
[[1, 2], [3, 4]],
[[5, 6], [7, 8]]
])
print(cube)
print("Shape:", cube.shape)
Explanation
Run the file:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Shape: (2, 2, 2)
This is a 2×2×2 cube:
- 2 "layers"
- Each layer has 2 rows
- Each row has 2 elements
Why this matters
3D arrays are common for color images (height × width × RGB channels) and time series data.
If something breaks here
- Hard to visualize: Think of it as a stack of 2D arrays
What this page does
Creates arrays filled with zeros.
Where this fits
Creating arrays from lists is manual. NumPy has shortcuts.
Code (this page)
import numpy as np

zeros_1d = np.zeros(5)
zeros_2d = np.zeros((3, 4))
print("1D zeros:", zeros_1d)
print("2D zeros:")
print(zeros_2d)
Explanation
Run the file:
1D zeros: [0. 0. 0. 0. 0.]
2D zeros:
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
np.zeros() creates arrays filled with 0.0:
- Pass an integer for 1D: np.zeros(5)
- Pass a tuple for 2D+: np.zeros((3, 4))
Why this matters
Initialize arrays before filling them. Common pattern: create zeros, then assign values in a loop.
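The "create zeros, then assign in a loop" pattern might look like the sketch below. The array name and the values being stored (squares of the index) are chosen purely for illustration.

```python
import numpy as np

# Pre-allocate space for five results
squares = np.zeros(5)

# Fill each slot in a loop
for i in range(5):
    squares[i] = i ** 2

print("Squares:", squares)
```

Because np.zeros() produces floats by default, the stored values are floats even though the loop assigns integers.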
If something breaks here
- Missing parentheses: 2D needs double parens: np.zeros((3, 4)) not np.zeros(3, 4)
What this page does
Creates arrays filled with ones.
Where this fits
Like zeros, but with ones.
Code (this page)
import numpy as np

ones_1d = np.ones(4)
ones_2d = np.ones((2, 3))
print("1D ones:", ones_1d)
print("2D ones:")
print(ones_2d)
Explanation
Run the file:
1D ones: [1. 1. 1. 1.]
2D ones:
[[1. 1. 1.]
[1. 1. 1.]]
Same syntax as zeros. Useful for:
- Initializing weights
- Creating masks
- Placeholder data
Why this matters
Combined with multiplication, ones let you create any constant array: np.ones(5) * 7 gives [7. 7. 7. 7. 7.]
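NumPy also provides np.full, which builds a constant array directly. This guide does not otherwise cover it, so treat the following as a small side-by-side sketch with illustrative names.

```python
import numpy as np

via_ones = np.ones(5) * 7   # multiply an array of ones
via_full = np.full(5, 7.0)  # fill directly with the value

print("ones * 7:", via_ones)
print("np.full:", via_full)
print("Equal:", np.array_equal(via_ones, via_full))
```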
If something breaks here
Nothing should break. Same pattern as zeros.
What this page does
Creates arrays with sequential numbers.
Where this fits
You made constant arrays. Now make sequences.
Code (this page)
import numpy as np

simple = np.arange(10)
with_start = np.arange(5, 10)
with_step = np.arange(0, 20, 2)
print("0 to 9:", simple)
print("5 to 9:", with_start)
print("Even 0-18:", with_step)
Explanation
Run the file:
0 to 9: [0 1 2 3 4 5 6 7 8 9]
5 to 9: [5 6 7 8 9]
Even 0-18: [ 0 2 4 6 8 10 12 14 16 18]
np.arange(start, stop, step):
- arange(10) → 0 to 9 (stop is exclusive)
- arange(5, 10) → 5 to 9
- arange(0, 20, 2) → 0 to 18 by 2s
Like Python's range() but returns an array.
Why this matters
Generate index arrays, test data, or any arithmetic sequence quickly.
If something breaks here
- Unexpected result: Remember stop value is NOT included
What this page does
Creates arrays with evenly spaced numbers.
Where this fits
arange uses step size. linspace uses number of points.
Code (this page)
import numpy as np

five_points = np.linspace(0, 10, 5)
eleven_points = np.linspace(0, 1, 11)
print("5 points from 0-10:", five_points)
print("11 points from 0-1:", eleven_points)
Explanation
Run the file:
5 points from 0-10: [ 0. 2.5 5. 7.5 10. ]
11 points from 0-1: [0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
np.linspace(start, stop, num):
- Includes both start AND stop (unlike arange)
- Divides the range into num evenly spaced points
Why this matters
Essential for plotting. "Give me 100 points from 0 to 2π" is np.linspace(0, 2*np.pi, 100).
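A minimal sketch of that plotting preparation, stopping short of the plot itself: generate the x values with linspace, then compute y with a vectorized function. The names x and y are illustrative, and passing them to a plotting library is left out here.

```python
import numpy as np

# 100 evenly spaced x values covering one full period
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

print("Number of points:", x.size)
print("First x:", x[0])
print("Last x equals 2*pi:", np.isclose(x[-1], 2 * np.pi))
print("Largest sin(x):", y.max())
```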
If something breaks here
- Unexpected spacing: linspace always includes both endpoints
What this page does
Creates arrays with random numbers.
Where this fits
Sequences are predictable. Sometimes you need randomness.
Code (this page)
import numpy as np

uniform = np.random.rand(5)
integers = np.random.randint(1, 100, 5)
normal = np.random.randn(5)
print("Uniform 0-1:", uniform)
print("Integers 1-99:", integers)
print("Normal dist:", normal)
Explanation
Run the file (your numbers will differ):
Uniform 0-1: [0.374 0.951 0.732 0.598 0.156]
Integers 1-99: [42 87 23 64 11]
Normal dist: [-0.234 1.523 -0.891 0.432 0.067]
- rand(n) → Uniform distribution between 0 and 1
- randint(low, high, size) → Random integers (high exclusive)
- randn(n) → Normal distribution (mean=0, std=1)
Why this matters
Testing, simulations, machine learning initialization: random data is everywhere.
If something breaks here
- Different numbers each run: That's expected (it's random)
- Reproducible results: Use np.random.seed(42) before generating
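To see the effect of seeding, the sketch below sets the same seed twice and checks that both draws match. The seed value 42 comes from the tip above; the variable names are illustrative.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)  # reset to the same seed
second = np.random.rand(3)

print("First run: ", first)
print("Second run:", second)
print("Identical:", np.array_equal(first, second))
```

Newer NumPy code often creates a generator with np.random.default_rng(42) instead of seeding globally; this guide sticks with the np.random functions shown above.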
What this page does
Verifies you can create arrays multiple ways.
Where this fits
You have completed the array creation section.
Explanation
You should now be able to:
| Method | Creates |
| np.array([...]) | Array from list |
| np.zeros((r,c)) | Array of zeros |
| np.ones((r,c)) | Array of ones |
| np.arange(start, stop, step) | Sequential integers |
| np.linspace(start, stop, num) | Evenly spaced floats |
| np.random.rand(n) | Random floats 0-1 |
| np.random.randint(lo, hi, n) | Random integers |
Why this matters
Array creation is step one of every NumPy workflow. These seven methods cover most everyday use cases.
If something breaks here
Review pages 7-18 for any method you're unsure about.
What this page does
Shows how to access individual elements.
Where this fits
You can create arrays. Now access their data.
Code (this page)
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
print("First element:", arr[0])
print("Third element:", arr[2])
print("Last element:", arr[-1])
print("Second to last:", arr[-2])
Explanation
Run the file:
First element: 10
Third element: 30
Last element: 50
Second to last: 40
Indexing works like Python lists:
- [0] is the first element
- [-1] is the last element
- Indices start at 0
Why this matters
Accessing specific elements is fundamental. This syntax is used millions of times in NumPy code.
If something breaks here
- IndexError: Index is out of bounds. Check array size with len(arr)
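For reference, here is a small sketch of what an out-of-bounds access looks like and how you might catch it. The index value 7 is a deliberately bad, hypothetical choice.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
index = 7  # deliberately out of range for a 5-element array

try:
    print(arr[index])
except IndexError as err:
    print("Caught IndexError:", err)
    print("Valid indices run from 0 to", len(arr) - 1)
```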
What this page does
Shows how to access ranges of elements.
Where this fits
You accessed single elements. Now access multiple.
Code (this page)
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])
print("First three:", arr[0:3])
print("Index 2 to 5:", arr[2:6])
print("From index 4:", arr[4:])
print("Up to index 4:", arr[:4])
Explanation
Run the file:
First three: [10 20 30]
Index 2 to 5: [30 40 50 60]
From index 4: [50 60 70]
Up to index 4: [10 20 30 40]
Slice syntax: arr[start:stop]
- Start is included
- Stop is excluded
- Omit start = from beginning
- Omit stop = to end
Why this matters
Slicing extracts subsets without loops. Essential for data manipulation.
If something breaks here
- Wrong elements: Remember stop is exclusive
What this page does
Shows how to skip elements when slicing.
Where this fits
Basic slices are contiguous. Add step for patterns.
Code (this page)
import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Every other:", arr[::2])
print("Every third:", arr[::3])
print("Reversed:", arr[::-1])
print("Odd indices:", arr[1::2])
Explanation
Run the file:
Every other: [0 2 4 6 8]
Every third: [0 3 6 9]
Reversed: [9 8 7 6 5 4 3 2 1 0]
Odd indices: [1 3 5 7 9]
Full syntax: arr[start:stop:step]
- ::2 → every 2nd element
- ::-1 → reverse the array
- 1::2 → start at 1, take every 2nd
Why this matters
Extract patterns without loops. Reverse arrays instantly. Common in signal processing and image manipulation.
If something breaks here
Nothing should break. Experiment with different steps.
What this page does
Shows how to access elements in a matrix.
Where this fits
You indexed 1D arrays. Now handle 2D.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print("Element at row 0, col 0:", matrix[0, 0])
print("Element at row 1, col 2:", matrix[1, 2])
print("Element at row 2, col 1:", matrix[2, 1])
Explanation
Run the file:
Element at row 0, col 0: 1
Element at row 1, col 2: 6
Element at row 2, col 1: 8
2D indexing: matrix[row, column]
- Row 0 is the first row [1, 2, 3]
- Column 2 is the third column
- matrix[1, 2] = row 1, column 2 = 6
Why this matters
Accessing specific cells is how you read and modify matrix data.
If something breaks here
- IndexError: Row or column index too large
What this page does
Shows how to extract entire rows from a matrix.
Where this fits
You accessed single cells. Now extract rows.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print("Row 0:", matrix[0])
print("Row 1:", matrix[1])
print("Rows 0-1:", matrix[0:2])
Explanation
Run the file:
Row 0: [1 2 3]
Row 1: [4 5 6]
Rows 0-1: [[1 2 3]
[4 5 6]]
- matrix[0] → entire first row
- matrix[0:2] → rows 0 and 1 (2D result)
Think of rows as the first dimension.
Why this matters
Extracting rows is how you select samples from a dataset.
If something breaks here
Nothing should break. Experiment with different row indices.
What this page does
Shows how to extract entire columns from a matrix.
Where this fits
You extracted rows. Now extract columns.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print("Column 0:", matrix[:, 0])
print("Column 2:", matrix[:, 2])
print("Columns 0-1:", matrix[:, 0:2])
Explanation
Run the file:
Column 0: [1 4 7]
Column 2: [3 6 9]
Columns 0-1: [[1 2]
[4 5]
[7 8]]
The syntax matrix[:, n]:
- : means "all rows"
- n is the column index
[:, 0] reads as "all rows, column 0."
Why this matters
Extracting columns is how you select features from a dataset.
If something breaks here
- Forgot the colon: matrix[0] is a row. matrix[:, 0] is a column
What this page does
Verifies you can access array elements and slices.
Where this fits
You have completed the indexing and slicing section.
Explanation
You should now be able to:
| Operation | Syntax |
| Single element (1D) | arr[i] |
| Slice (1D) | arr[start:stop:step] |
| Single element (2D) | matrix[row, col] |
| Entire row | matrix[row] |
| Entire column | matrix[:, col] |
| Submatrix | matrix[r1:r2, c1:c2] |
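The submatrix row in the table combines row and column slicing in one expression. A short sketch, reusing the 3x3 matrix from the earlier pages (the slice bounds are picked for illustration):

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

# Rows 0-1 and columns 1-2 (stop indices are exclusive)
sub = matrix[0:2, 1:3]
print(sub)
print("Shape:", sub.shape)
```

Here sub contains the top-right 2x2 block: 2 and 3 from the first row, 5 and 6 from the second.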
Why this matters
Indexing is how you read data. You'll use these patterns constantly.
If something breaks here
Review pages 20-25 for any concept you're unsure about.
What this page does
Shows how to add arrays together.
Where this fits
You can access data. Now perform operations.
Code (this page)
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
print("a + b:", a + b)
print("a + 5:", a + 5)
Explanation
Run the file:
a + b: [11 22 33 44]
a + 5: [6 7 8 9]
- Array + Array: adds corresponding elements
- Array + Scalar: adds the scalar to every element
No loops needed. NumPy handles it automatically.
Why this matters
Vectorized operations are the core of NumPy. They're faster than loops and easier to read.
If something breaks here
- Shape mismatch: Arrays must be the same shape (for now)
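A shape mismatch looks like the sketch below, where a four-element array is added to a three-element one. The array c is a hypothetical addition to this page's example.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([10, 20, 30])  # three elements instead of four

try:
    print(a + c)
except ValueError as err:
    print("Shapes:", a.shape, "and", c.shape)
    print("ValueError:", err)
```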
What this page does
Shows subtraction, multiplication, and division.
Where this fits
You did addition. Now do the rest.
Code (this page)
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([2, 4, 5, 8])
print("a - b:", a - b)
print("a * b:", a * b)
print("a / b:", a / b)
print("a ** 2:", a ** 2)
Explanation
Run the file:
a - b: [ 8 16 25 32]
a * b: [ 20 80 150 320]
a / b: [5. 5. 6. 5.]
a ** 2: [ 100 400 900 1600]
All basic math operators work element-wise:
- - → subtraction
- * → multiplication
- / → division
- ** → power
Why this matters
This is how you process data at scale. One line replaces a loop over thousands of elements.
If something breaks here
- Division by zero: NumPy returns inf or nan, not an error
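The sketch below shows both outcomes using float arrays: a nonzero value divided by zero gives inf, and zero divided by zero gives nan. NumPy normally prints a RuntimeWarning in these cases; np.errstate is used here only to silence it. The array values are illustrative.

```python
import numpy as np

a = np.array([1.0, 0.0, 4.0])
b = np.array([0.0, 0.0, 2.0])

# Suppress the RuntimeWarning NumPy would otherwise print
with np.errstate(divide="ignore", invalid="ignore"):
    result = a / b

print("a / b:", result)
print("Is inf:", np.isinf(result))
print("Is nan:", np.isnan(result))
```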
What this page does
Shows built-in math functions.
Where this fits
Basic operators are limited. NumPy has advanced math.
Code (this page)
import numpy as np

arr = np.array([1, 4, 9, 16, 25])
print("Square root:", np.sqrt(arr))
print("Exponential:", np.exp(np.array([1, 2, 3])))
print("Logarithm:", np.log(np.array([1, 10, 100])))
Explanation
Run the file:
Square root: [1. 2. 3. 4. 5.]
Exponential: [ 2.718 7.389 20.086]
Logarithm: [0. 2.303 4.605]
Common functions:
- np.sqrt() → square root
- np.exp() → e^x
- np.log() → natural log
- np.sin(), np.cos() → trigonometry
- np.abs() → absolute value
Why this matters
Scientific computing needs these functions. NumPy applies them to entire arrays at once.
If something breaks here
- sqrt of negative: Returns nan (not a number)
- log of zero or negative: Returns -inf or nan
What this page does
Shows how to compare arrays.
Where this fits
Math produces numbers. Comparisons produce booleans.
Code (this page)
import numpy as np

arr = np.array([1, 5, 10, 15, 20])
print("Greater than 8:", arr > 8)
print("Equal to 10:", arr == 10)
print("Less than or equal 10:", arr <= 10)
Explanation
Run the file:
Greater than 8: [False False True True True]
Equal to 10: [False False True False False]
Less than or equal 10: [ True True True False False]
Comparisons return boolean arrays:
>, <, >=, <=, ==, !=
- Each element is compared independently
Why this matters
Boolean arrays are used for filtering. "Give me all values greater than 10" starts with a comparison.
If something breaks here
Nothing should break. These always work.
What this page does
Shows how to filter arrays using conditions.
Where this fits
You created boolean arrays. Now use them to filter.
Code (this page)
import numpy as np

arr = np.array([5, 12, 8, 21, 3, 15, 7])
# Get elements greater than 10
mask = arr > 10
print("Mask:", mask)
print("Filtered:", arr[mask])
# Or in one line
print("Direct:", arr[arr > 10])
Explanation
Run the file:
Mask: [False True False True False True False]
Filtered: [12 21 15]
Direct: [12 21 15]
Boolean indexing: arr[boolean_array]
- Returns only elements where the boolean is True
- The mask must be the same length as the array
Why this matters
Filtering is essential for data analysis. "Show me all sales above $1000" is boolean indexing.
If something breaks here
- Wrong length: Boolean array must match original array length
What this page does
Shows how to use AND, OR with boolean arrays.
Where this fits
Single conditions are limiting. Combine them for power.
Code (this page)
import numpy as np

arr = np.array([5, 12, 8, 21, 3, 15, 7])
# AND: both conditions must be true
print("Between 5 and 15:", arr[(arr >= 5) & (arr <= 15)])
# OR: either condition can be true
print("Less than 5 OR greater than 15:", arr[(arr < 5) | (arr > 15)])
# NOT: invert the condition
print("NOT greater than 10:", arr[~(arr > 10)])
Explanation
Run the file:
Between 5 and 15: [ 5 12 8 15 7]
Less than 5 OR greater than 15: [21  3]
NOT greater than 10: [5 8 3 7]
Logical operators for arrays:
- & → AND (both true)
- | → OR (either true)
- ~ → NOT (invert)
Important: Use parentheses around each condition!
Why this matters
Real filters are complex. "Sales between $100-$500 in January" needs AND.
If something breaks here
- Missing parentheses: arr > 5 & arr < 10 fails. Use (arr > 5) & (arr < 10)
What this page does
Shows how to change array values using conditions.
Where this fits
You filtered arrays. Now modify them conditionally.
Code (this page)
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Original:", arr)
# Set all values greater than 5 to 0
arr[arr > 5] = 0
print("After setting >5 to 0:", arr)
# Reset and try another operation
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
arr[arr % 2 == 0] = -1 # Replace evens with -1
print("Evens replaced:", arr)
Explanation
Run the file:
Original: [ 1 2 3 4 5 6 7 8 9 10]
After setting >5 to 0: [1 2 3 4 5 0 0 0 0 0]
Evens replaced: [ 1 -1 3 -1 5 -1 7 -1 9 -1]
arr[condition] = value modifies elements in place:
- Only elements where condition is True are changed
- Original array is modified (not a copy)
Why this matters
Data cleaning often requires conditional modification. "Replace all negative values with 0" is one line.
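The "replace all negative values with 0" case could be written as in this sketch. The readings array is made-up sample data.

```python
import numpy as np

readings = np.array([3, -1, 4, -5, 9, -2])

# Boolean mask selects the negatives; assignment overwrites them in place
readings[readings < 0] = 0

print("Cleaned:", readings)  # [3 0 4 0 9 0]
```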
If something breaks here
Nothing should break. Be careful: changes are permanent.
What this page does
Shows how to aggregate array values.
Where this fits
You filtered data. Now summarize it.
Code (this page)
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Also sum:", arr.sum())
print("Also mean:", arr.mean())
Explanation
Run the file:
Sum: 150
Mean: 30.0
Also sum: 150
Also mean: 30.0
Two ways to call aggregation:
- np.sum(arr) → function style
- arr.sum() → method style
Both work identically. Choose based on readability.
Why this matters
Aggregation answers questions: "What's the total sales?" "What's the average score?"
If something breaks here
Nothing should break. These always work on numeric arrays.
What this page does
Shows additional aggregation functions.
Where this fits
Sum and mean are common. There's more.
Code (this page)
import numpy as np

arr = np.array([45, 12, 78, 34, 56, 89, 23])
print("Minimum:", arr.min())
print("Maximum:", arr.max())
print("Index of min:", arr.argmin())
print("Index of max:", arr.argmax())
print("Standard deviation:", arr.std())
Explanation
Run the file:
Minimum: 12
Maximum: 89
Index of min: 1
Index of max: 5
Standard deviation: 26.10...
Key aggregations:
- min(), max() → smallest/largest value
- argmin(), argmax() → INDEX of smallest/largest
- std() → standard deviation
- var() → variance
Why this matters
Finding extremes and spread tells you about your data distribution.
If something breaks here
Nothing should break. These always work on numeric arrays.
What this page does
Shows how to aggregate rows or columns separately.
Where this fits
1D aggregation sums everything. 2D needs more control.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print("Matrix:\n", matrix)
print("Sum all:", matrix.sum())
print("Sum each column:", matrix.sum(axis=0))
print("Sum each row:", matrix.sum(axis=1))
Explanation
Run the file:
Matrix:
[[1 2 3]
[4 5 6]
[7 8 9]]
Sum all: 45
Sum each column: [12 15 18]
Sum each row: [ 6 15 24]
The axis parameter:
- No axis: aggregate everything
- axis=0: aggregate down columns (collapse rows)
- axis=1: aggregate across rows (collapse columns)
Why this matters
Datasets have rows (samples) and columns (features). You often need totals or means per column.
If something breaks here
- Confusing axis: 0 = down, 1 = across. Remember: axis=0 collapses the rows, leaving one value per column.
What this page does
Shows how to change an array's dimensions.
Where this fits
You created and aggregated arrays. Now transform their shape.
Code (this page)
import numpy as np

arr = np.arange(12)
print("Original:", arr)
print("Shape:", arr.shape)
reshaped = arr.reshape(3, 4)
print("\nReshaped to 3x4:")
print(reshaped)
reshaped2 = arr.reshape(4, 3)
print("\nReshaped to 4x3:")
print(reshaped2)
Explanation
Run the file:
Original: [ 0 1 2 3 4 5 6 7 8 9 10 11]
Shape: (12,)

Reshaped to 3x4:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Reshaped to 4x3:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
reshape(rows, cols) transforms dimensions:
- Total elements must stay the same (12 = 3×4 = 4×3)
- Data fills row by row
Why this matters
Machine learning often requires specific input shapes. Reshape transforms your data to fit.
If something breaks here
- cannot reshape: The product of the new dimensions must equal the original size
What this page does
Shows automatic dimension calculation.
Where this fits
You manually specified both dimensions. Let NumPy infer one.
Code (this page)
import numpy as np

arr = np.arange(24)
# Let NumPy figure out the number of columns
reshaped = arr.reshape(6, -1)
print("6 rows, auto columns:")
print(reshaped)
print("Shape:", reshaped.shape)
# Let NumPy figure out the number of rows
reshaped2 = arr.reshape(-1, 8)
print("\nAuto rows, 8 columns:")
print(reshaped2)
print("Shape:", reshaped2.shape)
Explanation
Run the file:
6 rows, auto columns:
[[ 0 1 2 3]
[ 4 5 6 7]
...
Shape: (6, 4)

Auto rows, 8 columns:
[[ 0 1 2 3 4 5 6 7]
...
Shape: (3, 8)
-1 means "calculate this dimension automatically":
- reshape(6, -1) with 24 elements → (6, 4) because 24 ÷ 6 = 4
- reshape(-1, 8) with 24 elements → (3, 8) because 24 ÷ 8 = 3
Why this matters
When you know one dimension but not the other, use -1 and let NumPy do the math.
If something breaks here
- Multiple -1: Only one dimension can be inferred
What this page does
Shows how to convert any array to 1D.
Where this fits
Reshape adds dimensions. Flatten removes them.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6]
])
print("Original shape:", matrix.shape)
print("Flattened:", matrix.flatten())
print("Raveled:", matrix.ravel())
print("Reshaped to 1D:", matrix.reshape(-1))
Explanation
Run the file:
Original shape: (2, 3)
Flattened: [1 2 3 4 5 6]
Raveled: [1 2 3 4 5 6]
Reshaped to 1D: [1 2 3 4 5 6]
Three ways to flatten:
- flatten() → always returns a copy
- ravel() → returns a view when possible (faster, but changes can affect the original)
- reshape(-1) → behaves like ravel
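The copy-versus-view difference can be checked directly. In this sketch, editing the flatten() result leaves the matrix alone, while editing the ravel() result writes through to it. np.shares_memory is used only as a confirmation; the variable names are illustrative.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

flat_copy = matrix.flatten()
flat_view = matrix.ravel()

flat_copy[0] = 99  # changes only the copy
print("After editing flatten() result:", matrix[0, 0])  # 1

flat_view[0] = 99  # writes through to matrix
print("After editing ravel() result:", matrix[0, 0])    # 99

print("ravel shares memory:", np.shares_memory(matrix, flat_view))
print("flatten shares memory:", np.shares_memory(matrix, flat_copy))
```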
Why this matters
Sometimes you need all elements in a single row. Flattening is common for passing data to functions.
If something breaks here
Nothing should break. All three methods work.
What this page does
Shows how to swap rows and columns.
Where this fits
Reshape changes dimensions. Transpose swaps them.
Code (this page)
import numpy as np

matrix = np.array([
[1, 2, 3],
[4, 5, 6]
])
print("Original (2x3):")
print(matrix)
print("\nTransposed (3x2):")
print(matrix.T)
print("Also transposed:")
print(np.transpose(matrix))
Explanation
Run the file:
Original (2x3):
[[1 2 3]
[4 5 6]]

Transposed (3x2):
[[1 4]
[2 5]
[3 6]]
Also transposed:
[[1 4]
[2 5]
[3 6]]
Transpose flips the matrix:
- Rows become columns
- Columns become rows
- Shape (2, 3) becomes (3, 2)
Why this matters
Linear algebra operations often require transposition. Matrix multiplication, least squares, and more use .T.
If something breaks here
Nothing should break. Transpose always works on 2D+ arrays.
What this page does
Shows how to combine arrays.
Where this fits
You reshaped arrays. Now join them.
Code (this page)
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("Concatenate 1D:", np.concatenate([a, b]))
# 2D arrays
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
print("\nStack vertically (more rows):")
print(np.concatenate([x, y], axis=0))
print("\nStack horizontally (more columns):")
print(np.concatenate([x, y], axis=1))
Explanation
Run the file:
Concatenate 1D: [1 2 3 4 5 6]

Stack vertically (more rows):
[[1 2]
[3 4]
[5 6]
[7 8]]
Stack horizontally (more columns):
[[1 2 5 6]
[3 4 7 8]]
np.concatenate([arrays], axis):
- axis=0: stack vertically (add rows)
- axis=1: stack horizontally (add columns)
Why this matters
Building datasets often means combining arrays from different sources.
If something breaks here
- Shape mismatch: Arrays must have compatible shapes on the join axis
What this page does
Shows alternative ways to combine arrays.
Where this fits
Concatenate joins. Stack adds dimensions.
Code (this page)
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("vstack (vertical):")
print(np.vstack([a, b]))
print("\nhstack (horizontal):")
print(np.hstack([a, b]))
print("\nstack (new dimension):")
print(np.stack([a, b]))
print("Shape:", np.stack([a, b]).shape)
Explanation
Run the file:
vstack (vertical):
[[1 2 3]
[4 5 6]]

hstack (horizontal):
[1 2 3 4 5 6]
stack (new dimension):
[[1 2 3]
[4 5 6]]
Shape: (2, 3)
Convenience functions:
- vstack: vertical stack (add rows)
- hstack: horizontal stack (add columns or extend 1D)
- stack: create a NEW dimension
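For two 1D inputs, stack with its default axis gives the same result as vstack, so the new dimension is easier to see when you choose where it goes. The sketch below assumes the same a and b as above and uses the axis parameter of np.stack.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.stack([a, b])             # default axis=0: each input becomes a row
columns = np.stack([a, b], axis=1)  # axis=1: each input becomes a column

print("axis=0 shape:", rows.shape)     # (2, 3)
print("axis=1 shape:", columns.shape)  # (3, 2)
print(columns)
```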
Why this matters
vstack and hstack are more readable than concatenate with axis.
If something breaks here
- Shape mismatch: Arrays must have compatible shapes
What this page does
Explains when changes affect the original array.
Where this fits
You've modified arrays. Understand when copies are made.
Code (this page)
import numpy as np

original = np.array([1, 2, 3, 4, 5])
# Slicing creates a VIEW
view = original[1:4]
view[0] = 99
print("After modifying view:")
print("Original:", original)
print("View:", view)
# copy() creates a COPY
original = np.array([1, 2, 3, 4, 5])
copy = original[1:4].copy()
copy[0] = 99
print("\nAfter modifying copy:")
print("Original:", original)
print("Copy:", copy)
Explanation
Run the file:
After modifying view:
Original: [ 1 99 3 4 5]
View: [99 3 4]

After modifying copy:
Original: [1 2 3 4 5]
Copy: [99 3 4]
Critical concept:
- View: Points to original data. Changes affect both.
- Copy: Independent data. Changes are isolated.
Slices are views by default. Use .copy() when you need independence.
Why this matters
This is a common source of bugs. Knowing when you have a view prevents accidental data corruption.
If something breaks here
Nothing breaks, but unexpected changes happen if you forget about views.
What this page does
Shows conditional element selection.
Where this fits
Boolean indexing filters. Where transforms.
Code (this page)
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Replace based on condition
result = np.where(arr > 5, arr, 0)
print("Keep if >5, else 0:", result)
# Different values for true/false
result2 = np.where(arr % 2 == 0, "even", "odd")
print("Even or odd:", result2)
Explanation
Run the file:
Keep if >5, else 0: [ 0 0 0 0 0 6 7 8 9 10]
Even or odd: ['odd' 'even' 'odd' 'even' 'odd' 'even' 'odd' 'even' 'odd' 'even']
np.where(condition, if_true, if_false):
- Evaluates condition for each element
- Returns if_true where True, if_false where False
- Like a vectorized if-else
Why this matters
Where is cleaner than boolean indexing when you need both branches (true AND false cases).
If something breaks here
- Shape mismatch: if_true and if_false must broadcast to the condition shape
What this page does
Confirms you have mastered all skills in this guide.
Where this fits
This is the end. Verify everything works together.
Explanation
Complete this final test in your numpy_basics.py:
import numpy as np

# Create data
data = np.random.randint(0, 100, 20)
print("Data:", data)
# Reshape to 4x5
matrix = data.reshape(4, 5)
print("\nAs 4x5 matrix:")
print(matrix)
# Column means
print("\nColumn means:", matrix.mean(axis=0))
# Filter values above average
avg = data.mean()
print(f"\nAverage: {avg:.2f}")
print("Above average:", data[data > avg])
# Replace below average with 0
cleaned = np.where(data >= avg, data, 0)
print("Cleaned:", cleaned)
print("\nNumPy Fundamentals Complete!")
Run it. If you understand every line, you've completed the guide.
Why this matters
You now have the foundation for data science in Python. NumPy skills unlock Pandas, Matplotlib, Scikit-learn, and beyond.
If something breaks here
- Review the specific section where the error occurred
- Each skill builds on the previous one