Table of contents

Overview
Project 1: Probability, Data, and Conditional Reasoning
Project 2: Matrices and Vectors
Project 3: Building and Training a Neuron

Overview

This course includes a project component made up of the three projects described below.

Project 1: Probability, Data, and Conditional Reasoning

Due Date: Wednesday, February 18th, 11:59PM

Get help early!

This project is meant to be a straightforward extension of what you are learning in class and recitation, but it still may be challenging for you! Make sure you’re getting started early and looking for help!

Mohi’s office hours

Feel free to attend Kenny’s office hours, listed on the syllabus, or Mohi’s, which are from 2-3PM Mondays in the first floor atrium of Davis Hall (Mohi will find a table and post up!)

This is an individual assignment. Please ensure you follow the rules of Academic Integrity established by UB and in the course!

This project is designed to introduce you to working with data in Python using basic Python features (such as lists, dictionaries, loops, and functions) and NumPy, while reinforcing key ideas from Chapters 1 and 2 of the course, including sets, counting, probability, conditional probability, and Bayes’ rule.

You will complete the project in three parts. Each part builds toward using real data to reason about uncertainty in a meaningful real-world context. You can check this link for help.

Part 1: Sets, Counting, and Probability (30 points)

In this part, you will practice basic Python operations and use code to compute quantities that would be difficult or tedious to calculate by hand.

Using Python sets, define at least two sets (can be defined using any random functions in Python or manually defined by you) and compute:
- their union
- their intersection
- their difference
Briefly explain what each operation represents.
(10 points)
Consider a simple probability scenario (for example, drawing cards from a deck, rolling dice, or another discrete experiment).
- Use Python to compute the probability of an event of interest. Your code should be able to receive inputs and print the proper output.
- Explain why using code is helpful for this calculation.
(20 points)
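
As a minimal sketch of the kind of code Part 1 asks for (not a model solution), the snippet below applies the three set operations to two hand-picked sets and computes a dice probability by enumerating outcomes. The sets `A` and `B`, the helper name `prob_sum_of_two_dice`, and the choice of event are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Two manually defined example sets (illustrative values)
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

union = A | B          # elements in A or B (or both)
intersection = A & B   # elements in both A and B
difference = A - B     # elements in A but not in B


def prob_sum_of_two_dice(target):
    """Probability that two fair six-sided dice sum to `target`."""
    outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs
    favorable = [o for o in outcomes if sum(o) == target]
    return Fraction(len(favorable), len(outcomes))


print(union, intersection, difference)
print(prob_sum_of_two_dice(7))  # 6 favorable outcomes out of 36
```

Enumerating every outcome is tedious by hand but trivial in code, which is one way to answer the "why is code helpful" question.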

Part 2: Conditional Probability (30 points)

In this part, you will use data visualization to explore conditional probabilities.

Create a data structure to represent a small dataset using lists or NumPy arrays (for example, a list of lists or a 2D NumPy array where rows represent observations and columns represent variables).
(5 points)
Use basic Python loops or NumPy operations to compute summary statistics (such as counts or frequencies) for your data and print them in a readable format (e.g., as a text-based table).
(10 points)
Compute one conditional probability of your own choice and explain what it means.
(10 points)
Compute a second, different conditional probability of your own choice and explain what it means.
(5 points)
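
A conditional probability can be read directly off counts in a small array. The sketch below assumes a made-up dataset with two binary columns (`studied`, `passed`); the data and variable names are hypothetical and only illustrate the counting pattern.

```python
import numpy as np

# Hypothetical dataset: each row is one observation,
# column 0 = studied (1/0), column 1 = passed (1/0)
data = np.array([
    [1, 1],
    [1, 1],
    [1, 0],
    [0, 1],
    [0, 0],
    [0, 0],
])

studied = data[:, 0] == 1
passed = data[:, 1] == 1

# P(passed | studied) = count(studied and passed) / count(studied)
p_pass_given_studied = np.sum(studied & passed) / np.sum(studied)
print(f"P(passed | studied) = {p_pass_given_studied:.3f}")
```

Restricting the denominator to the rows where the condition holds is what makes this a conditional, rather than a joint, probability.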

Part 3: Bayes’ Rule and Real-World Data (40 points)

The dataset for this project concerns COVID-19 symptoms and test results.
You can download the CSV file from here.

Each row represents an individual, and columns indicate whether that person tested positive for COVID-19 and whether they experienced certain symptoms, as well as Gender, Age, and City.

Use Python’s built-in csv module to read the file into a list of lists or dictionaries, then convert relevant parts to NumPy arrays for computations if needed. Avoid using any external libraries beyond NumPy for data processing.

Compute the probability of a positive COVID-19 test in this dataset.
(10 points)
Compute at least two conditional probabilities involving symptoms (for example, the probability of a symptom given a positive test, or the probability of a positive test given a symptom).
(10 points)
Use Bayes’ rule to compute other conditional probabilities indirectly. Use your previous examples.
(10 points)
Interpret your result in plain language. What does this probability mean, and why is Bayes’ rule useful in this context?
(10 points)
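
One possible shape for the Part 3 computation is sketched below. It reads CSV text with the built-in `csv` module and checks Bayes' rule against a direct count. The column names `Positive` and `Fever` and all values are hypothetical stand-ins; the real file's headers and encoding may differ.

```python
import csv
import io

# Stand-in for the real file: column names and values here are hypothetical.
sample_csv = io.StringIO(
    "Positive,Fever\n"
    "1,1\n"
    "1,1\n"
    "1,0\n"
    "0,1\n"
    "0,0\n"
    "0,0\n"
    "0,0\n"
    "0,0\n"
)
rows = list(csv.DictReader(sample_csv))
n = len(rows)

positive = [r["Positive"] == "1" for r in rows]
fever = [r["Fever"] == "1" for r in rows]
both = sum(f and p for f, p in zip(fever, positive))

p_pos = sum(positive) / n                 # P(positive)
p_fever = sum(fever) / n                  # P(fever)
p_fever_given_pos = both / sum(positive)  # P(fever | positive)

# Bayes' rule: P(positive | fever) = P(fever | positive) * P(positive) / P(fever)
p_pos_given_fever = p_fever_given_pos * p_pos / p_fever

# Direct count of the same quantity, for comparison
direct = both / sum(fever)
print(p_pos_given_fever, direct)
```

With a real file, the `io.StringIO` object would be replaced by a file opened with `open(path, newline="")`, as the `csv` module documentation recommends.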

Submission Instructions

Submit a single Jupyter notebook that includes:

All code used in the project
Written explanations in Markdown cells
Deadline: Feb 18th
Upload location: UBLearns → Course Page → Assignments → Project 1

Clarity of explanation and correctness of reasoning are more important than code complexity.

Project 2: Matrices and Vectors

Matrix Vector Multiplication

Please read the provided two scenarios carefully and then proceed to do the tasks.

You should come up with a way to INPUT the matrices and vectors for whatever scenario you design. Please don’t reuse the student example below; you can use service ratings, warehouse storage, or another scenario.

Scenario

A school wants a simple scoring rule to estimate whether a student is likely to pass a quiz.

Each student has two inputs:

x1 = hours studied
x2 = hours of sleep

The scoring rule uses a weight vector:

w = [2
     1]

This means studying counts twice as much as sleep. The score for one student is:

s = x · w = 2x1 + 1x2

Given Data

The input data for 6 students is stored in the matrix X, where each row is one student:

X = [1 6
     2 5
     3 4
     4 3
     5 2
     6 1]

Compute the score vector:

s = Xw

Sample Answer

Given the matrix X and the weight vector w above, compute

s = Xw

by multiplying each row of X by w:

s = [(1)(2) + (6)(1)
     (2)(2) + (5)(1)
     (3)(2) + (4)(1)
     (4)(2) + (3)(1)
     (5)(2) + (2)(1)
     (6)(2) + (1)(1)]
  = [8
     9
     10
     11
     12
     13]

If the passing rule is s ≥ 10, then the predicted label vector (1 = pass, 0 = fail) is:

y_hat = [0
         0
         1
         1
         1
         1]

because 8 < 10, 9 < 10, and 10, 11, 12, 13 ≥ 10.

If we change the weight vector to:

w' = [1
      2]

then:

s' = Xw'
   = [(1)(1) + (6)(2)
      (2)(1) + (5)(2)
      (3)(1) + (4)(2)
      (4)(1) + (3)(2)
      (5)(1) + (2)(2)
      (6)(1) + (1)(2)]
   = [13
      12
      11
      10
      9
      8]
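
The hand computations above can be checked with NumPy. This sketch uses the scenario's X, w, and w' and the passing rule s ≥ 10; the variable names `w_prime` and `y_hat` and the 1 = pass encoding are illustrative choices.

```python
import numpy as np

# Rows are students: [hours studied, hours of sleep]
X = np.array([[1, 6], [2, 5], [3, 4], [4, 3], [5, 2], [6, 1]])
w = np.array([2, 1])        # studying counts twice as much as sleep
w_prime = np.array([1, 2])  # swapped weights

s = X @ w                      # one score per student
s_prime = X @ w_prime
y_hat = (s >= 10).astype(int)  # 1 = predicted pass, 0 = predicted fail

for i, (x, score, label) in enumerate(zip(X, s, y_hat), start=1):
    print(f"Student {i}: x={x.tolist()}, s={int(score)}, y_hat={int(label)}")
print("s' =", s_prime.tolist())
```

Swapping the weights reverses the ranking here because the two features trade off exactly (each row sums to 7).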

Tasks

Come up with a scenario with 3 inputs.
Compute the score for each desired case using matrix vector multiplication.
List the scores as a column vector.
Define a passing rule and check whether each case satisfies it.
Answer these questions:
- (a) Which cases are predicted to pass
- (b) How many cases are predicted to pass
- (c) Which case has the highest score
- (d) What happens to all scores if we change the weight vector to:

w = [1
     2
     3]

Explain what this means in words.

Expected Output Format

Students should print results in a clear format such as:

Case number
Input vector [x1, x2, x3]
Score s
Prediction y_hat

Two Matrix Multiplication

Scenario

A teacher has two raw grades for each student:

Homework
Quizzes

The teacher wants to convert these raw grades into two new scores that will appear on the report card:

Knowledge Score
Participation Score

The idea is:

Knowledge Score should depend more on Quizzes.
Participation Score should depend more on Homework.

Given Data

The raw grade matrix for 4 students is:

X = [9  7
     6  8
     10 5
     7  6]

Weight Matrix

The teacher chooses a weight matrix W that turns (HW, Quiz) into (Knowledge, Participation). Each column of W holds the weights for one report card score:

W = [1 2
     3 1]

This means:

Knowledge Score = 1·HW + 3·Quiz
Participation Score = 2·HW + 1·Quiz

Compute the report card scores using:

Y = XW

Answer

Row 1 of Y (student 1):

y11 = (9)(1) + (7)(3) = 30
y12 = (9)(2) + (7)(1) = 25

Row 2 of Y (student 2):

y21 = (6)(1) + (8)(3) = 30
y22 = (6)(2) + (8)(1) = 20

Row 3 of Y (student 3):

y31 = (10)(1) + (5)(3) = 25
y32 = (10)(2) + (5)(1) = 25

Row 4 of Y (student 4):

y41 = (7)(1) + (6)(3) = 25
y42 = (7)(2) + (6)(1) = 20
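
The row-by-row arithmetic above can be verified with a single matrix product. The sketch below uses the raw grades from the worked rows and the weight matrix W from the scenario.

```python
import numpy as np

# Rows are students: [Homework, Quizzes]
X = np.array([[9, 7], [6, 8], [10, 5], [7, 6]])

# Columns are output scores: [Knowledge, Participation]
W = np.array([[1, 2],
              [3, 1]])

Y = X @ W  # shape (4, 2): one row of report card scores per student
print(Y)
```

Each entry of `Y` is the dot product of one student's row of `X` with one column of `W`, which is exactly what the y11 ... y42 lines compute by hand.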

So the final report card score matrix is:

Y = [30 25
     30 20
     25 25
     25 20]

Tasks

Come up with a scenario of the same nature.
Compute the score for each desired case using matrix multiplication.
List the scores.
Define a passing rule (for example, a threshold on the mean of the scores, or a separate rule for each score) and check whether each case satisfies it.
Answer these questions:
- (a) Which cases are predicted to pass
- (b) How many cases are predicted to pass
- (c) Which case has the highest average score

Expected Output Format

Students should print results in a clear format such as:

Case number
Input Matrix [x1, x2]
Score Vector s
Prediction y_hat

Final Deliverables

Students will submit:

A Notebook that includes their code and prints results.
Short written explanations INSIDE the notebook’s Markdown cells.

Project 3: Building and Training a Neuron

1. Introduction

This project builds a beginner-friendly machine learning model that connects vectors and matrices to neural networks. The model used here is a single neuron with a sigmoid activation, which is also known as logistic regression. The purpose of the project is to show how an input vector can be combined with a weight vector and a bias to produce a probability between 0 and 1. This gives a simple example of how neural networks work at their most basic level.

2. A Motivating Example

Suppose we want to classify points into one of two groups using two input features. Each data point has two values, and the goal is to predict whether the label should be 0 or 1. Instead of making the prediction by hand each time, we want a model that learns a rule from the data.

This project follows the same general learning process used in class:

Define a model class
Define a loss or error idea
Define an optimization method to improve the model

3. Model Class

The model class in this project is a single neuron. Each example is represented by a vector:

\[\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\]

The model parameters are a weight vector and a bias:

\[\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}, \qquad b \in \mathbb{R}\]

The linear score is computed as:

\[z = \mathbf{w} \cdot \mathbf{x} + b\]

This score is then passed through the sigmoid function to get a probability:

\[p = \sigma(z) = \frac{1}{1 + e^{-z}}\]

The final class prediction is:

\[\hat{y} = \begin{cases} 1 & \text{if } p \geq 0.5 \\ 0 & \text{if } p < 0.5 \end{cases}\]
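
The forward pass defined above (dot product, bias, sigmoid, threshold) can be sketched in plain Python. The function names and the example weights below are illustrative assumptions, not values prescribed by the project.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """p = sigmoid(w . x + b)"""
    return sigmoid(dot(w, x) + b)


def predict_label(x, w, b):
    """Class 1 if p >= 0.5, otherwise class 0."""
    return 1 if predict_proba(x, w, b) >= 0.5 else 0


# Illustrative (untrained) parameters
w = [1.0, -1.0]
b = 0.0
print(predict_proba([2.0, 1.0], w, b), predict_label([2.0, 1.0], w, b))
```

Because the sigmoid equals 0.5 exactly at z = 0, the rule p ≥ 0.5 is the same as z ≥ 0, so the decision boundary is the line w · x + b = 0.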

4. Loss and Error Idea

The model makes a probability prediction $p$, and we compare it to the true label $y$. A simple error term for one example is:

\[\text{error} = p - y\]

This error tells us whether the model predicted too high or too low. If the error is positive, the prediction was too large. If the error is negative, the prediction was too small.

5. Optimization with Gradient Descent

To improve the model, we update the weights and bias using gradient descent. For a learning rate $\alpha$, the update rules are:

\[w_1 \leftarrow w_1 - \alpha \cdot \text{error} \cdot x_1\]

\[w_2 \leftarrow w_2 - \alpha \cdot \text{error} \cdot x_2\]

\[b \leftarrow b - \alpha \cdot \text{error}\]

These updates are repeated over the dataset until the model improves.
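
A minimal training loop built from these update rules might look like the sketch below. It applies the updates one example at a time. The toy dataset, the zero initialization, the learning rate of 0.5, and the 1000 epochs are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, alpha=0.5, epochs=1000):
    """Per-example gradient descent for a two-input neuron."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in zip(X, y):
            p = sigmoid(w[0] * x1 + w[1] * x2 + b)
            error = p - label              # positive: too high, negative: too low
            w[0] -= alpha * error * x1
            w[1] -= alpha * error * x2
            b -= alpha * error
    return w, b


# Hypothetical toy data: the label equals the first feature
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 0, 1, 1]

w, b = train(X, y)
print("w =", w, "b =", b)
```

On this toy data the first weight should grow positive while the bias goes negative, since only the first feature carries information about the label.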

6. Vector and Matrix Representation

The full dataset can be stored as an input matrix:

\[X = \begin{bmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ \vdots & \vdots \\ x_{n1} & x_{n2} \end{bmatrix}\]

and a label vector:

\[\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\]

This matrix form makes it easier to compute predictions for many examples and connects directly to the linear algebra ideas discussed in class.
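
With the data in matrix form, the scores for all examples come from one matrix-vector product. The sketch below assumes NumPy and uses hypothetical, untrained parameters purely to show the shapes involved.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical data: n = 4 examples, 2 features each
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
w = np.array([2.0, 0.0])  # illustrative weights, not trained
b = -1.0

z = X @ w + b                  # shape (4,): one linear score per row of X
p = sigmoid(z)                 # shape (4,): one probability per example
y_hat = (p >= 0.5).astype(int)
print(z, p, y_hat)
```

The scalar `b` is broadcast across all rows, so every example gets the same bias added to its score.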

7. Implementation Steps

The notebook for this project should follow these steps:

Create a small dataset with two input features and binary labels
Store the data as a matrix $X$ and a label vector $\mathbf{y}$
Write a dot product function
Write the sigmoid function
Write a prediction function that returns probabilities
Train the model using gradient descent
Evaluate the model using accuracy
Display a small prediction table and explain what the learned weights mean

8. Results

After training, the model should report the final learned parameters, predicted values, and classification accuracy.

Example	x1	x2	True Label	Predicted Probability	Predicted Label
1	0.2	1.1	0	0.31	0
2	1.4	2.0	1	0.84	1
3	0.8	0.5	0	0.42	0
4	1.7	1.3	1	0.76	1

Table 1. Example predictions from the trained single neuron model.

8.1 Accuracy

The classification accuracy is computed as:

\[\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}\]
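
The accuracy formula translates directly into a short helper. The function name `accuracy` and the example labels below are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


print(accuracy([0, 1, 0, 1], [0, 1, 1, 1]))  # 3 of 4 predictions correct
```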

9. Interpretation of the Weights

The learned weights show how strongly each feature affects the prediction. A positive weight means that increasing that feature makes class 1 more likely. A negative weight means that increasing that feature makes class 1 less likely. The bias shifts the decision boundary.

10. What the Final Submission Should Look Like

The final submission should be a completed Jupyter Notebook. It should not just be raw code by itself. The notebook should combine short markdown explanations with code cells and printed results.

A strong submission should include the following parts in order:

A title and short introduction explaining the purpose of the project
A motivating example or short explanation of the classification task
A code cell that creates a small dataset with two input features and binary labels
A markdown explanation of the input vector, weight vector, bias, and dataset matrix
Code cells for the dot product function and sigmoid function
A markdown explanation of how the model computes:

\[z = \mathbf{w} \cdot \mathbf{x} + b\]

and converts that score into a probability.

A code cell for the prediction function
A markdown explanation of gradient descent and the update rule
A code cell that trains the model by updating the weights and bias
A results section that prints the final weights, final bias, and model accuracy
A small example prediction table with actual values
A short concluding explanation of what the learned weights mean

The final notebook should read like a guided walkthrough. It should include both code and short written explanations inside markdown cells. The example prediction table should be filled in with real model outputs, not left blank.

11. Conclusion

This project shows how a simple neural network unit can be built from basic linear algebra. By representing data as vectors and matrices, computing a dot product, applying the sigmoid function, and updating parameters with gradient descent, we can train a beginner-friendly classification model. This makes the connection between linear algebra and neural networks much easier to see.