Last Minute Notes (LMNs) Artificial Intelligence

Last Updated : 23 Jul, 2025

Artificial Intelligence (AI) refers to the simulation of human intelligence processes by machines, especially computer systems. AI encompasses tasks like learning, reasoning, problem-solving, perception, and language understanding. The ultimate goal of AI is to create systems that can perform tasks that would typically require human intelligence, such as recognizing speech, making decisions, and visual perception.

Search Algorithms in AI

Search problems in AI involve finding a path from an initial state to a goal state. Formulating a search problem means defining the environment (the state space), the available actions, and the rules for transitioning between states.

**1. Uninformed Search**

**Uninformed search algorithms** do not have any additional information about the goal beyond the problem's definition. They explore the search space without any heuristics.

**a) Breadth-First Search (BFS)**

BFS uses a queue (FIFO structure) to explore all nodes at the current depth level before moving on to nodes at the next level. It starts at the root node and explores its neighbors first, then their neighbors, and so on.

Because nodes are expanded in order of depth, the shallowest goal is found first. BFS is therefore complete whenever the branching factor is finite, and it is optimal when every step has the same cost.

**Complete:** Yes (if the branching factor b is finite)
**Time complexity:** O(b^d), where b is the branching factor and d is the depth of the shallowest goal
**Space complexity:** O(b^d)
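The queue-based expansion described above can be sketched in Python as follows. This is a minimal illustration, assuming the search space is given as an adjacency dictionary; the names `bfs` and `graph` and the example graph are hypothetical.

```python
from collections import deque

def bfs(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    frontier = deque([start])      # FIFO queue of nodes waiting to be expanded
    parent = {start: None}         # also serves as the set of discovered nodes
    while frontier:
        node = frontier.popleft()  # oldest (shallowest) node first
        if node == goal:
            path = []
            while node is not None:        # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:          # skip nodes that were already discovered
                parent[nxt] = node
                frontier.append(nxt)
    return None

# Hypothetical example graph (adjacency lists)
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"], "E": ["F"], "F": []}
print(bfs(graph, "A", "F"))  # ['A', 'B', 'D', 'F']
```

Both `A-B-D-F` and `A-C-D-F` have three edges; the sketch returns the first one it discovers.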

**b) Uniform Cost Search (UCS)**

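UCS uses a priority queue where nodes are prioritized by their path cost g(n). Unlike BFS, it expands the node with the lowest cumulative cost first, so it finds an optimal path as long as no step cost is negative.

**Complete:** Yes (if b is finite and every step cost is at least some ε > 0)
**Time complexity:** O(b^{1 + ⌊C*/ε⌋}), where b is the branching factor, C* is the cost of the optimal solution, and ε is the smallest step cost; when all step costs are equal this is O(b^{d+1})
**Space complexity:** O(b^{1 + ⌊C*/ε⌋})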
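A minimal Python sketch of this idea, assuming a weighted adjacency dictionary with non-negative step costs. The function name `ucs` and the example graph `weighted` are illustrative assumptions, not part of any library.

```python
import heapq

def ucs(graph, start, goal):
    """Return (cost, path) of a cheapest path from start to goal, or None."""
    frontier = [(0, start, [start])]   # priority queue ordered by path cost g(n)
    best = {start: 0}                  # cheapest known cost to each node
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:               # goal test on expansion keeps the result optimal
            return cost, path
        if cost > best.get(node, float("inf")):
            continue                   # stale queue entry: a cheaper route was found later
        for nxt, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph: node -> list of (neighbor, step cost)
weighted = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)], "G": []}
print(ucs(weighted, "S", "G"))  # (4, ['S', 'A', 'B', 'G'])
```

The cheapest route uses three edges (cost 4), even though `S-A-G` has only two edges (cost 7), which is where UCS and BFS differ.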

**c) Depth-First Search (DFS)**

DFS uses a stack (LIFO structure) to explore as far down a branch as possible before backtracking. It is memory efficient, but it may fail to find a solution in infinite-depth spaces or when the state graph contains cycles and visited states are not tracked.

**Complete:** No (it can loop forever in cyclic or infinite-depth search spaces)
**Time complexity:** O(b^m), where m is the maximum depth
**Space complexity:** O(bm)
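The stack-based exploration can be sketched as below. This is an illustrative version with hypothetical names (`dfs`, `graph`); it adds a `visited` set so that the cycle between A and B in the example does not cause an infinite loop, which costs extra memory compared with the O(bm) bound of plain tree-search DFS.

```python
def dfs(graph, start, goal):
    """Return some path from start to goal (not necessarily the shortest), or None."""
    stack = [[start]]              # LIFO stack of partial paths
    visited = set()                # guards against cycles
    while stack:
        path = stack.pop()         # most recently added path first
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Push neighbors in reverse so the first-listed neighbor is explored first
        for nxt in reversed(graph.get(node, [])):
            if nxt not in visited:
                stack.append(path + [nxt])
    return None

# Hypothetical graph containing a cycle A -> B -> A
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["E"], "D": ["F"], "E": ["F"], "F": []}
print(dfs(graph, "A", "F"))  # ['A', 'B', 'D', 'F']
```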

**d) Depth-Limited Search**

Depth-limited search is a variation of DFS in which a depth limit l is imposed to prevent infinite recursion or exploring too deep. It stops descending when the depth limit is reached, even if the goal has not been found.

**Complete:** No (the goal is missed if it lies deeper than the limit l)
**Time complexity:** O(b^l), where l is the depth limit
**Space complexity:** O(bl)

**e) Iterative Deepening Depth-First Search (IDDFS)**

IDDFS combines the memory efficiency of DFS and the completeness of BFS. It performs a series of depth-limited searches, incrementally increasing the depth limit until the goal is found.

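**Complete:** Yes (if the branching factor is finite).
**Time complexity:** O(b^d), the same order as BFS; shallow levels are re-expanded on every iteration, but the deepest level dominates the total.
**Space complexity:** O(bd), where d is the depth of the shallowest solution.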
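A short Python sketch of depth-limited search and the iterative-deepening loop built on top of it. The function names, the `max_depth` cutoff, and the example graph are assumptions made for illustration.

```python
def depth_limited(graph, node, goal, limit, path):
    """DFS that never descends more than `limit` edges below `node`."""
    if node == goal:
        return path
    if limit == 0:
        return None                          # cutoff reached
    for nxt in graph.get(node, []):
        if nxt not in path:                  # avoid cycles along the current path
            found = depth_limited(graph, nxt, goal, limit - 1, path + [nxt])
            if found is not None:
                return found
    return None

def iddfs(graph, start, goal, max_depth=20):
    """Run depth-limited searches with limits 0, 1, 2, ... until the goal is found."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit, [start])
        if found is not None:
            return limit, found
    return None

# Hypothetical graph: the first branch reaches G at depth 4, the second at depth 2
graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["E"], "E": ["G"], "G": []}
print(iddfs(graph, "A", "G"))  # (2, ['A', 'C', 'G'])
```

Plain DFS on this graph would return the deeper path through B first; iterative deepening returns the shallowest one while storing only the current path.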

**f) Bidirectional Search**

Bidirectional search runs two simultaneous searches: one from the start state and the other from the goal state. The searches meet in the middle, significantly reducing the search space.

**Complete:** Yes (if both directions use BFS and the branching factor is finite)
**Time complexity:** O(b^{d/2})
**Space complexity:** O(b^{d/2})
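To see why meeting in the middle helps, here is a small worked comparison. The values b = 10 and d = 6 are hypothetical, and the counts are order-of-magnitude estimates of generated nodes rather than exact figures.

```latex
b = 10,\; d = 6:\qquad
\underbrace{b^{d} = 10^{6}}_{\text{one-directional BFS}}
\quad\text{versus}\quad
\underbrace{2\,b^{d/2} = 2 \cdot 10^{3} = 2000}_{\text{two half-depth searches}}
```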

**2. Informed Search (Heuristic Search)**

**Informed search algorithms** use problem-specific knowledge (heuristics) to find the solution more efficiently by guiding the search process.

A **heuristic function** h(n) guides the search algorithm by estimating the cost to reach the goal from a given state n.

Heuristic values must be greater than or equal to zero, with h(n) = 0 at a goal state.

The heuristic value depends only on the current state.

**a) Greedy Best-First Search**

Greedy best-first search prioritizes nodes using a heuristic function h(n) that estimates the cost to reach the goal from a node. The algorithm does not guarantee an optimal solution, because it looks only at the heuristic and ignores the cost already paid to reach the node.

**Optimal:** No (not guaranteed to find the optimal solution).
**Time complexity:** O(b^m) in the worst case, where m is the maximum depth of the search space
**Space complexity:** O(b^m)

**b) A* Search Algorithm**

A* search combines uniform cost search and greedy best-first search by evaluating nodes using f(n) = g(n) + h(n), where g(n) is the cost to reach the node and h(n) is the heuristic estimate of the cost from the node to the goal. A* tree search is optimal if the heuristic is admissible (never overestimates); A* graph search, which never re-expands a state, additionally needs a consistent heuristic.

**Complete:** Yes (if the branching factor is finite and every step cost is at least some ε > 0).
**Optimal:** Yes, if the heuristic is admissible (never overestimates the cost).
**Time complexity:** O(b^d) in the worst case; a more accurate heuristic reduces the number of expanded nodes
**Space complexity:** O(b^d)
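The evaluation function f(n) = g(n) + h(n) can be sketched in Python as follows. The graph, the heuristic table `h`, and the function name `a_star` are hypothetical; the heuristic values were chosen to be admissible and consistent for this particular graph.

```python
import heapq

def a_star(graph, h, start, goal):
    """Return (cost, path) of a cheapest path, ordering the frontier by f = g + h."""
    frontier = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue                             # stale entry, a cheaper route exists
        for nxt, step in graph.get(node, []):
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h[nxt], new_g, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph and heuristic (h never exceeds the true remaining cost)
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)], "G": []}
h = {"S": 3, "A": 3, "B": 1, "G": 0}
print(a_star(graph, h, "S", "G"))  # (4, ['S', 'A', 'B', 'G'])
```

Setting every heuristic value to 0 turns this sketch into uniform cost search, which is one way to see how the two algorithms relate.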

**c) Iterative Deepening A* Search (IDA*)**

IDA* search combines A* and iterative deepening to overcome A*'s memory limitation. IDA* performs a series of depth-limited searches where the depth limit is based on the f(n) value instead of depth in the search tree.

In each iteration, nodes are explored only if their f(n) value is within the current threshold. The threshold is updated iteratively to the smallest f(n) value that exceeded the current threshold, ensuring an optimal solution.

**Complete:** Yes (if the branching factor is finite).
**Optimal:** Yes (if the heuristic is admissible).
**Time Complexity:** O(b^d), where b is the branching factor and d is the depth of the solution.
**Space Complexity:** O(bd), as it only keeps track of the current path and the threshold value.
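The threshold-update loop can be sketched as below. This is a minimal illustration under assumed names (`ida_star`, `graph`, `h`), using a hypothetical weighted graph with an admissible heuristic table.

```python
import math

def ida_star(graph, h, start, goal):
    """Return (cost, path) found by iterative deepening on f = g + h, or None."""
    def search(path, g, threshold):
        node = path[-1]
        f = g + h[node]
        if f > threshold:
            return f, None                 # report the f-value that exceeded the threshold
        if node == goal:
            return f, list(path)
        smallest = math.inf                # smallest f-value seen beyond the threshold
        for nxt, step in graph.get(node, []):
            if nxt in path:                # avoid cycles along the current path
                continue
            path.append(nxt)
            t, found = search(path, g + step, threshold)
            path.pop()
            if found is not None:
                return t, found
            smallest = min(smallest, t)
        return smallest, None

    threshold = h[start]
    while True:
        t, found = search([start], 0, threshold)
        if found is not None:
            return t, found
        if t == math.inf:
            return None                    # nothing left to explore
        threshold = t                      # next threshold: smallest f that overflowed

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)], "G": []}
h = {"S": 3, "A": 3, "B": 1, "G": 0}
print(ida_star(graph, h, "S", "G"))  # (4, ['S', 'A', 'B', 'G'])
```

On this example the first pass (threshold 3) fails, the threshold rises to 4, and the second pass reaches the goal.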

3. Adversarial Search

**Adversarial search** is used when multiple agents compete, each trying to maximize its own payoff; in a two-player zero-sum game this is the same as minimizing the opponent's payoff.

**a) Minimax Algorithm**

Minimax algorithm computes optimal decisions in two-player games, assuming both players play optimally. The algorithm recursively evaluates all possible moves to choose the best one for the current player.

**Completeness:** Yes, if the game tree is finite.
**Optimality:** Ensures optimal decisions in two-player, zero-sum games with perfect information when both players play optimally.
**Time Complexity:** O(b^d), where b is the branching factor and d is the maximum depth.
**Space Complexity:** O(bd) with depth-first exploration, since only the current path and its unexpanded siblings are kept in memory; the full tree is never stored.

**b) Alpha-Beta Pruning**

Alpha-beta pruning optimizes Minimax by pruning branches that cannot influence the final decision. It uses two values, α (the best value found so far for the maximizer along the current path) and β (the best value found so far for the minimizer), and stops exploring a node as soon as α ≥ β.

**Completeness:** Yes, under the same condition as Minimax (a finite game tree).
**Optimality:** Returns the same decision as Minimax; pruning never changes the result.
**Time Complexity:** Still O(b^d) in the worst case (no pruning), but O(b^{d/2}) with ideal move ordering.
**Space Complexity:** O(bd), the same as depth-first Minimax.

Logic in Artificial Intelligence

**1. Propositional Logic**

**Propositional Logic** (Boolean Logic) deals with statements that are either true or false. It is the simplest form of logic used to represent facts about the world and manipulate those facts.

Common logical connectives are:

**AND (∧):** True if both propositions are true.
**OR (∨):** True if at least one of the propositions is true.
**NOT (¬):** Inverts the truth value of a proposition.
**IMPLIES (→):** False only when the first proposition is true and the second is false; true in every other case.
**BICONDITIONAL (↔):** True when both propositions have the same truth value (both true or both false).

A truth table gives the truth value of a compound proposition for every combination of truth values of its parts. For example, for P → Q:

| P | Q | P → Q |
|---|---|-------|
| T | T | T |
| T | F | F |
| F | T | T |
| F | F | T |
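The same table can be generated programmatically. The sketch below is illustrative (the helper name `implies` is an assumption) and uses the equivalence P → Q ≡ ¬P ∨ Q.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# Enumerate every combination of truth values for P and Q
rows = [(p, q, implies(p, q)) for p, q in product([True, False], repeat=2)]
for p, q, result in rows:
    print(p, q, result)
```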

Propositional logic cannot handle environments of unlimited size effectively because it lacks the ability to express concepts related to time, space, and universal relationships between objects in a concise manner.

**2. Predicate Logic**

**Predicate logic** represents and reasons about statements involving objects, their properties, and their relationships. Objects are the entities in the domain of discourse (e.g., "John," "car," or "number").

It extends propositional logic by introducing:

**Predicates:** Represent properties or relationships (e.g., Student(x), Likes(x, y)).
**Quantifiers:** Universal (∀) and existential (∃).
**Variables:** Represent entities in a domain of discourse (e.g., x, y).

**First-order logic** is the form of predicate logic in which quantifiers range only over objects, not over predicates or functions. Its building blocks are:

**Constants:** Names for specific objects (e.g., John, 2, a).
**Variables:** Generic placeholders for objects (e.g., x, y, z).
**Predicates:** Represent properties or relationships (e.g., Student(x), Taller(x, y)).
**Functions:** Map objects to objects (e.g., Father(x)).
**Quantifiers:** Express properties of entire collections of objects. First-order logic contains two standard quantifiers:
- **Universal Quantifier (∀):** States that a proposition is true for all values of a variable. Example: ∀x (Student(x) → Studies(x)): "All students study."
- **Existential Quantifier (∃):** States that a property applies to at least one object in the domain. Example: ∃x Likes(x, IceCream): "Someone likes ice cream."
- ∀ and ∃ are duals under negation: ∀x P(x) ≡ ¬∃x ¬P(x), and ∃x P(x) ≡ ¬∀x ¬P(x).
**Logical Connectives:** ∧, ∨, ¬, →, ↔
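Over a finite domain, the two quantifiers behave like Python's `all` and `any`. The sketch below checks the two example sentences and the negation relationship between ∀ and ∃; the domain and the predicate extensions are hypothetical.

```python
# Hypothetical finite domain and predicate extensions
domain = ["alice", "bob", "carol"]
student = {"alice", "bob"}                 # objects for which Student(x) holds
studies = {"alice", "bob", "carol"}        # objects for which Studies(x) holds
likes_ice_cream = {"carol"}                # objects for which Likes(x, IceCream) holds

# ∀x (Student(x) → Studies(x)); the implication is evaluated as (not P) or Q
all_students_study = all((x not in student) or (x in studies) for x in domain)

# ∃x Likes(x, IceCream)
someone_likes_ice_cream = any(x in likes_ice_cream for x in domain)

# Negation relationship: ∀x P(x) holds exactly when ¬∃x ¬P(x) holds
def P(x):
    return x in student

forall_p = all(P(x) for x in domain)
not_exists_not_p = not any(not P(x) for x in domain)

print(all_students_study, someone_likes_ice_cream, forall_p == not_exists_not_p)
# True True True
```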

**Inference in Predicate Logic**

**Universal Instantiation:** If a universal statement is true, we can instantiate it for a particular object.
Example: From "∀x Is_Student(x)", we can conclude "Is_Student(John)".
**Existential Generalization:** From a statement about an individual, we can generalize that there exists such an individual.
Example: From "Is_Student(John)", we can conclude "∃x Is_Student(x)".

Reasoning Under Uncertainty

AI systems often work with uncertain information. This uncertainty arises due to:

**Incomplete data:** We don’t always have all the information needed.
**Ambiguity:** Multiple interpretations of the data.
**Noise:** Random variations or errors in the data.

To handle uncertainty, AI uses tools like probabilistic reasoning to make inferences about the most likely outcomes.

Conditional Independence Representation

Conditional independence representation deals with uncertainty and probabilistic models. It allows us to simplify complex probabilistic relationships by assuming that two events are independent given some third event.

Two variables X and Y are conditionally independent given Z if the probability of X given Z is unaffected by Y:

P(X | Y, Z) = P(X | Z)

In a Bayesian network, conditional independence is encoded by the network structure:

A node is conditionally independent of its non-descendants given its parents.
For two nodes X and Y whose only connection is a common parent Z, X and Y are conditionally independent given Z.
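The common-parent case can be checked numerically. The sketch below builds a joint distribution from P(Z), P(X | Z), and P(Y | Z) and compares conditional probabilities using exact fractions; all probability values are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical network: Z is the common parent of X and Y (all variables binary)
p_z1 = Fraction(3, 10)                                   # P(Z=1)
p_x1_given_z = {1: Fraction(9, 10), 0: Fraction(1, 5)}   # P(X=1 | Z=z)
p_y1_given_z = {1: Fraction(4, 5), 0: Fraction(1, 10)}   # P(Y=1 | Z=z)

def bernoulli(p_one, value):
    """Probability that a binary variable takes `value` when P(value=1) = p_one."""
    return p_one if value == 1 else 1 - p_one

# Joint distribution P(X, Y, Z) = P(Z) P(X | Z) P(Y | Z)
joint = {
    (x, y, z): bernoulli(p_z1, z) * bernoulli(p_x1_given_z[z], x) * bernoulli(p_y1_given_z[z], y)
    for x, y, z in product([0, 1], repeat=3)
}

def prob(x=None, y=None, z=None):
    """Marginal probability of the assignments that are not None."""
    wanted = (x, y, z)
    return sum(p for key, p in joint.items()
               if all(w is None or w == k for w, k in zip(wanted, key)))

p_x_given_yz = prob(x=1, y=1, z=1) / prob(y=1, z=1)   # P(X=1 | Y=1, Z=1)
p_x_given_z = prob(x=1, z=1) / prob(z=1)              # P(X=1 | Z=1)
p_x_given_y = prob(x=1, y=1) / prob(y=1)              # P(X=1 | Y=1)
p_x = prob(x=1)                                       # P(X=1)

print(p_x_given_yz, p_x_given_z)   # 9/10 9/10   (Y adds nothing once Z is known)
print(p_x_given_y, p_x)            # 23/31 41/100 (without Z, X and Y are dependent)
```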

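D-separation is a graphical test for conditional independence in a directed acyclic graph (DAG): X and Y are conditionally independent given a set of nodes Z if Z blocks every path between them. A set of nodes Z blocks a path between X and Y if either of the following holds:

The path contains a **collider** (a node with two incoming edges on the path), and neither the collider nor any of its descendants is in Z.

X → A ← Y (collider at A), with A → D
The path between X and Y is blocked as long as neither A nor its descendant D is in Z; observing A or D unblocks it.

The path contains a non-collider (the middle node of a chain or a fork) that is in Z.

X → A → Y (chain) or X ← A → Y (fork)
The path between X and Y is blocked when A is in Z.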

Exact Inference Through Variable Elimination

Variable elimination is a method used in probabilistic graphical models for exact inference. It allows for the computation of marginal probabilities by systematically eliminating variables that are not of interest.

The variable elimination algorithm can be summarized in the following steps:

**Initialization:**
- Identify all random variables involved in the model.
- Define the sets of query variables Y, evidence variables E, and hidden variables Z.
**Evidence Handling:** Incorporate evidence by restricting each factor to the observed values of the evidence variables (equivalently, multiplying by evidence potentials that zero out inconsistent rows), which shrinks the factors before any summation.
**Elimination Process:** Iteratively eliminate each variable in Z by summing over its values. For each variable v being eliminated:
- Identify all factors that include v.
- Multiply these factors together.
- Sum out v to create a new factor that does not include v.
**Normalization:** After all hidden variables have been eliminated, multiply the remaining factors and normalize the result to obtain a proper probability distribution over the query variables.
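The steps above can be sketched in Python for binary variables. This is an illustrative implementation, not a reference one: a factor is represented as a `(variables, table)` pair, and the chain network A → B → C with its probabilities is hypothetical.

```python
from itertools import product

# A factor is (variables, table); table maps a tuple of 0/1 values to a number.
def multiply(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
    table = {}
    for values in product([0, 1], repeat=len(out_vars)):   # binary variables assumed
        a = dict(zip(out_vars, values))
        table[values] = t1[tuple(a[v] for v in vars1)] * t2[tuple(a[v] for v in vars2)]
    return out_vars, table

def sum_out(var, factor):
    """Eliminate `var` from a factor by summing over its values."""
    vars_, t = factor
    i = vars_.index(var)
    table = {}
    for values, p in t.items():
        key = values[:i] + values[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return vars_[:i] + vars_[i + 1:], table

def restrict(factor, var, value):
    """Keep only the rows consistent with the evidence var = value."""
    vars_, t = factor
    if var not in vars_:
        return factor
    i = vars_.index(var)
    return vars_[:i] + vars_[i + 1:], {k[:i] + k[i + 1:]: p for k, p in t.items() if k[i] == value}

def query(factors, evidence, hidden):
    """Return the normalized distribution over the variables that are left."""
    for var, value in evidence.items():                     # evidence handling
        factors = [restrict(f, var, value) for f in factors]
    for var in hidden:                                      # elimination process
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if var not in f[0]]
        combined = touching[0]
        for f in touching[1:]:
            combined = multiply(combined, f)
        factors = rest + [sum_out(var, combined)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    total = sum(result[1].values())                         # normalization
    return {key: p / total for key, p in result[1].items()}

# Hypothetical chain A -> B -> C
f_a = (("A",), {(1,): 0.3, (0,): 0.7})
f_b = (("A", "B"), {(1, 1): 0.9, (1, 0): 0.1, (0, 1): 0.2, (0, 0): 0.8})   # P(B | A)
f_c = (("B", "C"), {(1, 1): 0.8, (1, 0): 0.2, (0, 1): 0.1, (0, 0): 0.9})   # P(C | B)

p_c = query([f_a, f_b, f_c], evidence={}, hidden=["A", "B"])          # P(C)
p_c_given_a1 = query([f_a, f_b, f_c], evidence={"A": 1}, hidden=["B"])  # P(C | A=1)
print(round(p_c[(1,)], 3), round(p_c_given_a1[(1,)], 3))  # 0.387 0.73
```

The order in which hidden variables are eliminated does not change the answer, but it can change the size of the intermediate factors considerably.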

Approximate Inference Through Sampling

Approximate inference is used to estimate probabilities or make decisions when exact inference is too computationally expensive. Sampling methods randomly sample from the probability distribution to estimate the result. This allows us to estimate complex probabilities quickly, even in large models.

**1. Direct sampling** methods generate random samples from the probability distribution of a model and use these samples to approximate the desired probability.

**Basic Idea:** Create many samples based on the given model's distributions and then use these samples to compute estimates of the quantities of interest (like marginal probabilities).
**Monte Carlo Sampling:** This is a common form of direct sampling. It generates samples from the joint distribution (in a Bayesian network, by sampling each variable in topological order given its already-sampled parents) and uses the samples to estimate quantities like the probability of a given event.

For example: If you want to estimate P(A), you would generate random samples from the model, count how many times A is true, and divide by the total number of samples to get the approximate probability of A.
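A small sketch of this counting estimate, assuming a hypothetical two-node network A → B with made-up probabilities. The exact answer for comparison is P(B) = 0.3 × 0.9 + 0.7 × 0.2 = 0.41.

```python
import random

def sample_once(rng):
    """Draw one joint sample (a, b) from the hypothetical network A -> B."""
    a = rng.random() < 0.3                      # P(A=true) = 0.3
    b = rng.random() < (0.9 if a else 0.2)      # P(B=true | A)
    return a, b

def estimate_p_b(n, seed=0):
    """Estimate P(B=true) as the fraction of samples in which B is true."""
    rng = random.Random(seed)                   # seeded for reproducibility
    hits = sum(sample_once(rng)[1] for _ in range(n))
    return hits / n

print(estimate_p_b(20000))  # close to the exact value 0.41
```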

**2. Rejection Sampling** is a technique used to sample from complex distributions by generating samples from a simpler, easier-to-sample distribution and rejecting those that do not meet certain criteria. The process involves the following steps:

**Sample Generation:** Generate a point from a proposal distribution (a distribution from which sampling is straightforward).
**Acceptance Criteria:** Accept the sample if it meets the criteria based on the target probability distribution; otherwise, reject it and repeat the process. In a Bayesian network, the usual criterion is agreement with the evidence: samples that contradict the observed values are discarded.
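The two steps can be sketched for a one-dimensional target. In this illustration the target is assumed to be the Beta(2, 2) density 6x(1 − x) on [0, 1], the proposal is Uniform(0, 1), and `bound` is the maximum of the target density (1.5); these choices are hypothetical and only serve the example.

```python
import random

def target_density(x):
    """Assumed target: the Beta(2, 2) density on [0, 1]."""
    return 6.0 * x * (1.0 - x)

def rejection_sample(n, seed=0):
    """Draw n samples from the target by proposing from Uniform(0, 1)."""
    rng = random.Random(seed)
    bound = 1.5                      # target_density(x) <= bound * 1 for all x in [0, 1]
    accepted, tries = [], 0
    while len(accepted) < n:
        tries += 1
        x = rng.random()             # sample generation from the proposal
        # Acceptance test: keep x with probability target_density(x) / bound
        if rng.random() * bound <= target_density(x):
            accepted.append(x)
    return accepted, tries

samples, tries = rejection_sample(5000)
print(sum(samples) / len(samples))   # close to 0.5, the mean of Beta(2, 2)
print(len(samples) / tries)          # acceptance rate, close to 1 / 1.5
```

A tighter bound or a proposal closer to the target raises the acceptance rate, which is the main cost driver of this method.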

**3. Likelihood Weighting** is an advanced sampling technique used in Bayesian networks to handle observed evidence. The process involves the following steps:

**Sampling Order:** Visit the network's variables in topological order. Sample each non-evidence variable given its parents; evidence variables are not sampled but fixed to their observed values.
**Weighting Samples:** Give each sample a weight equal to the likelihood of the evidence under that sample, that is, the product of P(e | parents(E)) over all evidence variables.
**Weight Adjustment:** The weight starts at 1 and is multiplied by the corresponding probability each time an evidence variable is encountered, so samples under which the evidence is more likely count for more.
**Iteration:** Repeat this process many times to build a collection of weighted samples, then estimate a query probability as the total weight of the samples matching the query divided by the total weight of all samples.

Because every sample is consistent with the evidence, none are thrown away, which makes inference in Bayesian networks more efficient than with basic rejection sampling.
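A minimal sketch of likelihood weighting, assuming a hypothetical two-node network A → B with evidence B = true. The exact posterior for comparison is P(A | B) = 0.27 / 0.41 ≈ 0.659.

```python
import random

def likelihood_weighting(n, seed=0):
    """Estimate P(A=true | B=true) in the hypothetical network A -> B."""
    rng = random.Random(seed)
    weight_a_true = 0.0
    weight_total = 0.0
    for _ in range(n):
        a = rng.random() < 0.3          # sample the non-evidence variable A from P(A)
        w = 0.9 if a else 0.2           # evidence B=true is fixed; weight = P(B=true | A)
        weight_total += w
        if a:
            weight_a_true += w
    return weight_a_true / weight_total  # weighted fraction of samples with A=true

print(likelihood_weighting(20000))  # close to 0.659
```

Every one of the 20000 samples contributes to the estimate, whereas rejection sampling on the same network would discard roughly 59% of them (those with B = false).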