Data Structures Interview Questions and Answers
What is a Data Structure?
A data structure is a way of organizing and storing data in a computer so that it can be used efficiently. Different data structures are suited to different kinds of tasks, offering various trade-offs in terms of speed, memory usage, and ease of implementation.
Types of Data Structures
- Linear Data Structures: Elements are arranged sequentially (arrays, linked lists, stacks, queues).
- Non-linear Data Structures: Elements are not arranged sequentially (trees, graphs).
Applications of Data Structures
Data structures are fundamental to many areas of computer science:
- Compiler design
- Operating systems
- Database management systems
- Graphics
- Artificial intelligence
- Network programming
File Structure vs. Storage Structure
A storage structure refers to how data is organized in computer memory (RAM). A file structure refers to how data is organized in secondary storage (hard disk, etc.).
Data Structures in Different Data Models
Data Model | Data Structure |
---|---|
Relational Database Management System (RDBMS) | Arrays |
Network Data Model | Graphs |
Hierarchical Data Model | Trees |
Data Structure Used for Recursion
Stacks are used to manage function calls during recursion due to their Last-In, First-Out (LIFO) nature. The system uses a stack to store local variables and return addresses.
Stacks (LIFO)
A stack is a LIFO (Last-In, First-Out) data structure. Elements are added (pushed) and removed (popped) from the top.
Applications of Stacks
- Expression evaluation
- Function call management
- Backtracking algorithms
- Undo/redo functionality
Stack Operations
push()
: Adds an element to the top.pop()
: Removes and returns the top element.peek()
: Returns the top element without removing it.
Stack Overflow
A stack overflow occurs when you try to push an element onto a full stack.
PUSH and POP Operations
push()
adds an element to the top of the stack. pop()
removes and returns the top element.
Stack Insertion and Deletion Steps
- Push: Increment the top pointer; add the element at the new top position.
- Pop: Return the value at the top; decrement the top pointer.
Postfix Expressions
In postfix notation (reverse Polish notation), operators follow their operands (e.g., a b +
instead of a + b
).
Postfix Notation Example
The infix expression (A + B) * (C - D)
is written as AB+CD-*
in postfix notation.
Notations for Arithmetic Expression Evaluation
Polish notation (prefix notation) and Reverse Polish notation (postfix notation) are used for efficiently evaluating arithmetic expressions without needing parentheses.
Arrays
An array is a collection of elements of the same data type stored contiguously in memory. Elements are accessed using their index (position).
Referencing Array Elements
Array elements are accessed using their index (starting from 0). A loop is typically used to access all the elements.
Multidimensional Arrays
Multidimensional arrays are arrays of arrays. A 2D array can be visualized as a matrix (rows and columns).
Memory Storage of 2D Arrays
- Row-major order: Rows are stored contiguously in memory.
- Column-major order: Columns are stored contiguously in memory.
Calculating the Address of an Element in a 2D Array
The address of an element a[i][j]
in a 2D array is calculated as:
Row-Major Order
Address(a[i][j]) = BaseAddress + (i * numCols + j) * elementSize
Column-Major Order
Address(a[i][j]) = BaseAddress + (j * numRows + i) * elementSize
Linked Lists
A linked list is a linear data structure where elements (nodes) are linked together. Each node contains data and a pointer to the next node.
Linked Lists: Linear or Non-Linear?
Linked lists are considered linear in terms of data access order but non-linear in terms of physical memory allocation (nodes can be scattered in memory).
Advantages of Linked Lists over Arrays
- Dynamic size.
- Efficient insertion and deletion of elements.
- Memory efficiency (don't need contiguous memory).
Creating a Singly Linked List Node (C)
Syntax
struct node {
int data;
struct node *next;
};
Heterogeneous Linked Lists in C
For a heterogeneous linked list (containing different data types) in C, use a void*
pointer to store pointers to various data types.
Doubly Linked Lists
Each node in a doubly linked list has pointers to both the next and previous nodes, allowing bidirectional traversal.
Inserting a Node at the Beginning of a Circular Singly Linked List (C)
C Code
// ... (C code for inserting a node at the beginning of a circular linked list would go here) ...
Queues (FIFO)
A queue is a First-In, First-Out (FIFO) data structure. Elements are added to the rear and removed from the front.
Applications of Queues
- Managing tasks or requests (e.g., printer queue).
- Buffering data in systems with varying data transfer rates.
Queue Implementation Drawbacks Using Arrays
Using arrays to implement queues has limitations:
- Wasted Memory: Once an element is dequeued, the space it occupied cannot be immediately reused.
- Fixed Size: Arrays have a fixed size; increasing the queue's capacity requires creating a new, larger array and copying elements.
Circular Queue Insertion Scenarios
Insertion in a circular queue depends on its current state:
- If the queue is full (
(rear + 1) % maxSize == front
), an overflow occurs. - If the queue is not full and the
rear
pointer isn't at the end of the array, incrementrear
and insert at the newrear
. - If the queue is not full and the
rear
pointer is at the end of the array, wrap around by settingrear
to 0 and inserting.
Deques (Double-Ended Queues)
A deque (double-ended queue) allows insertion and deletion at both ends (front and rear).
Implementing Priority Queues Using Queues
You can implement a priority queue using at least two queues: one to store data and another to store priorities. More sophisticated data structures (like heaps) are often more efficient.
Tree Data Structures
A tree is a hierarchical data structure with a root node and zero or more subtrees. Trees are used to represent hierarchical relationships.
Types of Trees
- General Tree
- Forest (a collection of trees)
- Binary Tree
- Binary Search Tree (BST)
- Expression Tree
- Heap
Binary Trees
A binary tree is a tree where each node has at most two children (left and right).
In-Order Traversal of a Binary Tree (C)
C Code
void inorder(struct treenode *tree) {
if (tree != NULL) {
inorder(tree->left);
printf("%d ", tree->data); // Assuming 'data' is a member of treenode
inorder(tree->right);
}
}
Maximum Nodes in a Binary Tree of Height k
The maximum number of nodes in a binary tree of height k is 2k+1 - 1 (where k ≥ 0).
Data Structure for Tree Construction
Queues are commonly used to implement tree traversals (like breadth-first search).
Counting Nodes in a Binary Tree (Recursive C Function)
C Code
int countNodes(struct node* node) {
if (node == NULL) return 0;
return 1 + countNodes(node->left) + countNodes(node->right);
}
Calculating the Height of a Binary Tree (Recursive C Function)
C Code
int treeHeight(struct node* node) {
if (node == NULL) return 0;
int leftHeight = treeHeight(node->left);
int rightHeight = treeHeight(node->right);
return 1 + max(leftHeight, rightHeight);
}
AVL Trees
AVL trees are self-balancing binary search trees that ensure all operations (search, insertion, deletion) have a time complexity of O(log n).
B-Tree Properties
- Each node has a maximum number of children (order m).
- Non-leaf nodes have at least ⌈m/2⌉ children (except for the root).
- All leaf nodes are at the same level.
B-Tree vs. B+ Tree
B-Tree | B+ Tree |
---|---|
Data can be stored in both internal and leaf nodes. Slower search. | Data is stored only in leaf nodes. Faster search due to sequential access of leaf nodes. |
Applications of Tree Data Structures
- Representing hierarchical data.
- Implementing decision trees.
- Syntax analysis.
Graphs
Graph Data Structure
A graph is a data structure representing a set of vertices (nodes) and edges (connections) between them. Graphs model relationships between objects.
Cycle, Path, and Circuit in Graphs
- Path: A sequence of connected vertices.
- Cycle: A closed path (starting and ending at the same vertex).
- Circuit: A closed path where vertices can be repeated.
Data Structures for Graph Implementation
- Adjacency matrix: A 2D array representing connections.
- Adjacency list: A list of linked lists representing connections.
Breadth-First Search (BFS) and Depth-First Search (DFS)
BFS and DFS are graph traversal algorithms that use queues and stacks, respectively, to explore the graph.
Applications of Graphs
- Network routing.
- Social network analysis.
- Representing relationships.
Binary Search
Binary search is an efficient algorithm to find a specific element in a sorted array or list. It works by repeatedly dividing the search interval in half.
Advantages of Binary Search over Linear Search
Binary search has a time complexity of O(log n), significantly faster than linear search's O(n) complexity for large datasets. However, it requires a sorted list.
Advantages of Selection Sort
- Simple implementation.
- In-place sorting.
- Generally outperforms bubble sort.
Multi-linked Structures
Multi-linked structures (like doubly linked lists) use multiple pointers in each node to establish connections, allowing for more complex data relationships and traversal patterns.
NULL vs. VOID in C
NULL | VOID |
---|---|
A null pointer; represents the absence of a valid memory address. | A data type specifier indicating the absence of a type. |