Data Structures Interview Questions and Answers

What is a Data Structure?

A data structure is a way of organizing and storing data in a computer so that it can be used efficiently. Different data structures are suited to different kinds of tasks, offering various trade-offs in terms of speed, memory usage, and ease of implementation.

Types of Data Structures

  • Linear Data Structures: Elements are arranged sequentially (arrays, linked lists, stacks, queues).
  • Non-linear Data Structures: Elements are not arranged sequentially (trees, graphs).

Applications of Data Structures

Data structures are fundamental to many areas of computer science:

  • Compiler design
  • Operating systems
  • Database management systems
  • Graphics
  • Artificial intelligence
  • Network programming

File Structure vs. Storage Structure

A storage structure refers to how data is organized in computer memory (RAM). A file structure refers to how data is organized in secondary storage (hard disk, etc.).

Data Structures in Different Data Models

Data Model Data Structure
Relational Database Management System (RDBMS) Arrays
Network Data Model Graphs
Hierarchical Data Model Trees

Data Structure Used for Recursion

Stacks are used to manage function calls during recursion due to their Last-In, First-Out (LIFO) nature. The system uses a stack to store local variables and return addresses.

Stacks (LIFO)

A stack is a LIFO (Last-In, First-Out) data structure. Elements are added (pushed) and removed (popped) from the top.

Applications of Stacks

  • Expression evaluation
  • Function call management
  • Backtracking algorithms
  • Undo/redo functionality

Stack Operations

  • push(): Adds an element to the top.
  • pop(): Removes and returns the top element.
  • peek(): Returns the top element without removing it.

Stack Overflow

A stack overflow occurs when you try to push an element onto a full stack.

PUSH and POP Operations

push() adds an element to the top of the stack. pop() removes and returns the top element.

Stack Insertion and Deletion Steps

  1. Push: Increment the top pointer; add the element at the new top position.
  2. Pop: Return the value at the top; decrement the top pointer.

Postfix Expressions

In postfix notation (reverse Polish notation), operators follow their operands (e.g., a b + instead of a + b).

Postfix Notation Example

The infix expression (A + B) * (C - D) is written as AB+CD-* in postfix notation.

Notations for Arithmetic Expression Evaluation

Polish notation (prefix notation) and Reverse Polish notation (postfix notation) are used for efficiently evaluating arithmetic expressions without needing parentheses.

Arrays

An array is a collection of elements of the same data type stored contiguously in memory. Elements are accessed using their index (position).

Referencing Array Elements

Array elements are accessed using their index (starting from 0). A loop is typically used to access all the elements.

Multidimensional Arrays

Multidimensional arrays are arrays of arrays. A 2D array can be visualized as a matrix (rows and columns).

Memory Storage of 2D Arrays

  • Row-major order: Rows are stored contiguously in memory.
  • Column-major order: Columns are stored contiguously in memory.

Calculating the Address of an Element in a 2D Array

The address of an element a[i][j] in a 2D array is calculated as:

Row-Major Order

Address(a[i][j]) = BaseAddress + (i * numCols + j) * elementSize 
        
Column-Major Order

Address(a[i][j]) = BaseAddress + (j * numRows + i) * elementSize
        

Linked Lists

A linked list is a linear data structure where elements (nodes) are linked together. Each node contains data and a pointer to the next node.

Linked Lists: Linear or Non-Linear?

Linked lists are considered linear in terms of data access order but non-linear in terms of physical memory allocation (nodes can be scattered in memory).

Advantages of Linked Lists over Arrays

  • Dynamic size.
  • Efficient insertion and deletion of elements.
  • Memory efficiency (don't need contiguous memory).

Creating a Singly Linked List Node (C)

Syntax

struct node {
    int data;
    struct node *next;
};
        

Heterogeneous Linked Lists in C

For a heterogeneous linked list (containing different data types) in C, use a void* pointer to store pointers to various data types.

Doubly Linked Lists

Each node in a doubly linked list has pointers to both the next and previous nodes, allowing bidirectional traversal.

Inserting a Node at the Beginning of a Circular Singly Linked List (C)

C Code

// ... (C code for inserting a node at the beginning of a circular linked list would go here) ...
        

Queues (FIFO)

A queue is a First-In, First-Out (FIFO) data structure. Elements are added to the rear and removed from the front.

Applications of Queues

  • Managing tasks or requests (e.g., printer queue).
  • Buffering data in systems with varying data transfer rates.

Queue Implementation Drawbacks Using Arrays

Using arrays to implement queues has limitations:

  • Wasted Memory: Once an element is dequeued, the space it occupied cannot be immediately reused.
  • Fixed Size: Arrays have a fixed size; increasing the queue's capacity requires creating a new, larger array and copying elements.

Circular Queue Insertion Scenarios

Insertion in a circular queue depends on its current state:

  • If the queue is full ((rear + 1) % maxSize == front), an overflow occurs.
  • If the queue is not full and the rear pointer isn't at the end of the array, increment rear and insert at the new rear.
  • If the queue is not full and the rear pointer is at the end of the array, wrap around by setting rear to 0 and inserting.

Deques (Double-Ended Queues)

A deque (double-ended queue) allows insertion and deletion at both ends (front and rear).

Implementing Priority Queues Using Queues

You can implement a priority queue using at least two queues: one to store data and another to store priorities. More sophisticated data structures (like heaps) are often more efficient.

Tree Data Structures

A tree is a hierarchical data structure with a root node and zero or more subtrees. Trees are used to represent hierarchical relationships.

Types of Trees

  • General Tree
  • Forest (a collection of trees)
  • Binary Tree
  • Binary Search Tree (BST)
  • Expression Tree
  • Heap

Binary Trees

A binary tree is a tree where each node has at most two children (left and right).

In-Order Traversal of a Binary Tree (C)

C Code

void inorder(struct treenode *tree) {
    if (tree != NULL) {
        inorder(tree->left);
        printf("%d ", tree->data); // Assuming 'data' is a member of treenode
        inorder(tree->right);
    }
}
        

Maximum Nodes in a Binary Tree of Height k

The maximum number of nodes in a binary tree of height k is 2k+1 - 1 (where k ≥ 0).

Data Structure for Tree Construction

Queues are commonly used to implement tree traversals (like breadth-first search).

Counting Nodes in a Binary Tree (Recursive C Function)

C Code

int countNodes(struct node* node) {
    if (node == NULL) return 0;
    return 1 + countNodes(node->left) + countNodes(node->right);
}
        

Calculating the Height of a Binary Tree (Recursive C Function)

C Code

int treeHeight(struct node* node) {
    if (node == NULL) return 0;
    int leftHeight = treeHeight(node->left);
    int rightHeight = treeHeight(node->right);
    return 1 + max(leftHeight, rightHeight);
}
        

AVL Trees

AVL trees are self-balancing binary search trees that ensure all operations (search, insertion, deletion) have a time complexity of O(log n).

B-Tree Properties

  • Each node has a maximum number of children (order m).
  • Non-leaf nodes have at least ⌈m/2⌉ children (except for the root).
  • All leaf nodes are at the same level.

B-Tree vs. B+ Tree

B-Tree B+ Tree
Data can be stored in both internal and leaf nodes. Slower search. Data is stored only in leaf nodes. Faster search due to sequential access of leaf nodes.

Applications of Tree Data Structures

  • Representing hierarchical data.
  • Implementing decision trees.
  • Syntax analysis.

Graphs

Graph Data Structure

A graph is a data structure representing a set of vertices (nodes) and edges (connections) between them. Graphs model relationships between objects.

Cycle, Path, and Circuit in Graphs

  • Path: A sequence of connected vertices.
  • Cycle: A closed path (starting and ending at the same vertex).
  • Circuit: A closed path where vertices can be repeated.

Data Structures for Graph Implementation

  • Adjacency matrix: A 2D array representing connections.
  • Adjacency list: A list of linked lists representing connections.

Breadth-First Search (BFS) and Depth-First Search (DFS)

BFS and DFS are graph traversal algorithms that use queues and stacks, respectively, to explore the graph.

Applications of Graphs

  • Network routing.
  • Social network analysis.
  • Representing relationships.

Binary Search

Binary search is an efficient algorithm to find a specific element in a sorted array or list. It works by repeatedly dividing the search interval in half.

Advantages of Binary Search over Linear Search

Binary search has a time complexity of O(log n), significantly faster than linear search's O(n) complexity for large datasets. However, it requires a sorted list.

Advantages of Selection Sort

  • Simple implementation.
  • In-place sorting.
  • Generally outperforms bubble sort.

Multi-linked Structures

Multi-linked structures (like doubly linked lists) use multiple pointers in each node to establish connections, allowing for more complex data relationships and traversal patterns.

NULL vs. VOID in C

NULL VOID
A null pointer; represents the absence of a valid memory address. A data type specifier indicating the absence of a type.