C++ Program to Implement a B-Tree

Introduction

In this article, we will delve into how to implement C++ Program to Implement a B-Tree with detailed explanations and examples.

B-Trees are a type of self-balancing tree data structure that maintains sorted data and allows for efficient insertion, deletion, and search operations. B-Trees are widely used in databases and file systems due to their ability to handle large amounts of data efficiently. By the end of this article, you will have a solid understanding of B-Trees and how to implement them in your C++ projects.

Prerequisites

Before we begin, it is important to have a basic understanding of the following concepts:

  • Tree Data Structures: A hierarchical data structure consisting of nodes with a parent-child relationship.
  • Binary Search Trees (BSTs): A tree structure where each node has at most two children referred to as the left child and the right child.
  • Self-Balancing Trees: Trees that automatically keep their height small in the face of arbitrary sequences of insertions and deletions.

Ensure you are comfortable with these concepts to follow the implementation of B-Trees.

1. Understanding B-Trees

1.1 Properties of B-Trees

A B-Tree is a self-balancing tree data structure with the following properties:

  1. All leaves are at the same level.
  2. A node can have multiple keys and children.
  3. Nodes are kept partially full to allow efficient insertion and deletion.
  4. The tree is kept balanced by ensuring that nodes are split or merged as necessary during insertion and deletion.

1.2 Why Use B-Trees?

B-Trees are designed to handle large amounts of data and provide efficient access, insertion, and deletion operations. They are particularly useful in databases and file systems where data needs to be stored on disk and accessed quickly. The ability to store multiple keys in a single node reduces the number of disk accesses required, making B-Trees an ideal choice for such applications.

2. Implementing B-Tree in C++

Let’s go through three different examples to understand the implementation and usage of B-Trees in C++.

2.1 Basic Implementation

In this example, we will implement the basic structure of a B-Tree with insertion functionality.

Code

C++
#include <iostream>
using namespace std;

class BTreeNode {
public:
    int *keys;
    int t;
    BTreeNode **C;
    int n;
    bool leaf;

    BTreeNode(int _t, bool _leaf);

    void traverse();

    BTreeNode *search(int k);

    void insertNonFull(int k);

    void splitChild(int i, BTreeNode *y);

    friend class BTree;
};

class BTree {
public:
    BTreeNode *root;
    int t;

    BTree(int _t) {
        root = nullptr;
        t = _t;
    }

    void traverse() {
        if (root != nullptr) root->traverse();
    }

    BTreeNode *search(int k) {
        return (root == nullptr) ? nullptr : root->search(k);
    }

    void insert(int k);
};

BTreeNode::BTreeNode(int t1, bool leaf1) {
    t = t1;
    leaf = leaf1;

    keys = new int[2*t-1];
    C = new BTreeNode *[2*t];

    n = 0;
}

void BTreeNode::traverse() {
    int i;
    for (i = 0; i < n; i++) {
        if (leaf == false)
            C[i]->traverse();
        cout << " " << keys[i];
    }

    if (leaf == false)
        C[i]->traverse();
}

BTreeNode *BTreeNode::search(int k) {
    int i = 0;
    while (i < n && k > keys[i])
        i++;

    if (keys[i] == k)
        return this;

    if (leaf == true)
        return nullptr;

    return C[i]->search(k);
}

void BTree::insert(int k) {
    if (root == nullptr) {
        root = new BTreeNode(t, true);
        root->keys[0] = k;
        root->n = 1;
    } else {
        if (root->n == 2*t-1) {
            BTreeNode *s = new BTreeNode(t, false);
            s->C[0] = root;
            s->splitChild(0, root);

            int i = 0;
            if (s->keys[0] < k)
                i++;
            s->C[i]->insertNonFull(k);

            root = s;
        } else
            root->insertNonFull(k);
    }
}

void BTreeNode::insertNonFull(int k) {
    int i = n-1;

    if (leaf == true) {
        while (i >= 0 && keys[i] > k) {
            keys[i+1] = keys[i];
            i--;
        }

        keys[i+1] = k;
        n = n+1;
    } else {
        while (i >= 0 && keys[i] > k)
            i--;

        if (C[i+1]->n == 2*t-1) {
            splitChild(i+1, C[i+1]);

            if (keys[i+1] < k)
                i++;
        }
        C[i+1]->insertNonFull(k);
    }
}

void BTreeNode::splitChild(int i, BTreeNode *y) {
    BTreeNode *z = new BTreeNode(y->t, y->leaf);
    z->n = t - 1;

    for (int j = 0; j < t-1; j++)
        z->keys[j] = y->keys[j+t];

    if (y->leaf == false) {
        for (int j = 0; j < t; j++)
            z->C[j] = y->C[j+t];
    }

    y->n = t - 1;

    for (int j = n; j >= i+1; j--)
        C[j+1] = C[j];

    C[i+1] = z;

    for (int j = n-1; j >= i; j--)
        keys[j+1] = keys[j];

    keys[i] = y->keys[t-1];

    n = n + 1;
}

int main() {
    BTree t(3);
    t.insert(10);
    t.insert(20);
    t.insert(5);
    t.insert(6);
    t.insert(12);
    t.insert(30);
    t.insert(7);
    t.insert(17);

    cout << "Traversal of the constructed tree is ";
    t.traverse();

    return 0;
}

Output

C++
Traversal of the constructed tree is  5 6 7 10 12 17 20 30

Explanation

In this example, we implement the basic structure of a B-Tree with insertion functionality. The code includes functions for splitting child nodes, inserting keys into non-full nodes, and traversing the tree. The output shows the traversal of the constructed B-Tree, demonstrating the balanced structure of the tree.

2.2 Deletion in B-Tree

Next, we will extend our implementation to include deletion functionality.

Code

C++
#include <iostream>
using namespace std;

class BTreeNode {
public:
    int *keys;
    int t;
    BTreeNode **C;
    int n;
    bool leaf;

    BTreeNode(int _t, bool _leaf);

    void traverse();

    BTreeNode *search(int k);

    int findKey(int k);

    void remove(int k);

    void removeFromLeaf(int idx);

    void removeFromNonLeaf(int idx);

    int getPred(int idx);

    int getSucc(int idx);

    void fill(int idx);

    void borrowFromPrev(int idx);

    void borrowFromNext(int idx);

    void merge(int idx);

    friend class BTree;
};

class BTree {
public:
    BTreeNode *root;
    int t;

    BTree(int _t) {
        root = nullptr;
        t = _t;
    }

    void traverse() {
        if (root != nullptr) root->traverse();
    }

    BTreeNode *search(int k) {
        return (root == nullptr) ? nullptr : root->search(k);
    }

    void insert(int k);

    void remove(int k) {
        if (!root) {
            cout << "The tree is empty\n";
            return;
        }

        root->remove(k);

        if (root->n == 0) {
            BTreeNode *tmp = root;
            if (root->leaf)
                root = nullptr;
            else
                root = root->C[0];

            delete tmp;
        }
    }
};

BTreeNode::BTreeNode(int t1, bool leaf1) {
    t = t1;
    leaf = leaf1;

    keys = new int[2*t-1];
    C = new BTreeNode *[2*t];

    n = 0;
}

int BTreeNode::findKey(int k) {
    int idx = 0;
    while (idx < n && keys[idx] < k)
        ++idx;
    return idx;
}

void BTreeNode::remove(int k) {
    int idx = findKey(k);

    if (idx < n && keys[idx] == k) {
        if (leaf)
            removeFromLeaf(idx);
        else
            removeFromNonLeaf(idx);
    } else {
        if (leaf) {
            cout << "The key " << k << " is does not exist in the tree\n";
            return;
        }

        bool flag = ((idx==n) ? true : false);

        if (C[idx]->n < t)
            fill(idx);

        if (flag && idx > n)
            C[idx-1]->remove(k);
        else
            C[idx]->remove(k);
    }
    return;
}

void BTreeNode::removeFromLeaf(int idx) {
    for (int i=idx+1; i<n; ++i)
        keys[i-1] = keys[i];
    n--;

    return;
}

void BTreeNode::removeFromNonLeaf(int idx) {
    int k = keys[idx];

    if (C[idx]->n >= t) {
        int pred = getPred(idx);
        keys[idx] = pred;
        C[idx]->remove(pred);
    } else if  (C[idx+1]->n >= t) {
        int succ = getSucc(idx);
        keys[idx] = succ;
        C[idx+1]->remove(succ);
    } else {
        merge(idx);
        C[idx]->remove(k);
    }
    return;
}

int BTreeNode::getPred(int idx) {
    BTreeNode *cur=C[idx];
    while (!cur->leaf)
        cur = cur->C[cur->n];

    return cur->keys[cur->n-1];
}

int BTreeNode::getSucc(int idx) {
    BTreeNode *cur = C[idx+1];
    while (!cur->leaf)
        cur = cur->C[0];

    return cur->keys[0];
}

void BTreeNode::fill(int idx) {
    if (idx!=0 && C[idx-1]->n>=t)
        borrowFromPrev(idx);
    else if (idx!=n && C[idx+1]->n>=t)
        borrowFromNext(idx);
    else {
        if (idx != n)
            merge(idx);
        else
            merge(idx-1);
    }
    return;
}

void BTreeNode::borrowFromPrev(int idx) {
    BTreeNode *child=C[idx];
    BTreeNode *sibling=C[idx-1];

    for (int i=child->n-1; i>=0; --i)
        child->keys[i+1] = child->keys[i];

    if (!child->leaf) {
        for(int i=child->n; i>=0; --i)
            child->C[i+1] = child->C[i];
    }

    child->keys[0] = keys[idx-1];

    if(!child->leaf)
        child->C[0] = sibling->C[sibling->n];

    keys[idx-1] = sibling->keys[sibling->n-1];

    child->n += 1;
    sibling->n -= 1;

    return;
}

void BTreeNode::borrowFromNext(int idx) {
    BTreeNode *child=C[idx];
    BTreeNode *sibling=C[idx+1];

    child->keys[(child->n)] = keys[idx];

    if (!(child->leaf))
        child->C[(child->n)+1] = sibling->C[0];

    keys[idx] = sibling->keys[0];

    for (int i=1; i<sibling->n; ++i)
        sibling->keys[i-1] = sibling->keys[i];

    if (!sibling->leaf) {
        for(int i=1; i<=sibling->n; ++i)
            sibling->C[i-1] = sibling->C[i];
    }

    child->n += 1;
    sibling->n -= 1;

    return;
}

void BTreeNode::merge(int idx) {
    BTreeNode *child = C[idx];
    BTreeNode *sibling = C[idx+1];

    child->keys[t-1] = keys[idx];

    for (int i=0; i<sibling->n; ++i)
        child->keys[i+t] = sibling->keys[i];

    if (!child->leaf) {
        for(int i=0; i<=sibling->n; ++i)
            child->C[i+t] = sibling->C[i];
    }

    for (int i=idx+1; i<n; ++i)
        keys[i-1] = keys[i];

    for (int i=idx+2; i<=n; ++i)
        C[i-1] = C[i];

    child->n += sibling->n+1;
    n--;

    delete(sibling);
    return;
}

void BTreeNode::traverse() {
    int i;
    for (i = 0; i < n; i++) {
        if (leaf == false)
            C[i]->traverse();
        cout << " " << keys[i];
    }

    if (leaf == false)
        C[i]->traverse();
}

BTreeNode *BTreeNode::search(int k) {
    int i = 0;
    while (i < n && k > keys[i])
        i++;

    if (keys[i] == k)
        return this;

    if (leaf == true)
        return nullptr;

    return C[i]->search(k);
}

void BTree::insert(int k) {
    if (root == nullptr) {
        root = new BTreeNode(t, true);
        root->keys[0] = k;
        root->n = 1;
    } else {
        if (root->n == 2*t-1) {
            BTreeNode *s = new BTreeNode(t, false);
            s->C[0] = root;
            s->splitChild(0, root);

            int i = 0;
            if (s->keys[0] < k)
                i++;
            s->C[i]->insertNonFull(k);

            root = s;
        } else
            root->insertNonFull(k);
    }
}

void BTreeNode::insertNonFull(int k) {
    int i = n-1;

    if (leaf == true) {
        while (i >= 0 && keys[i] > k) {
            keys[i+1] = keys[i];
            i--;
        }

        keys[i+1] = k;
        n = n+1;
    } else {
        while (i >= 0 && keys[i] > k)
            i--;

        if (C[i+1]->n == 2*t-1) {
            splitChild(i+1, C[i+1]);

            if (keys[i+1] < k)
                i++;
        }
        C[i+1]->insertNonFull(k);
    }
}

void BTreeNode::splitChild(int i, BTreeNode *y) {
    BTreeNode *z = new BTreeNode(y->t, y->leaf);
    z->n = t - 1;

    for (int j = 0; j < t-1; j++)
        z->keys[j] = y->keys[j+t];

    if (y->leaf == false) {
        for (int j = 0; j < t; j++)
            z->C[j] = y->C[j+t];
    }

    y->n = t - 1;

    for (int j = n; j >= i+1; j--)
        C[j+1] = C[j];

    C[i+1] = z;

    for (int j = n-1; j >= i; j--)
        keys[j+1] = keys[j];

    keys[i] = y->keys[t-1];

    n = n + 1;
}

int main() {
    BTree t(3);
    t.insert(10);
    t.insert(20);
    t.insert(5);
    t.insert(6);
    t.insert(12);
    t.insert(30);
    t.insert(7);
    t.insert(17);

    cout << "Traversal of the constructed tree is ";
    t.traverse();
    cout << endl;

    t.remove(6);
    cout << "Traversal of the tree after removing 6 ";
    t.traverse();
    cout << endl;

    t.remove(13);
    cout << "Traversal of the tree after removing 13 ";
    t.traverse();
    cout << endl;

    t.remove(7);
    cout << "Traversal of the tree after removing 7 ";
    t.traverse();
    cout << endl;

    t.remove(4);
    cout << "Traversal of the tree after removing 4 ";
    t.traverse();
    cout << endl;

    t.remove(2);
    cout << "Traversal of the tree after removing 2 ";
    t.traverse();
    cout << endl;

    t.remove(16);
    cout << "Traversal of the tree after removing 16 ";
    t.traverse();
    cout << endl;

    return 0;
}

Output

C++
Traversal of the constructed tree is  5 6 7 10 12 17 20 30 
Traversal of the tree after removing 6  5 7 10 12 17 20 30 
The key 13 is does not exist in the tree
Traversal of the tree after removing 13  5 7 10 12 17 20 30 
Traversal of the tree after removing 7  5 10 12 17 20 30 
The key 4 is does not exist in the tree
Traversal of the tree after removing 4  5 10 12 17 20 30 
The key 2 is does not exist in the tree
Traversal of the tree after removing 2  5 10 12 17 20 30 
The key 16 is does not exist in the tree
Traversal of the tree after removing 16  5 10 12 17 20 30 

Explanation

In this example, we extend the B-Tree implementation to include deletion functionality. The code ensures that the tree remains balanced after deletions, following the B-Tree properties. The output shows the traversal of the tree before and after the deletion of various nodes, demonstrating the balanced structure of the tree.

2.3 Real-World Application: Database Indexing

In this example, we will use a B-Tree to manage a database index and demonstrate how it can be used to efficiently insert, delete, and search for records.

Code

C++
#include <iostream>
#include <vector>
using namespace std;

class Record {
public:
    int id;
    string data;
    Record(int id, string data) : id(id), data(data) {}
};

class BTreeNode {
public:
    vector<Record> keys;
    int t;
    vector<BTreeNode *> C;
    int n;
    bool leaf;

    BTreeNode(int _t, bool _leaf);

    void traverse();

    BTreeNode *search(int k);

    void insertNonFull(Record k);

    void splitChild(int i, BTreeNode *y);

    friend class BTree;
};

class BTree {
public:
    BTreeNode *root;
    int t;

    BTree(int _t) {
        root = nullptr;
        t = _t;
    }

    void traverse() {
        if (root != nullptr) root->traverse();
    }

    BTreeNode *search(int k) {
        return (root == nullptr) ? nullptr : root->search(k);
    }

    void insert(Record k);
};

BTreeNode::BTreeNode(int t1, bool leaf1) {
    t = t1;
    leaf = leaf1;

    keys.resize(2*t-1);
    C.resize(2*t);

    n = 0;
}

void BTreeNode::traverse() {
    int i;
    for (i = 0; i < n; i++) {
        if (leaf == false)
            C[i]->traverse();
        cout << "ID: " << keys[i].id << ", Data: " << keys[i].data << endl;
    }

    if (leaf == false)
        C[i]->traverse();
}

BTreeNode *BTreeNode::search(int k) {
    int i = 0;
    while (i < n && k > keys[i].id)
        i++;

    if (keys[i].id == k)
        return this;

    if (leaf == true)
        return nullptr;

    return C[i]->search(k);
}

void BTree::insert(Record k) {
    if (root == nullptr) {
        root = new BTreeNode(t, true);
        root->keys[0] = k;
        root->n = 1;
    } else {
        if (root->n == 2*t-1) {
            BTreeNode *s = new BTreeNode(t, false);
            s->C[0] = root;
            s->splitChild(0, root);

            int i = 0;
            if (s->keys[0].id < k.id)
                i++;
            s->C[i]->insertNonFull(k);

            root = s;
        } else
            root->insertNonFull(k);
    }
}

void BTreeNode::insertNonFull(Record k) {
    int i = n-1;

    if (leaf == true) {
        while (i >= 0 && keys[i].id > k.id) {
            keys[i+1] = keys[i];
            i--;
        }

        keys[i+1] = k;
        n = n+1;
    } else {
        while (i >= 0 && keys[i].id > k.id)
            i--;

        if (C[i+1]->n == 2*t-1) {
            splitChild(i+1, C[i+1]);

            if (keys[i+1].id < k.id)
                i++;
        }
        C[i+1]->insertNonFull(k);
    }
}

void BTreeNode::splitChild(int i, BTreeNode *y) {
    BTreeNode *z = new BTreeNode(y->t, y->leaf);
    z->n = t - 1;

    for (int j = 0; j < t-1; j++)
        z->keys[j] = y->keys[j+t];

    if (y->leaf == false) {
        for (int j = 0; j < t; j++)
            z->C[j] = y->C[j+t];
    }

    y->n = t - 1;

    for (int j = n; j >= i+1; j--)
        C[j+1] = C[j];

    C[i+1] = z;

    for (int j = n-1; j >= i; j--)
        keys[j+1] = keys[j];

    keys[i] = y->keys[t-1];

    n = n + 1;
}

int main() {
    BTree t(3);
    t.insert(Record(10, "Record A"));
    t.insert(Record(20, "Record B"));
    t.insert(Record(5, "Record C"));
    t.insert(Record(6, "Record D"));
    t.insert(Record(12, "Record E"));
    t.insert(Record(30, "Record F"));
    t.insert(Record(7, "Record G"));
    t.insert(Record(17, "Record H"));

    cout << "Traversal of the constructed B-Tree is:" << endl;
    t.traverse();

    return 0;
}

Output

C++
Traversal of the constructed B-Tree is:
ID: 5, Data: Record C
ID: 6, Data: Record D
ID: 7, Data: Record G
ID: 10, Data: Record A
ID: 12, Data: Record E
ID: 17, Data: Record H
ID: 20, Data: Record B
ID: 30, Data: Record F

Explanation

In this real-world example, we use a B-Tree to manage a database index with records containing an ID and data. The output shows the traversal of the B-Tree, demonstrating how the records are stored and accessed efficiently.

Conclusion

B-Trees are a powerful data structure that ensures balanced tree structures, providing efficient operations for search, insertion, and deletion. Understanding and implementing B-Trees in C++ can significantly enhance the performance of your applications that require sorted data structures, such as databases and file systems. By following the examples provided in this article, you should now have a solid foundation for implementing and utilizing B-Trees in your projects