Hashing in C and C++

In this tutorial you will learn about Hashing in C and C++ with program example. You will also learn various concepts of hashing like hash table, hash function, etc.

Hashing in Data Structure

Searching is dominant operation on any data structure. Most of the cases for inserting, deleting, updating all operations required searching first. So searching operation of particular data structure determines it’s time complexity. If we take any data structure the best time complexity for searching is O (log n) in AVL tree and sorted array only. Most of the cases it will take O (n) time. To solve this searching problem hashing concept introduced which will take O (1) time for searching. It’s constant time.

Hash Table and Hash Function

Earlier when this concept introduced programmers used to create “Direct address table”. Direct address table means, when we have “n” number of unique keys we create an array of length “n” and insert element “i” at ith index of the array. That array is called Hash Table. But due to this method even we have 10 elements of each range 1 lack, we should create table of size 1 lack for only 10 elements. Which is going to be waste of memory.

To avoid this problem we fix the size of hash table (array) and map our elements into that table using a function, called Hash function. This function decides where to put a given element into that table. If we want to search also first apply hash function decide whether the element present in hash table or not.

Example

We have numbers from 1 to 100 and hash table of size 10. Hash function is mod 10. That means number 23 will be mapped to (23 mod 10 = 3) 3rd index of hash table.

Hashing, Hash Table, Hash Function

But problem is if elements (for example) 2, 12, 22, 32, elements need to be inserted then they try to insert at index 2 only. This problem is called Collision. To solve this collision problem we use different types of hash function techniques. Those are given below.

  1. Chaining
  2. Open addressing
    1. Linear probing
    2. Quadratic probing
    3. Double hashing

These also called collision resolution techniques.

Chaining

In hash table instead of putting one element in index we maintain a linked list. When collision happened we place that element in corresponding linked list. Here some space is wasted because of pointers.

Hashing 1

Open Addressing

In case if we have collision we again calculate the hash value using corresponding hash function. But this time we do some minor modifications to that input. This process of searching for empty space to insert element in called Probing.

Hashing 2

Probing hash function is: h (k, i)

here k is the key value which is to be inserted. And i is number of collision with that element.

Example: If we are inserting 2, we find its hash value using h (2, 0) because it’s first collision. Suppose the answer (index) to this function index already occupied we again need to apply h (2, 1) to hash function.

Linear Probing

Let hash function is h, hash table contains  0 to n-1 slots.

Now we want to insert an element k. Apply h (k). If it results “x” and the index “x” already contain a value then we again apply hash function that h (k, 1) this equals to (h (k) + 1) mod n.

General form: h1 (k, j) = (h (k) + j) mod n

Example: Let hash table of size 5 which has function is mod 5 has already filled at positions 0, 2, 3.

Now new element 10 will try to insert. 10 mod 5 = 0. But index 0 already occupied. So it checks (probes) next (index 1) position. So 10 will insert at index 1.

Now element 11 will try to insert. 11 mod 5 = 1. But index 1 already occupied, check index 2 it also occupied (data given), 3 also occupied. So it will insert at index 4 which is empty.

We can observe that it linearly checking for next empty position. So it’s called linear probing.

Problems with linear probing:

  • Primary clustering: There is a chance that continuous cells occupied, then the probability of new element insertion will get reduced. This problem is called primary clustering
  • Secondary clustering: If two elements get same value at first hash function, they follow same probe sequence.

Quadratic Probing

It is same as linear probing. But when collision occurs we use different function. If collision happened that element try to occupy at quadratic distance instead of linear distance.

Due to this “Primary clustering” will be reduced. But secondary clustering won’t be eliminated.

Double Hashing

Here we use two hash functions.

h1 (k) = (h1 (k) + i h2 (k)) mod n. Here h1 and h2 are two hash functions.

Here the next prob position will depend on two functions h1 and h2 also.

Advantages by this method are there is no chance of primary clustering. And also Secondary clustering also eliminated.

Chaining vs Open Addressing

Chaining Open Addressing
Elements can be stored at outside of the table In open addressing elements should be stored inside the table only
In chaining at any time the number of elements in the hash table may greater than the size of the hash table In open addressing the number of elements present in the hash table will not exceed to number of indices in hash table.
In case of deletion chaining is the best method If deletion is not required. Only inserting and searching is required open addressing is better
Chaining requires more space Open addressing requires less space than chaining.

Program for Hashing in C

Below is the implementation of hashing or hash table in C.

Output

Enter size of hash table

10

Enter hash function [if mod 10 enter 10]

10

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

1

Enter key element to insert

12

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

1

Enter key element to insert

22

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

1

Enter key element to insert

32

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

3

Index   Value

0       -2147483648

1       -2147483648

2       12

3       22

4       32

5       -2147483648

6       -2147483648

7       -2147483648

8       -2147483648

9       -2147483648

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

2

Enter element to delete

12

Element deleted

 

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

4

Enter element you want to search

32

Element found at index 4

Enter your choice

 1-> Insert

 2-> Delete

 3->Display

 4->Searching

 0->Exit

0

Program for Hashing in C++

Below is the implementation of hashing or hash table in C++.

Comment below if have queries or found anything incorrect in above tutorial for Hashing in C and C++.

Category: DSA

Leave a Reply

Your email address will not be published. Required fields are marked *