Prerequisites: Introduction to hashing and collision handling using separate chaining
How hashing works:
Inserting a key (K) – value (V) pair into a hash map takes two steps (a minimal sketch follows the list below):
- K is converted into a small integer (called its hash code) using a hash function.
- The hash code is used to find an index (hashCode % arrSize), and the linked list stored at that index (separate chaining) is first searched for the presence of K.
- If K is found, its value is updated; if not, the K-V pair is stored as a new node in the list.
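The following is a minimal sketch of these two insertion steps, assuming a plain bucket array with separate chaining; the names Node, buckets, arrSize and put are illustrative only and are not part of the full program given later in this article.
Java
// A minimal sketch of the two insertion steps (illustrative names: Node, buckets, arrSize, put).
public class InsertSketch {

    static class Node {
        int key;
        String value;
        Node next;
        Node(int key, String value) { this.key = key; this.value = value; }
    }

    static final int arrSize = 5;              // size of the bucket array
    static Node[] buckets = new Node[arrSize]; // separate-chaining buckets

    static void put(int key, String value) {
        // Step 1: compute the hash code of the key
        int hashCode = Integer.hashCode(key);

        // Step 2: map the hash code to an index and search that chain first
        int index = Math.abs(hashCode % arrSize);
        for (Node cur = buckets[index]; cur != null; cur = cur.next) {
            if (cur.key == key) {   // key already present: just update the value
                cur.value = value;
                return;
            }
        }

        // Key not present: store the K-V pair as a new head node of the chain
        Node node = new Node(key, value);
        node.next = buckets[index];
        buckets[index] = node;
    }

    public static void main(String[] args) {
        put(1, "Geeks");
        put(6, "forGeeks");   // 6 % 5 == 1, so it chains together with key 1
        System.out.println(buckets[1].key + " -> " + buckets[1].next.key); // prints 6 -> 1
    }
}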
Complexity and load factor
- For the first step, the time taken depends on K and on the hash function. For example, if the key is the string "abcd", its hash function may depend on the length of the string. But for very large values of n, the number of entries in the map, the length of the keys is almost negligible compared to n, so the hash computation can be considered to take constant time, i.e. O(1).
- For the second step, the list of K-V pairs present at that index has to be traversed. The worst case is that all n entries land at the same index, in which case the time complexity would be O(n). But enough research has gone into making hash functions distribute the keys uniformly over the array, so this almost never happens.
- So, on average, if there are n entries and b is the size of the array, there will be n/b entries at each index. This value n/b is called the load factor, and it represents the load on our map.
- This load factor should be kept low, so that the number of entries at any one index stays small and the complexity stays almost constant, i.e. O(1). A short numeric illustration follows below.
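As a quick numeric illustration (a standalone sketch, not taken from any particular library), the load factor is simply n divided by b; with 4 entries in a bucket array of size 5 it already exceeds the common default threshold of 0.75:
Java
// Load factor = n / b, where n = number of stored entries and b = bucket-array size.
public class LoadFactorExample {
    public static void main(String[] args) {
        int n = 4;                           // entries currently stored
        int b = 5;                           // size of the bucket array
        double loadFactor = (double) n / b;  // 4 / 5 = 0.8
        System.out.println("Load factor = " + loadFactor);
        // 0.8 > 0.75, so a map with the default threshold would rehash at this point.
    }
}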
Rehashing:
As the name suggests, rehashing means hashing again. Basically, when the load factor rises above its predefined value (the default load factor is 0.75), the complexity increases. To overcome this, the size of the array is increased (doubled) and all the values are hashed again and stored in the new array of double size, keeping the load factor and the complexity low. Note that the entries cannot simply be copied across: a key's index depends on the array size, so, for example, a key with hash code 7 lands at index 7 % 5 = 2 in an array of size 5 but at index 7 % 10 = 7 once the array is doubled to size 10.
Why rehashing?
Rehashing is done because whenever new key-value pairs are inserted into the map, the load factor increases, which means the time complexity also increases, as explained above. This might not give the required time complexity of O(1).
Hence, rehashing must be done, increasing the size of the bucketArray so as to reduce the load factor and the time complexity.
How is Rehashing done?
Rehashing can be done as follows:
- For each addition of a new entry to the map, check the load factor.
- If it is greater than its predefined value (or the default value of 0.75 if none is given), then rehash.
- To rehash, make a new array of double the previous size and make it the new bucket array.
- Then traverse each element in the old bucketArray and call insert() on each of them to insert it into the new, larger bucket array (a compact sketch of this procedure appears below, before the full program).
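Below is a compact, self-contained sketch of this procedure, assuming int keys and String values; the names (insert, rehash, numBuckets, and so on) mirror the full generic program that follows but are illustrative here.
Java
// Sketch of rehashing: double the bucket array and re-insert every stored node.
public class RehashSketch {

    static class Node {
        int key;
        String value;
        Node next;
        Node(int key, String value) { this.key = key; this.value = value; }
    }

    Node[] buckets = new Node[5];            // bucket array, initial size b = 5
    int numBuckets = 5;                      // b
    int size = 0;                            // n
    static final double LOAD_FACTOR = 0.75;  // rehash threshold

    void insert(int key, String value) {
        int index = Math.abs(Integer.hashCode(key) % numBuckets);
        for (Node cur = buckets[index]; cur != null; cur = cur.next) {
            if (cur.key == key) { cur.value = value; return; }  // update in place
        }
        Node node = new Node(key, value);    // new head of the chain
        node.next = buckets[index];
        buckets[index] = node;
        size++;
        if ((double) size / numBuckets > LOAD_FACTOR) {
            rehash();                        // load factor exceeded: grow the array
        }
    }

    void rehash() {
        Node[] old = buckets;                // keep the old chains
        numBuckets *= 2;                     // double the capacity
        buckets = new Node[numBuckets];      // fresh, empty bucket array
        size = 0;                            // insert() will recount the pairs
        for (Node head : old) {              // walk every old chain...
            for (Node cur = head; cur != null; cur = cur.next) {
                insert(cur.key, cur.value);  // ...and re-hash each pair into the new array
            }
        }
    }

    public static void main(String[] args) {
        RehashSketch map = new RehashSketch();
        for (int k = 1; k <= 5; k++) {
            map.insert(k, "v" + k);          // the 4th insert makes 4/5 = 0.8 > 0.75
        }
        System.out.println("Buckets after rehash: " + map.numBuckets); // prints 10
    }
}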
Program to implement Rehashing:
Java
// Java program to implement Rehashing
import java.util.ArrayList;

class Map<K, V> {

    class MapNode<K, V> {
        K key;
        V value;
        MapNode<K, V> next;

        public MapNode(K key, V value)
        {
            this.key = key;
            this.value = value;
            next = null;
        }
    }

    // The bucket array where
    // the nodes containing K-V pairs are stored
    ArrayList<MapNode<K, V> > buckets;

    // No. of pairs stored - n
    int size;

    // Size of the bucketArray - b
    int numBuckets;

    // Default loadFactor
    final double DEFAULT_LOAD_FACTOR = 0.75;

    public Map()
    {
        numBuckets = 5;
        buckets = new ArrayList<>(numBuckets);

        for (int i = 0; i < numBuckets; i++) {
            // Initialising to null
            buckets.add(null);
        }
        System.out.println("HashMap created");
        System.out.println("Number of pairs in the Map: " + size);
        System.out.println("Size of Map: " + numBuckets);
        System.out.println("Default Load Factor : " + DEFAULT_LOAD_FACTOR + "\n");
    }

    private int getBucketInd(K key)
    {
        // Using the inbuilt function from the object class
        int hashCode = key.hashCode();

        // array index = hashCode % numBuckets
        // (Math.abs guards against negative hash codes)
        return Math.abs(hashCode % numBuckets);
    }

    public void insert(K key, V value)
    {
        // Getting the index at which it needs to be inserted
        int bucketInd = getBucketInd(key);

        // The first node at that index
        MapNode<K, V> head = buckets.get(bucketInd);

        // First, loop through all the nodes present at that index
        // to check if the key already exists
        while (head != null) {

            // If already present the value is updated
            if (head.key.equals(key)) {
                head.value = value;
                return;
            }
            head = head.next;
        }

        // new node with the K and V
        MapNode<K, V> newElementNode = new MapNode<K, V>(key, value);

        // The head node at the index
        head = buckets.get(bucketInd);

        // the new node is inserted
        // by making it the head
        // and its next is the previous head
        newElementNode.next = head;
        buckets.set(bucketInd, newElementNode);

        System.out.println("Pair(" + key + ", " + value + ") inserted successfully.\n");

        // Incrementing size
        // as new K-V pair is added to the map
        size++;

        // Load factor calculated
        double loadFactor = (1.0 * size) / numBuckets;
        System.out.println("Current Load factor = " + loadFactor);

        // If the load factor is > 0.75, rehashing is done
        if (loadFactor > DEFAULT_LOAD_FACTOR) {
            System.out.println(loadFactor + " is greater than " + DEFAULT_LOAD_FACTOR);
            System.out.println("Therefore Rehashing will be done.\n");

            // Rehash
            rehash();

            System.out.println("New Size of Map: " + numBuckets + "\n");
        }

        System.out.println("Number of pairs in the Map: " + size);
        System.out.println("Size of Map: " + numBuckets + "\n");
    }

    private void rehash()
    {
        System.out.println("\n***Rehashing Started***\n");

        // The present bucket list is made temp
        ArrayList<MapNode<K, V> > temp = buckets;

        // New bucketList of double the old size is created
        buckets = new ArrayList<MapNode<K, V> >(2 * numBuckets);

        for (int i = 0; i < 2 * numBuckets; i++) {
            // Initialised to null
            buckets.add(null);
        }

        // Now size is made zero
        // and we loop through all the nodes in the original bucket list (temp)
        // and insert them into the new list
        size = 0;
        numBuckets *= 2;

        for (int i = 0; i < temp.size(); i++) {

            // head of the chain at that index
            MapNode<K, V> head = temp.get(i);

            while (head != null) {
                K key = head.key;
                V val = head.value;

                // calling the insert function for each node in temp
                // as the new list is now the bucketArray
                insert(key, val);
                head = head.next;
            }
        }

        System.out.println("\n***Rehashing Ended***\n");
    }

    public void printMap()
    {
        // The present bucket list is made temp
        ArrayList<MapNode<K, V> > temp = buckets;

        System.out.println("Current HashMap:");

        // loop through all the nodes and print them
        for (int i = 0; i < temp.size(); i++) {

            // head of the chain at that index
            MapNode<K, V> head = temp.get(i);

            while (head != null) {
                System.out.println("key = " + head.key + ", val = " + head.value);
                head = head.next;
            }
        }
        System.out.println();
    }

    // Function to get an element from Map
    public V getValue(K key)
    {
        // Get actual index of the key
        int actualIndex = getBucketInd(key);
        MapNode<K, V> temp = buckets.get(actualIndex);

        // Search for key in list
        while (temp != null) {
            if (temp.key.equals(key)) {
                return temp.value;
            }
            temp = temp.next;
        }
        return null;
    }
}

public class GFG {
    public static void main(String[] args)
    {
        // Creating the Map
        Map<Integer, String> map = new Map<Integer, String>();

        // Inserting elements
        map.insert(1, "Geeks");
        map.printMap();

        map.insert(2, "forGeeks");
        map.printMap();

        map.insert(3, "A");
        map.printMap();

        map.insert(4, "Computer");
        map.printMap();

        map.insert(5, "Portal");
        map.printMap();

        // Get element from Map
        int key = 4;
        String value = map.getValue(key);
        System.out.println("Value at key " + key + " is: " + value);
    }
}
Python3
# Python3 program to implement Rehashing


class Map:

    class MapNode:
        def __init__(self, key, value):
            self.key = key
            self.value = value
            self.next = None

    # The bucket array where
    # the nodes containing K-V pairs are stored
    buckets = list()

    # No. of pairs stored - n
    size = 0

    # Size of the bucketArray - b
    numBuckets = 0

    # Default loadFactor
    DEFAULT_LOAD_FACTOR = 0.75

    def __init__(self):
        Map.numBuckets = 5
        Map.buckets = [None] * Map.numBuckets

        print("HashMap created")
        print("Number of pairs in the Map: " + str(Map.size))
        print("Size of Map: " + str(Map.numBuckets))
        print("Default Load Factor : " + str(Map.DEFAULT_LOAD_FACTOR) + "\n")

    def getBucketInd(self, key):

        # Using the inbuilt hash function
        hashCode = hash(key)

        # array index = hashCode % numBuckets
        return hashCode % Map.numBuckets

    def insert(self, key, value):

        # Getting the index at which it needs to be inserted
        bucketInd = self.getBucketInd(key)

        # The first node at that index
        head = Map.buckets[bucketInd]

        # First, loop through all the nodes present at that index
        # to check if the key already exists
        while head is not None:

            # If already present the value is updated
            if head.key == key:
                head.value = value
                return
            head = head.next

        # new node with the K and V
        newElementNode = Map.MapNode(key, value)

        # The head node at the index
        head = Map.buckets[bucketInd]

        # the new node is inserted
        # by making it the head
        # and its next is the previous head
        newElementNode.next = head
        Map.buckets[bucketInd] = newElementNode

        print("Pair(\" {} \", \" {} \") inserted successfully.".format(key, value))

        # Incrementing size
        # as new K-V pair is added to the map
        Map.size += 1

        # Load factor calculated
        loadFactor = (1 * Map.size) / Map.numBuckets
        print("Current Load factor = " + str(loadFactor))

        # If the load factor is > 0.75, rehashing is done
        if loadFactor > Map.DEFAULT_LOAD_FACTOR:
            print(str(loadFactor) + " is greater than " + str(Map.DEFAULT_LOAD_FACTOR))
            print("Therefore Rehashing will be done.")

            # Rehash
            self.rehash()

            print("New Size of Map: " + str(Map.numBuckets))

        print("Number of pairs in the Map: " + str(Map.size))
        print("Size of Map: " + str(Map.numBuckets))

    def rehash(self):

        print("\n***Rehashing Started***\n")

        # The present bucket list is made temp
        temp = Map.buckets

        # New bucketList of double the old size is created
        Map.buckets = []
        for i in range(2 * Map.numBuckets):
            # Initialised to null
            Map.buckets.append(None)

        # Now size is made zero
        # and we loop through all the nodes in the original bucket list (temp)
        # and insert them into the new list
        Map.size = 0
        Map.numBuckets *= 2

        for i in range(len(temp)):

            # head of the chain at that index
            head = temp[i]

            while head is not None:
                key = head.key
                val = head.value

                # calling the insert function for each node in temp
                # as the new list is now the bucketArray
                self.insert(key, val)
                head = head.next

        print("\n***Rehashing Ended***")

    def printMap(self):

        # The present bucket list is made temp
        temp = Map.buckets

        print("Current HashMap:")

        # loop through all the nodes and print them
        for i in range(len(temp)):

            # head of the chain at that index
            head = temp[i]

            while head is not None:
                print("key = \" {} \", val = {}".format(head.key, head.value))
                head = head.next
        print()


if __name__ == '__main__':

    # Creating the Map
    map = Map()

    # Inserting elements
    map.insert(1, "Geeks")
    map.printMap()

    map.insert(2, "forGeeks")
    map.printMap()

    map.insert(3, "A")
    map.printMap()

    map.insert(4, "Computer")
    map.printMap()

    map.insert(5, "Portal")
    map.printMap()

# This code is contributed by Amartya Ghosh
Output:
HashMap created
Number of pairs in the Map: 0
Size of Map: 5
Default Load Factor : 0.75

Pair(1, Geeks) inserted successfully.
Current Load factor = 0.2
Number of pairs in the Map: 1
Size of Map: 5

Current HashMap:
key = 1, val = Geeks

Pair(2, forGeeks) inserted successfully.
Current Load factor = 0.4
Number of pairs in the Map: 2
Size of Map: 5

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks

Pair(3, A) inserted successfully.
Current Load factor = 0.6
Number of pairs in the Map: 3
Size of Map: 5

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks
key = 3, val = A

Pair(4, Computer) inserted successfully.
Current Load factor = 0.8
0.8 is greater than 0.75
Therefore Rehashing will be done.

***Rehashing Started***

Pair(1, Geeks) inserted successfully.
Current Load factor = 0.1
Number of pairs in the Map: 1
Size of Map: 10

Pair(2, forGeeks) inserted successfully.
Current Load factor = 0.2
Number of pairs in the Map: 2
Size of Map: 10

Pair(3, A) inserted successfully.
Current Load factor = 0.3
Number of pairs in the Map: 3
Size of Map: 10

Pair(4, Computer) inserted successfully.
Current Load factor = 0.4
Number of pairs in the Map: 4
Size of Map: 10

***Rehashing Ended***

New Size of Map: 10

Number of pairs in the Map: 4
Size of Map: 10

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks
key = 3, val = A
key = 4, val = Computer

Pair(5, Portal) inserted successfully.
Current Load factor = 0.5
Number of pairs in the Map: 5
Size of Map: 10

Current HashMap:
key = 1, val = Geeks
key = 2, val = forGeeks
key = 3, val = A
key = 4, val = Computer
key = 5, val = Portal