Experimenting with Llama 3.1 – 405B Model with 128k window size (8B and 7B)
The Llama 3.1 small batch size weighs around 15 GB and consumed 15.03 GB of system memory. The medium batch size is over 16 GB and consumed 16.30 GB on my system. You can run the small and medium batch sizes only if you have at least 20 GB of RAM (on Windows, or a little less on a lightweight Linux distribution) and 6 GB of GPU memory.
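To make the arithmetic behind that 20 GB recommendation concrete, here is a minimal Python sketch that works out how much RAM is left over for the operating system once the model is loaded. The memory figures are the ones reported above. The function names and the 3 GB default reserve are hypothetical and chosen only for illustration; they are not measured values.

```python
# Memory consumed on the author's system, in GB (figures from the text above).
REPORTED_USAGE_GB = {"small": 15.03, "medium": 16.30}

# Minimum RAM the text recommends on Windows, in GB.
RECOMMENDED_RAM_GB = 20.0


def headroom_gb(variant: str, total_ram_gb: float = RECOMMENDED_RAM_GB) -> float:
    """Return the RAM left for the OS and other processes, in GB."""
    return round(total_ram_gb - REPORTED_USAGE_GB[variant], 2)


def fits(variant: str, total_ram_gb: float, reserve_gb: float = 3.0) -> bool:
    """Rough check: does the variant load while leaving reserve_gb free?

    reserve_gb is an assumed allowance for the OS, not a measured value.
    """
    return headroom_gb(variant, total_ram_gb) >= reserve_gb


if __name__ == "__main__":
    for name in REPORTED_USAGE_GB:
        print(f"{name}: {headroom_gb(name)} GB free on a 20 GB machine, "
              f"fits in 16 GB: {fits(name, 16.0)}")
```

On a lightweight Linux distribution the reserve can probably be set lower, which matches the note above that a little less RAM may be enough there.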