23 Oct 2023: bitsandbytes, QLoRA, llama.cpp and BigDL are ways of quantizing models that you can run directly in Google Colab. Try them. Also try OpenVINO. #ai-coding-tools#github#gpu#markdown
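A rough sketch of the idea underneath these tools, assuming the simplest possible scheme (one shared absmax scale, int8). The function names are made up for illustration and are not any library's API. The real libraries use more refined block-wise and 4-bit formats, such as NF4 in QLoRA.

```python
def quantize_absmax(weights, bits=8):
    """Map floats to signed integers using a single shared scale."""
    qmax = 2 ** (bits - 1) - 1              # 127 for int8
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest else 1.0  # avoid dividing by zero
    return [round(w / scale) for w in weights], scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.03, 1.0]           # toy values, not real model weights
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
print(q, scale, restored)
```

Each weight now needs 8 bits instead of 32, at the cost of a rounding error of at most half the scale per value.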
23 Oct 2023: Optimum is Hugging Face's set of performance optimization libraries for Transformers models. #gpu#optimization