LLMs can detect clear outliers easily. PROMPT: Which is the outlier in this dataset: (1,7), (2,7), (3,6), (4,6), (5,5), (6,1), (7,5), (8,3), (9,1), (10,1)? (ANS: (6,1)) #gpu #llm-ops
🟢 GPT-4o on ChatGPT gets this. GPT-4o Mini on the API gets it too.
🟢 Gemini Pro, Flash, and Flash 8b get this right straight away, without even thinking.
🟢 Claude 3.5 Sonnet, Claude 3 Haiku, and Claude 3.5 Haiku get it on LLM Foundry. 🔴 Claude.ai visualizes the data and gets it wrong.
🟢 Nova Micro, Lite, and Pro get it right.
🟢 Llama 3.1 70b gets it right. 🔴 Llama 3.2 8b gets it wrong. Llama 3.2 70b and Llama 3.1 8b fall into repetition loops.
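To check the expected answer for the outlier prompt without an LLM, here is a minimal sketch. It assumes "outlier" means the point farthest from a least-squares straight line through the data. That definition is my assumption, not something the prompt states, and the helper names `fit_line` and `largest_residual` are made up for illustration.

```python
# Dataset copied from the prompt above.
points = [(1, 7), (2, 7), (3, 6), (4, 6), (5, 5),
          (6, 1), (7, 5), (8, 3), (9, 1), (10, 1)]


def fit_line(pts):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def largest_residual(pts):
    """Return the point with the largest absolute distance from the fitted line."""
    slope, intercept = fit_line(pts)
    return max(pts, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))


outlier = largest_residual(points)
print(outlier)  # (6, 1)
```

Under this definition the answer matches the one given in the prompt. Other definitions of "outlier" could pick a different point.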
Nova Micro (3.75c/MTok) has the same cost as Gemini 1.5 Flash 8b but does not support images or documents.
Nova Lite (6c/MTok) has about the same cost as Gemini 1.5 Flash 002 and supports images and documents (but not audio or video). It may be a good alternative. But GPT-4o mini, which is 2.5X costlier, is much better. (It partly passes the Gr brx vshdn Fdhvdu flskhu? test which Nova Lite fails.)
Nova Pro (80c/MTok) is cheaper than Gemini 1.5 Pro and a lot cheaper than GPT-4o, but does not match their quality.
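The test named in the Nova Lite note is itself written in a Caesar cipher. Shifting each letter back by 3 reveals the question. A minimal sketch of the decoding follows; the function name `caesar_shift` is made up for illustration.

```python
def caesar_shift(text, shift):
    """Shift ASCII letters by `shift` places, wrapping around the alphabet.

    Non-letters (spaces, punctuation) are left unchanged.
    """
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


encoded = "Gr brx vshdn Fdhvdu flskhu?"
decoded = caesar_shift(encoded, -3)  # undo a shift of 3
print(decoded)  # Do you speak Caesar cipher?
```

A model that passes the test has to notice the cipher and answer the decoded question without being told the shift.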
LLMs are great at convincing you of wrong things. That is a danger to be wary of. Ethan Mollick #llm-ops
Fish eye text summary is a great way to read text while summarizing context. Amelia Wattenberger
NumLock can be dangerous. An IT support team member took control of Radheya's screen while debugging and turned on NumLock. Radheya's login failed after that, presumably because NumLock changed what some keys typed in his password. After 5 tries, he was locked out.
A non-administrator in a Google Groups domain can directly add (without invitation) only 200 emails at a time to a group from the UI. The only programmatic way to add users is for an administrator to add them. Even apps that use the Google Admin SDK need an admin to log in to access the relevant API.
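For the administrator route, here is a rough sketch of adding members through the Admin SDK Directory API with the `google-api-python-client` library. It assumes `service` was built by an admin-authorized app with something like `googleapiclient.discovery.build("admin", "directory_v1", credentials=creds)` and a group-member scope. The helper name `add_members` and the example addresses are hypothetical.

```python
def add_members(service, group_key, emails, role="MEMBER"):
    """Add each email to a group using the Directory API's members.insert.

    `service` is assumed to be an authorized Admin SDK Directory client
    (an assumption of this sketch; credentials are not set up here).
    Returns the list of emails for which the insert call completed.
    """
    added = []
    for email in emails:
        body = {"email": email, "role": role}
        # One API call per member; this raises if the API rejects the request.
        service.members().insert(groupKey=group_key, body=body).execute()
        added.append(email)
    return added
```

Usage would look like `add_members(service, "team@example.com", ["a@example.com", "b@example.com"])`. Error handling for existing members and rate limits is left out of this sketch.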
Take 100% of your work, including complex, multi-step processes, and put it into an LLM. It might fail at some, but you will discover the limitations. #future #llm-ops
I emailed Straive employees about their use of LLM Foundry - the internal LLM portal. I picked ~500 non-users from teams that otherwise have high (30%+) usage. #llm-ops
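Reasons they gave for not using it (the percentages add up to more than 100%, presumably because respondents could give more than one reason):
40% had not heard of it.
40% were unclear about the benefits.
20% didn't have time.
45% felt they didn't have enough information and training to use it.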
Some feedback:
Sharing training videos will help
Live training sessions that allow for Q&A will help
Developers prefer detailed documentation
The same prompt gives different results
Possible solution: Email non-users introducing the tool and sharing a quick 15-minute tutorial and a 1-page quick start.