Sun, Dec 8, 2024 | Anand S - Things I Learned

NumLock can be dangerous. An IT support team member took control of Radheya's screen while debugging and turned on NumLock. Radheya's login failed after that, and after 5 tries, he was locked out.

With LLMs, most architectural decisions are no longer one-way doors. Steve Yegge #future #llm-ops

To install Docker on Windows without admin privileges, use `net localgroup docker-users "your-user-id" /ADD`

A non-administrator in a Google Groups domain can add at most 200 email addresses at a time to a group directly (without invitation) from the UI. The only programmatic way to add users is for an administrator to add them. Even apps that use the Google Admin SDK need an admin to log in to access the relevant API.

Take 100% of your work, including complex, multi-step processes, and put it into an LLM. It might fail at some, but you will discover the limitations. #future #llm-ops

I emailed Straive employees about their use of LLM Foundry - the internal LLM portal. I picked ~500 non-users from teams that otherwise have high (30%+) usage. #llm-ops
- Reasons they didn't use it were:
  - 40% had not heard of it.
  - 40% were unclear about the benefits.
  - 20% didn't have time.
- 45% feel they don't have enough information and training to use it
- Some feedback
  - Sharing training videos will help
  - Live training sessions that allow for Q&A will help
  - Developers prefer detailed documentation
  - The same prompt gives different results
- Possible solution: Email non-users introducing the tool and sharing a quick 15-minute tutorial and a 1-page quick start.

ChatGPT uses several unusual Unicode characters for citations. Ref #chatgpt
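One way to spot such characters in copied text is to flag anything in the private-use, format, or unassigned Unicode categories. This is only a minimal sketch: the sample string and its code points are hypothetical, and the assumption that the citation markers fall into these categories is mine, not something the note states.

```python
import unicodedata

# Categories treated as "unusual" here (an assumption for illustration):
# Co = private use, Cf = format (e.g. zero-width chars), Cn = unassigned.
SUSPICIOUS = {"Co", "Cf", "Cn"}

def unusual_chars(text):
    """Return (index, code point, category) for each suspicious character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.category(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) in SUSPICIOUS
    ]

def strip_unusual(text):
    """Remove the suspicious characters, keeping everything else."""
    return "".join(ch for ch in text if unicodedata.category(ch) not in SUSPICIOUS)

# Hypothetical sample: two private-use characters wrapped around a marker.
sample = "See the docs\ue200cite\ue201 for details."
found = unusual_chars(sample)
cleaned = strip_unusual(sample)
print(found)
print(cleaned)
```

Note that stripping only removes the invisible markers. Any visible text between them (here, "cite") stays and needs separate handling.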

The cost of intelligence is trending to zero. How do we plan for this? Logan Kilpatrick
- If you are not planning for the price of intelligence to go to zero, the next 3-5 years are going to be incredibly disruptive to your business / life.
- The important but not stated caveat: consumer willingness to pay for AI is going to go up (a lot). It will be fascinating to watch consumer willingness, cost, and the amount of AI being used all move in different directions.
- Everyone building things with AI has an economic incentive to limit the amount of AI because of cost, which inherently limits the value prop. This will change as intelligence goes up and cost goes down.
- What this means is:
  - Admin automation: Administrative tasks vanish into background AI. Booking meetings, managing finances, or even planning family activities will require less thought.
  - Hyper-personalization: Individuals get tailor-made everything—from medical advice to product recommendations to daily schedules. Systems learn your quirks.
  - AI co-brains: AI co-worker “assistants” support you at any moment. Productivity soars in knowledge work. “I’ll have my AI follow up” becomes a normal response.
  - Humanity valued more: As AI handles rote tasks, humans move up the value chain, focusing on creativity, empathy, or the “last-mile” decisions.
  - New business models:
    - AI experts as a service
    - Embedded AI Solutions
    - AI micro-services for smart-calls
    - Distributed AI

Arena Hard is a set of hard prompts to test LLMs. Here is the code and evaluation #ai-coding-tools #llm-ops

LLMs can detect clear outliers easily. PROMPT: Which is the outlier in this dataset: (1,7), (2,7), (3,6), (4,6), (5,5), (6,1), (7,5), (8,3), (9,1), (10,1) (ANS: (6,1)) #gpu #llm-ops
- 🟢 GPT-4o on ChatGPT gets this. GPT-4o Mini on the API gets it too.
- 🟢 Gemini Pro, Flash, Flash 8b get this right straight away, without even thinking.
- 🟢 Claude 3.5 Sonnet, Claude 3 Haiku, Claude 3.5 Haiku get it on LLM Foundry. 🔴 Claude.ai visualizes the data and gets it wrong.
- 🟢 Nova Micro, Lite, and Pro get it right.
- 🟢 Llama 3.1 70b gets it right. 🔴 Llama 3.2 8b gets it wrong. Llama 3.2 70b, Llama 3.1 8b get stuck in repetition.
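To sanity-check the expected answer without an LLM, one option is to fit a straight line to the points and pick the point farthest from it. This is a rough sketch of one possible definition of "outlier" (largest least-squares residual), not necessarily the reasoning any of the models used. The function names are made up for illustration.

```python
def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def largest_residual(points):
    """Return the point whose y value is farthest from the fitted line."""
    slope, intercept = fit_line(points)
    return max(points, key=lambda p: abs(p[1] - (slope * p[0] + intercept)))

# The dataset from the prompt above.
points = [(1, 7), (2, 7), (3, 6), (4, 6), (5, 5),
          (6, 1), (7, 5), (8, 3), (9, 1), (10, 1)]
outlier = largest_residual(points)
print(outlier)  # (6, 1)
```

The fitted line drops by roughly 0.7 per step, so it predicts about 3.8 at x = 6. The observed value of 1 is the largest miss in the set.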

My notes on the Amazon Nova models. More on Hacker News
- Nova Micro (3.75c/MTok) has the same cost as Gemini 1.5 Flash 8b but does not support images or documents.
- Nova Lite (6c/MTok) has about the same cost as Gemini 1.5 Flash 002 and supports images and documents (but not audio or video). It may be a good alternative. But GPT-4o mini, which is 2.5X costlier, is much better. (It partly passes the Gr brx vshdn Fdhvdu flskhu? test which Nova Lite fails.)
- Nova Pro (80c/MTok) is cheaper than Gemini 1.5 Pro and a lot cheaper than GPT 4o, but does not match their quality.
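The test string in the Nova Lite note is a Caesar cipher with every letter shifted forward by 3. Here is a small sketch for decoding it; the helper name `caesar_shift` is just for illustration.

```python
def caesar_shift(text, shift):
    """Shift ASCII letters by `shift` positions, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

# Shifting back by 3 recovers the plain-text question.
decoded = caesar_shift("Gr brx vshdn Fdhvdu flskhu?", -3)
print(decoded)  # Do you speak Caesar cipher?
```

A model passes the test if it recognizes the cipher and answers the decoded question rather than treating the text as noise.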

LLMs are great at convincing you of wrong things. A danger and something to be wary of. Ethan Mollick #llm-ops

Fish eye text summary is a great way to read text while summarizing context. Amelia Wattenberger #speech-to-text