Sun, Jan 26, 2025

Weekly Notes and Discoveries | S Anand

Something I learned from a Sikkil Gurucharan concert:
- Make the subject of your talk the hero, not yourself. Be a fan. Share your enthusiasm.
- Get into the zone while presenting.

Something I learned from Aboorva Singeetham:
- Kamal Haasan: "A farmer invests in crops. I'm an actor. So I invest in films." As a technologist, I guess I would invest in technology.
- "A person who has much more to give is unfazed by overwhelming demands because there is too much in him to overwhelm. He gives you 2 options in place of one."

According to Portkey's LLM usage analysis #cloud #gpu
- Anyscale and Fireworks AI have the lowest rates of server errors (5xx) and rate-limit errors (429) across providers
- Groq and Anthropic are among the highest; OpenAI is among the lowest; Google is in between
- OpenAI has lower error rates and lower latency than Azure
- They have a ~35% cache hit rate
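
To make the two metrics above concrete, here is a minimal sketch of how server-error (5xx) and rate-limit (429) rates could be computed per provider from a request log. The provider names and log entries are made up for illustration; they are not Portkey's data, and this is not necessarily how Portkey computes its numbers.

```python
from collections import Counter

# Hypothetical request log: (provider, HTTP status). Made-up data, not Portkey's.
requests_log = [
    ("provider_a", 200),
    ("provider_a", 429),
    ("provider_a", 200),
    ("provider_a", 503),
    ("provider_b", 200),
    ("provider_b", 200),
    ("provider_b", 200),
    ("provider_b", 500),
]


def error_rates(log):
    """Return {provider: {"server_error": share, "rate_limited": share}}."""
    totals, server_errors, rate_limited = Counter(), Counter(), Counter()
    for provider, status in log:
        totals[provider] += 1
        if 500 <= status <= 599:
            # 5xx: the provider failed to serve the request
            server_errors[provider] += 1
        elif status == 429:
            # 429: the provider throttled the request
            rate_limited[provider] += 1
    return {
        provider: {
            "server_error": server_errors[provider] / totals[provider],
            "rate_limited": rate_limited[provider] / totals[provider],
        }
        for provider in totals
    }


rates = error_rates(requests_log)
```

Keeping 5xx and 429 separate matters because they call for different responses: a 429 usually means slow down or spread load, while a 5xx means the provider itself is struggling.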

A few quick points supporting the mental model of "LLMs are aliens". #llm-ops
- LLMs are clearly not machines. They give different answers each time.
- LLMs are like humans: they exhibit human biases (e.g. guessing 42 or 37 often). But they fail in unusual ways. They can't count the "r"s in strawberry. They can go into an endless loop.
- LLMs are a new form of intelligence. Thinking of them as aliens might minimize our confusion.
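
For contrast, this is what "machine" behavior looks like on the strawberry example: a tiny sketch in ordinary code, which gives the same answer on every run.

```python
word = "strawberry"

# str.count scans the string character by character, so the result is exact
r_count = word.count("r")

# Running it again always gives the same result, unlike sampling from an LLM
assert r_count == word.count("r")

print(r_count)  # 3
```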

If you put an LLM in a feedback loop, it can optimize for its reward function by emotionally pushing people, generating misinformation, nudging towards a narrow definition of creativity, etc.: https://bsky.app/profile/emollick.bsky.social/post/3lg4darqwfc2d #llm-ops #optimization

ChatGPT's Scheduled Tasks are pretty bad at fetching the latest news. Their use of search is poor. (I'm not sure if they actually search.) I need to figure out other use cases for them. Possible options are: #chatgpt

DeepSeek does not enforce rate limits (via Simon Willison). Yet another reason to switch to DeepSeek. My other reasons are:
- Claude 3.5 Sonnet-level coding capability at 5% of the cost (soon to be 2.5%)
- Prompt caching by default
- Fill-in-the-middle completion
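
A rough sketch of what a fill-in-the-middle request looks like: you send the code before the gap as the prompt and the code after the gap as the suffix, and the model generates the middle. The endpoint URL, the `deepseek-chat` model name, and the `build_fim_request` helper are my assumptions based on my reading of DeepSeek's beta docs and may be out of date. The snippet only builds the request body; it does not send anything.

```python
import json

# Assumption: DeepSeek's beta FIM endpoint; check the current docs before use.
FIM_URL = "https://api.deepseek.com/beta/completions"


def build_fim_request(prefix, suffix, model="deepseek-chat", max_tokens=64):
    """Build (but do not send) a fill-in-the-middle request body.

    Hypothetical helper: the field names follow the OpenAI-style
    completions format, which I assume DeepSeek's beta API accepts.
    """
    return {
        "model": model,            # assumed model name
        "prompt": prefix,          # text before the gap
        "suffix": suffix,          # text after the gap
        "max_tokens": max_tokens,  # cap on the generated middle
    }


body = build_fim_request(
    prefix="def fib(n):\n    if n < 2:\n        return n\n",
    suffix="\n\nprint(fib(10))\n",
)

# JSON payload that would be POSTed to FIM_URL with an API key
payload = json.dumps(body)
```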