#impossible | Anand S - Things I Learned

Thu, Jun 26, 2025. There are several brands with recognizable chart style guides. It's possible to generate style guides for these from the charts, but applying them via matplotlib is almost today. ChatGPT #chatgpt #impossible

Thu, Apr 3, 2025. ChatGPT can't yet create good sketchnotes. Here's the impact of US tariffs on India. ChatGPT #chatgpt #impossible

Wed, Mar 5, 2025. Reliably creating interactive tutorials is hard today. Claude 3.7 Sonnet ran out of tokens when I tried creating an interactive tutorial on diffraction. Cursor got the tokens but failed to get the application right after 3 attempts. This is not yet reliable, and when it does become reliable, education will change a fair bit. #tts #impossible

Tue, Oct 22, 2024. Gemini sort-of supports diarization. Ref. I tried it and it's OK but not perfect. #impossible
- LLMs cannot diarize reliably yet. (Gemini just guesses the speaker differences.)

Wed, Aug 7, 2024. Notes from a 4-hour flight: #llm-ops #prompt-engineering #try #impossible
- What We’ve Learned From A Year of Building with LLMs
  - Strategy
    - IS IT TOO HARD/EXPENSIVE? Log it. LLMs are getting cheaper and better.
    - WILL OPENAI BUILD IT? If so, wait for it instead of building.
    - HAS A STARTUP BUILT IT? If so, use it instead. It's a generic use case there's no point re-inventing.
    - FOCUSED USE CASES over generic. Build trust by starting small.
    - Tools for LLM Ops (feedback): LangSmith, Log10, LangFuse, W&B Weave, HoneyHive
    - Human in the Loop is about humans evaluating model outputs. That's different from AI in the loop, human in the center, where AI accelerates human output (like Github Copilot)
  - Operations
    - CHECK EMBEDDINGS DRIFT over time. Users might be input-ing different things than before.
    - LOG AND REVIEW everything.
    - Instructor coaxes structured output from LLM APIs.
    - IMPLICIT FEEDBACK collection is easy. Just let users edit stuff.
  - Tactical
    - Try n-shot prompting (n=5-12) before bigger models.
    - Always structure for output: Markdown, XML/HTML tags.
    - Combine RAG with Keyword search. It reduces user frustration in edge cases.
    - Prefer multiple small prompts to one big prompt. Do X. Then Y. Then Z.
    - Jitter prompts for diversity beyond temperature.
    - LLM-as-judge works better when comparing outputs (not rating 1 output). Keep length similar (LLMs prefer wordiness). Swap order and compare. Allow for ties. Ask for reason FIRST.
- Hermes: A Text-to-SQL solution at Swiggy
  - "Hermes performed significantly better for charters with well-defined metadata and a relatively smaller number of tables."
  - "We collect feedback on the accuracy of the returned query from stakeholders directly within the Slack bot."
- How I use AI and "Replacing my right hand with AI"
  - EMBED in every app/workflow. E.g. Auto-fix spellings. Auto-review code. Auto-ask LLM on errors and apply patch! Auto-search for answer, assess, continue.
  - PERSIST. Stick with the LLM to the end. Don't fix it yourself. It's faster.
    - INTERVENE FAST. If an LLM can't solve it by itself in 2 tries, it needs in-depth help.
  - APP-IFY one-off tasks. Disposable tools. "Write web-app to convert JSON to tab-delimited." "Extract fields as a table." "Diff JSON."
  - BEST language/frameworks preferred. CUDA in Python. Rust. C. Raspberry Pi. Arduino. Bluetooth. Modern ESM/JS.
  - TEACH examples. "Here's the LLM Foundry API." "Here's how to use gramex.data."
  - DUMP entire code. Models can handle it. Refactoring to SQLAlchemy 2, Pandas 2. API Documentation. Test case generation.
  - ASK for features & packages. Docker without root access. GPU access inside docker. Windows CLI-only C++ compiler.
  - TEST CASE writing.
  - SPEC IN DETAIL. Use these libraries. Write like this: code example.
  - SPEC USAGE in detail.
    - "I will just pipe it into sqlite", or "I will just run ffmpeg -i filename [YOUR OPTIONS].
    - Describe the UI, API input/output, data structure, and internal data structure.
  - HELP on usage. "ffmpeg to get audio.mp3".
- My benchmark for large language models
  - LLM(text) is a useful function to have in JS and Python too. Useful as a simple pip install llmfoundry
  - Allow images, files in LLM()
- Current list of (or hard) things for LLMs
  - Translate technical documents to Dutch -- because they don't understand the technical terms well
  - Translate large documents (JSON to XML, English to Chinese, Python to Rust, Wrong to right spelling) -- because the output tokens are limited

Sun, Jul 14, 2024. LLMs cannot provide a bounding box of objects in images. (Maybe Florence 2 can). Update: Mar 2025. Gemini has good timestamps and bounding boxes #llm-ops #impossible