Richard’s Blog

Observations on what's around me and projects I'm working on.

Posts tagged with llms

Freelancers and LLMs: Expertise, judgement and trust

Reading: The more commodified your job, the more likely AI can do it – lessons from online freelancing, The Conversation, 9 April 2026. Upwork is one of the online freelancer marketplaces. They’ve reported that ChatGPT has resulted in a decrease in low-value work, and an increase in demand for high-value work (contracts...

17 Apr 2026 Read more…

Good luck, Rosie

There's a glorious story of the first computationally designed personal mRNA cancer vaccine for a dog. A civilian with no cancer or biology training—but with AI experience—used LLMs to plan and help design a cancer vaccine for his dog. Come on! This is wonderful. It's doubly wonderful because of his persistence and the...

24 Mar 2026 Read more…

Human/Health-AI communication loop

Reading: Are AI Tools Ready to Answer Patients’ Questions About Their Medical Care?, JAMA, 6 March 2026. OpenAI says that ChatGPT Health is built “to guide patients to health care professionals for diagnosis and treatment.” Unfortunately it currently under-triages emergency conditions and over-triages non-emergency...

16 Mar 2026 Read more…

Synthetic market research

From the first appearance of LLMs, the psychology and behaviour community has asked: "Can I use this tech to replace human subjects?" It's now taken off in market research. There are lots of companies doing this, and I suspect that’s partly because it's not that hard in principle. The intuition is that LLMs, by...

06 Mar 2026 Read more…

LLM medical device regulation 🦆

Reading: If a therapy bot walks like a duck and talks like a duck then it is a medically regulated duck, npj Digital Medicine, 5 December 2025. Medical device regulation depends on the claims and intended purpose of the device—amongst other things. In the case of Claude Sonnet, the system prompt suggests it does have an...

29 Jan 2026 Read more…

AI in 2026

Of all the hot takes on what will happen to AI in 2026, I'm drawn to this one from Philip Ball: It’s unlikely that 2026 will be a make-or-break time for AI – many aspects of it are here to stay – but there could be turbulence for the industry, particularly if the investment bubble bursts, as many anticipate....

05 Jan 2026 Read more…

Assisted thinking with provocations

Watching: "How to stop AI from killing your critical thinking" from Advait Sarkar of Microsoft. There's a nice demo around 7 minutes in. The set-up is someone who needs to understand some written materials and write a report. Instead of throwing the task at an LLM, the idea is that an LLM should challenge you as you...

12 Dec 2025 Read more…

Paying for content for AI

The point is a good one: If you want to get chips from NVIDIA, Jensen [the CEO] makes you pay for them [...] If you want to get the best researchers, they get salaries. We're not going to say "oh, AI is so important that we're going to bring back slavery". That doesn't make any sense. So I don't understand why if the...

04 Nov 2025 Read more…

Making LLMs safer for health care support

This is a hard, perhaps impossible, problem to fix as it stands. Limbic are having a go at it. Reading: The Limbic Layer: Transforming Large Language Models (LLMs) into Clinical Mental Health Experts, PsyArXiv pre-print, 26 August 2025. Limbic have a Class IIa medical device for psychological assessment, and are also...

13 Oct 2025 Read more…

The 75th anniversary of the Turing Test

On Thursday I tuned into the Royal Society's live YouTube broadcast celebrating 75 years since the publication of what we now call the Turing Test. Out of those five hours, I'd highlight Professor Sarah Dillon on a panel discussion (58 minutes in). She pointed out that Computing Machinery and Intelligence is "really...

05 Oct 2025 Read more…

Medical LLM pattern matching is brittle

LLMs do well on medical benchmarks, but this nice experiment blows a hole in that. Reading: Fidelity of Medical Reasoning in Large Language Models, JAMA Network, 8 August 2025. The set-up: the experiment presents an LLM with a medical scenario and prompts it to pick the correct answer from a set of 5 options. LLMs do...

21 Aug 2025 Read more…

LLMs are almost as good as bespoke clinical diagnosis systems

Reading: Dedicated AI Expert System vs Generative AI With Large Language Model for Clinical Diagnoses, JAMA Network Open, 29 May 2025. I’d assume an expert system for clinical diagnosis, developed over 40 years, would perform better than an off-the-shelf consumer LLM, like ChatGPT. It does! But only just, and the...

07 Jun 2025 Read more…

GitHub Copilot PR review is useful

On GitHub pull requests, there’s the option to ask their LLM for a code review. It’s available to GitHub Pro users (US$48/year at the time of writing) and in organizations ($19/user/month). It has limited language support today. I’ve been using it against TypeScript and HTML/React, and I’ve found it useful. It has made...

16 May 2025 Read more…

Local LLM crib sheet

My reminder of my local LLM set-up as of May 2025. Ollama: models are stored in ~/.ollama/models, but it’s easiest to manage with ollama ls and then ollama rm. Example: ollama run gemma3 or ollama run olmo2. Integration with other projects is the main reason I have Ollama installed. Smart cat: sc is an LLM shell command...

05 May 2025 Read more…

What does a “1.3% hallucination rate” mean?

Reading: AI hallucinations can’t be stopped — but these techniques can limit their damage, Nature, 21 January 2025. The part of the report that caught my eye was a chart showing the best models having a “hallucination rate of 1.3%”: What does hallucination rate mean? The conclusion I’ve come to—and I’ll show my...

18 Feb 2025 Read more…