Add Logprobs to OpenAI Structured Output

When working with LLMs, sometimes you want to know whether the response you’re getting is one the model itself is at least somewhat confident about. For example, I recently worked on classifying pull requests into categories like “feature”, “bugfix”, “infrastructure”, etc. with LLMs, and as part of the process we wanted to know how many categories to assign to each PR. We were interested in assigning any number of categories that are relevant to the PR (a PR can be both a “bugfix” and “infrastructure”). It’s hard to get a proper confidence score from an LLM, but logprobs are probably the closest we can get. The problem is that in structured response generation (e.g. when you prompt the model to respond in JSON), you’re only interested in the logprobs of the values, not of everything. In the example generation below, we are only interested in the logprobs of “bugfix”, “testing”, and “infrastructure”, but not “primary_category”, etc.: ...
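As a rough illustration of the idea (not necessarily the approach the full post takes), the sketch below assumes you already have the generated tokens as (token, logprob) pairs in generation order, for example collected from the `token` and `logprob` fields the OpenAI chat completions API returns when `logprobs=True` is set. The function name `value_confidences` and the sample tokens are hypothetical. It rebuilds the JSON text, finds string literals that are values rather than keys, and sums the logprobs of the tokens covering each value.

```python
import math
import re

# Matches a JSON string literal; group 1 is the text between the quotes.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def value_confidences(tokens):
    """Return [(value, probability)] for every JSON string value.

    `tokens` is a list of (token_text, logprob) pairs in generation order.
    Keys such as "primary_category" are skipped. The probability is
    exp(sum of logprobs) over the tokens overlapping the value's characters,
    so it is only an approximation when a token straddles a quote.
    """
    text = "".join(tok for tok, _ in tokens)

    # Character offset at which each token starts in the rebuilt text.
    starts, pos = [], 0
    for tok, _ in tokens:
        starts.append(pos)
        pos += len(tok)

    results = []
    for match in STRING_LITERAL.finditer(text):
        # A string literal followed by ":" is a key, not a value.
        if text[match.end():].lstrip().startswith(":"):
            continue
        lo, hi = match.start(1), match.end(1)
        total = sum(
            lp
            for (tok, lp), start in zip(tokens, starts)
            if start < hi and start + len(tok) > lo
        )
        results.append((match.group(1), math.exp(total)))
    return results


# Hypothetical tokens for: {"primary_category": "bugfix", "other": ["testing"]}
sample_tokens = [
    ('{"', -0.01), ("primary", -0.02), ("_category", -0.01), ('":', 0.0),
    (' "', -0.01), ("bug", -0.2), ("fix", -0.05), ('",', 0.0),
    (' "', -0.01), ("other", -0.03), ('":', 0.0), (' ["', -0.02),
    ("testing", -0.5), ('"]}', 0.0),
]

for value, prob in value_confidences(sample_tokens):
    print(f"{value}: {prob:.3f}")
```

Multiplying token probabilities like this is just one simple way to turn logprobs into a per-value score; averaging the logprobs or taking the minimum are other reasonable choices.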

2025-03-03 · 3 min

Pydantic Logfire for LLM and API Observability

I’ve been using Sentry to automatically log the errors and exceptions of my Python projects. A few months ago I needed to log some information whenever a specific condition was true in my side project’s backend, but I wasn’t able to do this with Sentry. Apparently it only works when something fails, and you can’t capture log messages if there’s no failure or exception. I looked for an affordable and user-friendly observability tool and settled on Axiom. Its free tier has a generous 500 GB ingestion allowance, but you can only view events from the past 30 days. So I’ve been exporting the logs into a CSV file every month, since I want to be able to view the trend of some behaviours over time. ...

2024-12-19 · 2 min

Access Google Gemini LLM via OpenAI Python Library

Google Gemini can now be accessed via the OpenAI Python library:

    from openai import OpenAI

    client = OpenAI(
        api_key="GEMINI_API_KEY",
        base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
    )
    # rest of the code as you would use openai

It supports basic text generation, image input, function calling, structured output, and embeddings. More info and code examples can be found in the Gemini docs.

2024-12-01 · 1 min

Lessons After a Half Billion GPT Tokens

Ken writes about the lessons they’ve learned building new LLM-based features into their product. When it comes to prompts, less is more: “Not enumerating an exact list or instructions in the prompt produces better results, if that thing was already common knowledge. GPT is not dumb, and it actually gets confused if you over-specify.” This has been my experience as well. For a recent project, I started with a very long and detailed prompt, asking the LLM to classify a text and produce a summary. GPT-4, GPT-3.5, Claude-3-Opus, and Claude-3-Haiku all gave average or poor results. I then experimented with shorter prompts, and with some adjustments I was able to get much better responses from a much shorter prompt. ...

2024-05-27 · 2 min