Apply web search citation stripping for GPT-5.x models in OpenAI conversation #170956

Open
frenck wants to merge 1 commit into dev from frenck-2026-0554

@frenck (Member) commented May 17, 2026

Proposed change

When using GPT-5.x models with web_search enabled and inline_citations disabled, citations like ([legaseriea.it](https://...)) were still included in responses.

The regex-based citation stripping was guarded by "reasoning" not in model_args, which excluded all reasoning models. However, only o-series models (o1, o3, etc.) natively respect the prompt instruction to omit citations. GPT-5.x models do not, so they need the regex fallback.

This changes the guard to not model_args["model"].startswith("o") so that citation stripping is applied to all non-o-series models, including GPT-5.x.
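
As a rough illustration of the guard change described above, the sketch below contrasts the old and new conditions. Only the two guard expressions come from the description. The regex, the helper names (old_guard, new_guard, strip_citations), the shape of the "reasoning" value, and the sample text are assumptions made for this example and are not the integration's actual code.

```python
import re

# Assumed pattern (not taken from the integration): optional leading whitespace,
# then a parenthesized Markdown link such as "([site](https://site/page))".
CITATION_RE = re.compile(r"\s*\(\[[^\]]+\]\([^)]+\)\)")


def old_guard(model_args: dict) -> bool:
    # Previous condition: skip stripping for every reasoning model.
    return "reasoning" not in model_args


def new_guard(model_args: dict) -> bool:
    # New condition: skip stripping only for o-series models (o1, o3, ...).
    return not model_args["model"].startswith("o")


def strip_citations(text: str, model_args: dict, guard=new_guard) -> str:
    # Apply the regex fallback only when the guard says the model needs it.
    return CITATION_RE.sub("", text) if guard(model_args) else text


# Hypothetical model arguments and sample response text.
gpt5_args = {"model": "gpt-5", "reasoning": {"effort": "low"}}
o3_args = {"model": "o3", "reasoning": {"effort": "low"}}
text = "Inter won the match ([legaseriea.it](https://legaseriea.it/match))."

print(strip_citations(text, gpt5_args, guard=old_guard))  # citation kept
print(strip_citations(text, gpt5_args))                   # citation removed
print(strip_citations(text, o3_args))                     # left untouched
```

Under the old guard, the GPT-5 arguments contain a "reasoning" key, so the fallback is skipped and the citation survives. Under the new guard, only model names starting with "o" bypass the regex.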

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • I understand the code I am submitting and can explain how it works.
  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.
  • Any generated code has been carefully reviewed for correctness and compliance with project standards.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies a diff between library versions and ideally a link to the changelog/release notes is added to the PR description.

To help with the load of incoming pull requests:

Copilot AI review requested due to automatic review settings May 17, 2026 09:20
@frenck frenck requested a review from Shulyaka as a code owner May 17, 2026 09:20
@home-assistant home-assistant Bot added the bugfix, cla-signed, has-tests, integration: openai_conversation, small-pr (PRs with less than 30 lines), and Top 200 (Integration is ranked within the top 200 by usage) labels May 17, 2026
@home-assistant (Contributor)

Hey there @Shulyaka, mind taking a look at this pull request as it has been labeled with an integration (openai_conversation) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of openai_conversation can trigger bot actions by commenting:

  • @home-assistant close: Closes the pull request.
  • @home-assistant mark-draft: Mark the pull request as draft.
  • @home-assistant ready-for-review: Remove the draft status from the pull request.
  • @home-assistant rename Awesome new title: Renames the pull request.
  • @home-assistant reopen: Reopen the pull request.
  • @home-assistant unassign openai_conversation: Removes the current integration label and assignees on the pull request; add the integration domain after the command.
  • @home-assistant update-branch: Update the pull request branch with the base branch.
  • @home-assistant add-label needs-more-information: Add a label (needs-more-information, problem in dependency, problem in custom component, problem in config, problem in device, feature-request) to the pull request.
  • @home-assistant remove-label needs-more-information: Remove a label (needs-more-information, problem in dependency, problem in custom component, problem in config, problem in device, feature-request) on the pull request.

@frenck frenck added this to the 2026.5.3 milestone May 17, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Fixes OpenAI Conversation web search responses incorrectly including Markdown inline citations for GPT‑5.x models when inline_citations is disabled, by applying the regex citation-stripping fallback to all non‑o‑series models.

Changes:

  • Change the citation-stripping guard from “not a reasoning model” to “not an o‑series model” so GPT‑5.x gets the regex fallback.
  • Add a regression test ensuring citations are stripped for GPT‑5 models when web_search is enabled and inline_citations is disabled.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| homeassistant/components/openai_conversation/entity.py | Applies citation stripping for web search to all non‑o‑series models (including GPT‑5.x). |
| tests/components/openai_conversation/test_conversation.py | Adds test coverage to validate citation stripping behavior for GPT‑5 models. |

@Shulyaka (Contributor)

Let me check if we can do this with a better prompt

@frenck frenck modified the milestones: 2026.5.3, 2026.5.4 May 19, 2026

Labels

bugfix, cla-signed, has-tests, integration: openai_conversation, Quality Scale: bronze, small-pr (PRs with less than 30 lines), Top 200 (Integration is ranked within the top 200 by usage)

Development

Successfully merging this pull request may close these issues.

OpenAI web search: citation stripping not applied to GPT-5.x models

4 participants