giuliohome

Problem

When Swagger UI accesses the /docs/ endpoint, it calls with_base_path(), which triggers clone(). The current implementation clones copy.deepcopy(self._spec) (the processed spec) instead of the original raw specification, causing validation failures that don't occur during app startup.

Root Cause

The clone() method at line 207 creates a new instance from self._spec (already processed and resolved) rather than self._raw_spec (original). The deepcopy operation can corrupt internal references and circular dependencies in the processed spec.

Fix

# Before
def clone(self):
    return type(self)(copy.deepcopy(self._spec))

# After  
def clone(self):
    return type(self)(copy.deepcopy(self._raw_spec))

Fixes #2078

@giuliohome
Author

Why Validating Processed Specs is Conceptually Wrong

Technical Analysis by Claude AI

The Core Issue

The clone() method passes a processed specification (_spec) to the initializer, which re-validates it, instead of passing the original specification (_raw_spec). This is fundamentally flawed because:

What happens during resolve_refs()

When Connexion processes a raw OpenAPI spec, it:

  1. Resolves all $ref references
  2. Inlines referenced schemas directly into the spec
  3. Changes the structural context of schema elements
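The three steps above can be sketched with a toy resolver. This is not Connexion's actual resolve_refs(); the function below is a hypothetical stand-in that handles only internal `#/...` references, does not guard against circular references, and assumes that keys sitting next to a `$ref` are carried over into the inlined schema, as the example in this comment describes.

```python
import copy


def resolve_refs(spec):
    """Return a copy of `spec` with every internal '#/...' $ref inlined (toy version)."""

    def lookup(pointer):
        # Follow a path such as '#/components/schemas/SomeSchema' from the spec root.
        node = spec
        for part in pointer.lstrip("#/").split("/"):
            node = node[part]
        return node

    def walk(node):
        if isinstance(node, dict):
            if "$ref" in node:
                # Inline the target, then carry over the sibling keys (assumption).
                resolved = walk(copy.deepcopy(lookup(node["$ref"])))
                siblings = {k: walk(v) for k, v in node.items() if k != "$ref"}
                return {**resolved, **siblings}
            return {k: walk(v) for k, v in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(spec)


raw = {
    "components": {"schemas": {"SomeSchema": {"type": "object"}}},
    "schema": {
        "additionalProperties": {
            "$ref": "#/components/schemas/SomeSchema",
            "propertyNames": {"pattern": "^[a-z]+$"},
        }
    },
}
processed = resolve_refs(raw)
```

After the call, `processed["schema"]["additionalProperties"]` no longer contains `$ref`; it holds the inlined `type` next to the former sibling `propertyNames`, while `raw` is left untouched.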

Why this breaks validation

Example: Valid Raw Spec

# This passes OpenAPI 3.0 validation
additionalProperties:
  $ref: '#/components/schemas/SomeSchema'
  propertyNames:           # Ignored - not part of $ref validation
    pattern: '^[a-z]+$'

After resolve_refs() Processing

# This FAILS OpenAPI 3.0 validation  
additionalProperties:
  type: object              # Inlined from resolved $ref
  properties: {...}         # Inlined schema content
  propertyNames:            # Now INVALID - OpenAPI 3.0 doesn't support this
    pattern: '^[a-z]+$'

The Conceptual Problem

Raw spec validation checks: "Is this a valid OpenAPI document?"
Processed spec validation checks: "Is this internal representation valid?"

These are fundamentally different questions with different answers.

Real-World Impact

  • Raw spec: Contains $ref + extra fields → Valid (extra fields ignored)
  • Processed spec: Contains inlined schemas + same extra fields → Invalid (extra fields now in wrong context)

The same logical specification becomes invalid purely due to internal processing artifacts.
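To make the raw-versus-processed difference concrete, here is a minimal sketch of a checker that behaves the way this comment describes. It is not the validator Connexion uses: `ALLOWED` is a deliberately small, hypothetical keyword set, and `find_unsupported` is an illustrative name. The only behaviour it models is that a reference object's siblings are skipped, while the same keys are checked once the schema has been inlined.

```python
# Hypothetical subset of keywords accepted by the toy checker.
# propertyNames is deliberately absent, mirroring the OpenAPI 3.0 example above.
ALLOWED = {"type", "properties", "additionalProperties", "items", "pattern", "required"}


def find_unsupported(schema, path="schema"):
    """Return the paths of keywords the toy checker rejects."""
    if not isinstance(schema, dict):
        return []
    if "$ref" in schema:
        # Reference object: sibling keys are ignored, so nothing is checked here.
        return []
    errors = []
    for key, value in schema.items():
        if key not in ALLOWED:
            errors.append(f"{path}.{key}")
        elif key == "properties":
            for name, sub in value.items():
                errors += find_unsupported(sub, f"{path}.properties.{name}")
        elif key in ("additionalProperties", "items"):
            errors += find_unsupported(value, f"{path}.{key}")
    return errors


raw_schema = {
    "type": "object",
    "additionalProperties": {
        "$ref": "#/components/schemas/SomeSchema",
        "propertyNames": {"pattern": "^[a-z]+$"},
    },
}
processed_schema = {
    "type": "object",
    "additionalProperties": {
        "type": "object",
        "properties": {},
        "propertyNames": {"pattern": "^[a-z]+$"},
    },
}

raw_errors = find_unsupported(raw_schema)
processed_errors = find_unsupported(processed_schema)
```

Under these assumptions `raw_errors` is empty, whereas `processed_errors` contains the path to `propertyNames`, which is the asymmetry described above.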

The Fix

def clone(self):
    return type(self)(copy.deepcopy(self._raw_spec))  # Always validate the original

This ensures clone() always produces a validatable OpenAPI specification, not a corrupted internal representation.

Why This Matters

When Swagger UI calls with_base_path(), which in turn calls clone(), it expects to receive a valid OpenAPI spec that can be re-processed and displayed. Giving it a pre-processed spec violates this contract and causes validation failures.


The fundamental principle: Only validate what the user provided, never validate internal processing artifacts.

@chrisinmtown
Contributor

I think this might almost be a duplicate of PR #1889? That PR has been ignored by the maintainers for a long time. Now you have come along and done heroic work in figuring it out all over again! Until this fix or something like it gets merged, V3 remains fundamentally broken. I just don't know how to raise the priority of this in the queue.

@giuliohome
Author

Thank you @chrisinmtown, let me see if I can propose a slight modification to account for external refs

giuliohome and others added 2 commits September 6, 2025 16:39
Fix clone() to use appropriate spec based on ref types

Use _raw_spec for internal refs only, _spec for external refs.
Prevents validation failures from resolved schema artifacts while
maintaining compatibility with relative refs tests.

Fixes spec-first#2078
@giuliohome
Author

New commit pushed with conditional cloning approach

I've updated the implementation to address the concerns about external references from PR #1889. The new approach:

  • For specs with only internal refs (#/components/...): Uses _raw_spec to avoid validation artifacts from resolved schemas
  • For specs with external refs (file.yaml#/...): Uses _spec to preserve resolved external content

This should maintain compatibility with the existing relative_refs tests while fixing the Swagger UI validation issue for specs with internal references only.

The _has_only_internal_refs() helper safely detects reference types by scanning for external file patterns (.yaml, .json) and non-fragment refs.
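The thread shows only the docstring of that helper, not its body, so the following is a sketch of how such a check could look rather than the code in the commit. It is written as a free function (`has_only_internal_refs` is an illustrative name) and uses the simplest rule consistent with the docstring: a reference counts as internal if and only if it starts with `#`.

```python
def has_only_internal_refs(node):
    """Return True if every $ref found anywhere under `node` starts with '#'."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and not ref.startswith("#"):
            # e.g. 'definitions.yaml#/Pet' or 'https://example.com/spec.json'
            return False
        return all(has_only_internal_refs(value) for value in node.values())
    if isinstance(node, list):
        return all(has_only_internal_refs(item) for item in node)
    # Scalars cannot contain references.
    return True


internal_only = {"schema": {"$ref": "#/components/schemas/Pet"}}
with_external = {"oneOf": [{"$ref": "#/a"}, {"$ref": "definitions.yaml#/Pet"}]}

internal_result = has_only_internal_refs(internal_only)
external_result = has_only_internal_refs(with_external)
```

Here `internal_result` is `True` and `external_result` is `False`, because the second spec contains one reference into another file.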

@RobbeSneyders This approach should address your previous concerns about test failures while still solving the core validation inconsistency issue.

@giuliohome
Author

giuliohome commented Sep 6, 2025

To be honest, I see all 802 tests passing (in my local WSL) even with my original one-line fix, but I added the conditional handler to be on the safe side.

Testing Results

During testing, I verified that every spec reaching clone() contains internal references only, which is why all 802 tests pass using _raw_spec.

The conditional handler was added for safety to handle potential external reference cases not covered by the current test suite.

Key Findings:

  • ✅ All 802 tests pass with the original one-line fix
  • ✅ _has_only_internal_refs() returns True for all test cases
  • ✅ No external references are used in the current test suite
  • ✅ The conditional approach provides defensive coding for edge cases

- Adds test_clone_external_refs.py to validate clone() behavior
- Ensures both internal and external reference scenarios are covered
- Increases test coverage from 802 to 803 tests
@druizz90
Contributor

Good job @giuliohome.

Why isn't this PR merged?

@mathi123

@Ruwann @RobbeSneyders Ping! Would you mind merging this?

@chrisinmtown
Contributor

> Why isn't this PR merged?

I am critically dependent on this package and have the same question. FWIW I have personally never encountered a situation like this in the open-source world, with a popular (as far as I can tell) package and an active community, but disengaged maintainers. It's deeply concerning.

@coveralls

Coverage Status

coverage: 94.353% (-0.001%) from 94.354%
when pulling 84dd1d5 on giuliohome:patch-1
into a1c53db on spec-first:main.

@RobbeSneyders
Member

Thanks @giuliohome

I don't think the conditional handler is the way to go though. I assume the original issue addressed by #2002 was that the cloned spec was initialized without access to the base_uri of the original.

I believe the following should work:

  • Store the base_uri in init: self._base_uri = base_uri
  • Update clone to return type(self)(copy.deepcopy(self._raw_spec), base_uri=self._base_uri)

Could you try this and validate if your added test still passes?
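A minimal sketch of why the stored base_uri matters, under stated assumptions: `ToySpec` and the in-memory `FILES` mapping are hypothetical stand-ins (not Connexion's Specification class or its resolver), and only references of the form `file#/Name` are handled. The point is that a clone built from the raw spec can re-resolve relative references only if it receives the same base_uri as the original.

```python
import copy

# Hypothetical stand-in for files on disk, keyed by URI.
FILES = {"specs/definitions.yaml": {"Pet": {"type": "object"}}}


class ToySpec:
    def __init__(self, raw_spec, *, base_uri=""):
        self._base_uri = base_uri  # kept so clone() can pass it along
        self._raw_spec = copy.deepcopy(raw_spec)
        self._spec = self._resolve(raw_spec)

    def _resolve(self, node):
        if isinstance(node, dict):
            if "$ref" in node:
                # 'definitions.yaml#/Pet' -> file 'definitions.yaml', name 'Pet'
                file_part, _, name = node["$ref"].partition("#/")
                return copy.deepcopy(FILES[self._base_uri + file_part][name])
            return {k: self._resolve(v) for k, v in node.items()}
        if isinstance(node, list):
            return [self._resolve(item) for item in node]
        return node

    def clone(self):
        # Clone the raw spec and let the initializer resolve it again.
        return type(self)(copy.deepcopy(self._raw_spec), base_uri=self._base_uri)


raw = {"schema": {"$ref": "definitions.yaml#/Pet"}}
spec = ToySpec(raw, base_uri="specs/")
clone = spec.clone()
```

In this toy setup `clone._spec` equals `spec._spec`, whereas constructing `ToySpec(raw)` without a base_uri raises `KeyError` because the relative file cannot be located.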

@chrisinmtown
Contributor

chrisinmtown commented Oct 6, 2025

Hi @giuliohome, I hope you don't mind me trying to follow the new suggestion and gather evidence about your PR. I made the change shown below, then ran the tests. With this change in place, the py3.9 through py3.12 tests all pass in my local environment, including your new test case for external references. I think that's good news.

What about the py3.8 tests? When I try `brew install python@3.8` I get `Error: python@3.8 has been disabled because it is deprecated upstream! It was disabled on 2024-10-14.` and that's keeping me from giving the suggested code a 100% pass result from here.

diff --git a/connexion/spec.py b/connexion/spec.py
index 2d95596..4206f02 100644
--- a/connexion/spec.py
+++ b/connexion/spec.py
@@ -77,6 +77,7 @@ class Specification(Mapping):
     operation_cls: t.Type[AbstractOperation]
 
     def __init__(self, raw_spec, *, base_uri=""):
+        self._base_uri = base_uri  # stash a reference
         self._raw_spec = copy.deepcopy(raw_spec)
         self._set_defaults(raw_spec)
         self._validate_spec(raw_spec)
@@ -204,12 +205,8 @@ class Specification(Mapping):
         return OpenAPISpecification(spec, base_uri=base_uri)
 
     def clone(self):
-        # Check if spec contains only internal refs (starting with #)
-        # For external refs, we need the processed spec to maintain resolved content
-        if self._has_only_internal_refs():
-            return type(self)(copy.deepcopy(self._raw_spec))
-        else:
-            return type(self)(copy.deepcopy(self._spec))
+        # use the base_uri stashed by __init__
+        return type(self)(copy.deepcopy(self._raw_spec), base_uri=self._base_uri)
 
     def _has_only_internal_refs(self):
         """Check if all $ref entries point to internal references only (starting with #)"""

@giuliohome
Author

Thanks for getting back to this. I’ve since moved on to another solution, so please feel free to continue on your end or close this PR.

@chrisinmtown
Contributor

@giuliohome please say, do you have a better solution to share with the community here?

@giuliohome
Author

Ah, I just meant we moved to another library for our use case. The proposed code should still work fine for this specific PR, but more broadly we needed something with a bit more active maintenance.

@chrisinmtown
Contributor

> Ah, I just meant we moved to another library for our use case.

Is there a direct competitor to connexion? Please share the name, I have similar concerns about the lack of active maintenance.

@RobbeSneyders
Member

Closing, replaced by #2089

RobbeSneyders added a commit that referenced this pull request Oct 13, 2025
This PR fixes an issue introduced in #2002, as well as the original issue
that #2002 was trying to address.

The original issue was that a cloned spec did not have properly resolved
references. #2002 fixed this incorrectly by cloning the resolved spec,
while the `Spec` initializer expects a raw spec.

This PR fixes this by cloning the raw spec, and passing the `base_uri`
required to resolve it along to the initializer of the new `Spec`
instance.

The swagger ui was also updated to use the resolved spec instead of the
raw spec.

Supersedes:
#1889
#2080

Fixes:
#1890
#1909
#2028 
#2029

Development

Successfully merging this pull request may close these issues.

Swagger UI endpoint crashes at runtime with JSON Schema keywords like propertyNames or patternProperties

6 participants