Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refact(validators) #169

Merged
merged 2 commits into from
Jan 27, 2024
Merged

refact(validators) #169

merged 2 commits into from
Jan 27, 2024

Conversation

fredbi
Copy link
Member

@fredbi fredbi commented Jan 20, 2024

Readability improvements:

  • Introduced private constructors for all validators
  • All private constructors can use a *SchemaValidatorOptions (generalize options passing pattern)
  • All exported methods use private constructors (hide the implementation of options)
  • Readability: reduced the level of indentation with early exit conditions than nested conditions
  • Object validator: refactored Validate in smaller functions
  • Removed dead code/commented out code

Minor efficiency improvements

  • In Applies method: reflect replaced by type switch whenever possible

  • Removed (mostly unusable) debugLog

  • In object validator:

    • pattern properties now use the regexp cache readily available
    • split path may be performed only once
  • In schema props: keep arrays of pointers []*SchemaValidator and avoid copying a new schema

  • In default & example validators:

    • map[string]bool -> map[string]struct{}
    • clear, don't reallocate the visited map
  • []validator slice -> [n]validator array

  • introduced a benchmark follow-up document, based on the existing Kubernetes API fixture

@fredbi fredbi force-pushed the refact/validators branch from 68d73d6 to 8a4f459 Compare January 20, 2024 19:23
Copy link

codecov bot commented Jan 20, 2024

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (ff196e5) 92.40% compared to head (486e1d2) 92.88%.

Files Patch % Lines
spec.go 76.66% 5 Missing and 2 partials ⚠️
default_validator.go 93.18% 2 Missing and 1 partial ⚠️
example_validator.go 95.65% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #169      +/-   ##
==========================================
+ Coverage   92.40%   92.88%   +0.48%     
==========================================
  Files          22       22              
  Lines        2987     3277     +290     
==========================================
+ Hits         2760     3044     +284     
- Misses        182      190       +8     
+ Partials       45       43       -2     
Flag Coverage Δ
oldstable 92.88% <98.65%> (+0.55%) ⬆️
stable 92.88% <98.65%> (+0.48%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fredbi fredbi force-pushed the refact/validators branch 2 times, most recently from f68d73b to 962cb8c Compare January 22, 2024 15:25
@fredbi fredbi requested review from casualjim and youyuanwu January 22, 2024 15:34
fredbi added a commit to fredbi/validate that referenced this pull request Jan 23, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
fredbi added a commit to fredbi/validate that referenced this pull request Jan 23, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>

fixed bug: attempted to reuse recycled validator

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
fredbi added a commit to fredbi/validate that referenced this pull request Jan 25, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>

fixed bug: attempted to reuse recycled validator

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Readability improvements:
* Introduced private constructors for all validators
* All private constructors can use a *SchemaValidatorOptions (generalize
  options passing pattern)
* All exported methods use private constructors (hide the implementation
  of options)
* Readability: reduced the level of indentation with early exit
  conditions than nested conditions
* Object validator: refactored Validate in smaller functions
* Removed dead code/commented out code

Minor efficiency improvements

* In Applies method: reflect replaced by type switch whenever possible
* Removed (mostly unusable) debugLog
* In object validator:
  * pattern properties now use the regexp cache readily available
  * split path may be performed only once
* In schema props: keep arrays of pointers []*SchemaValidator and avoid
  copying a new schema
* In default & example validators:
  * map[string]bool -> map[string]struct{}
  * clear, don't reallocate the visited map
* []validator slice -> [n]validator array

* introduced a benchmark follow-up document, based on the existing Kubernetes API
  fixture

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
@fredbi fredbi force-pushed the refact/validators branch from 08dc4ea to 1c0bbce Compare January 25, 2024 11:39
fredbi added a commit to fredbi/validate that referenced this pull request Jan 25, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
fredbi added a commit to fredbi/validate that referenced this pull request Jan 25, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
object_validator_test.go Outdated Show resolved Hide resolved
Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
fredbi added a commit to fredbi/validate that referenced this pull request Jan 27, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
@fredbi fredbi merged commit 7c3e606 into go-openapi:master Jan 27, 2024
10 checks passed
@fredbi fredbi deleted the refact/validators branch January 27, 2024 16:45
fredbi added a commit to fredbi/validate that referenced this pull request Jan 27, 2024
This PR is a follow-up on go-openapi#169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (go-openapi#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
fredbi added a commit that referenced this pull request Feb 1, 2024
This PR is a follow-up on #169, which cames with innocuous changes, but already
reduced the total number of allocations when validating a swagger spec.

We reduce further the pressure on go's GC by pooling temporarily
allocated objects: validators and results.

When recursively allocating validators, calls to "Validate()" redeem to the pool the allocated
validator. This means that a validator is usable only once.

A similar approach is taken for `Result`s: upon merge, the merged result
is redeemed to the pool.

Eventually, temporarily allocated `spec.Schema`s are also pooled.

In order not to break existing interfaces, this is enabled by default
only when validating a spec (`Spec()`) or when using
`AgainstSchema()`.

Another option at the spec level allows for skipping the gathering of schemas
into the result ("schemata"): these schemas must be cloned to remain
valid after their parent is redeemed to the pool.

Comment on the benchmark:

CPU profiling shows that about 30% of the CPU time is spent managing
garbage.

Memory profiling shows that our benchmark (validating the k8s API spec) goes
down from 25M allocated objects down to ~ 17M (#169 went down from 60M to 25M).

The `Validate` method generates only ~ 5M of those, out of which
~ 4M are caused by the schema unmarshaling & expansion and 1M by string
allocations for the validator "path".
Unfortunately, classic string interning methods don't work well with this dynamic "path",
so I leave it there.

Integration tests (go-swagger/go-swagger#3064)
show a significant improvement when running go-swagger validate in codegen.

* contributes: go-swagger/go-swagger#2649

Further improvements would most likely come from optimizing the
validation of defaults and examples (which amount for 2M allocs).

Signed-off-by: Frederic BIDON <fredbi@yahoo.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants