Pluggable Regexpr Engine #315

Open: wants to merge 3 commits into base `master`
71 changes: 64 additions & 7 deletions README.md
@@ -191,7 +191,7 @@ It's also possible to pass a `ReferenceLoader` to the `Compile` function that re
```go
err = sl.AddSchemas(loader3)
schema, err := sl.Compile(gojsonschema.NewReferenceLoader("http://some_host.com/main.json"))
```

Schemas added by `AddSchema` and `AddSchemas` are only validated when the entire schema is compiled, unless meta-schema validation is used.

@@ -211,7 +211,7 @@ If autodetection is on (default), a draft-07 schema can safely reference draft-0
## Meta-schema validation
Schemas that are added using `AddSchema`, `AddSchemas` and `Compile` can be validated against their meta-schema by setting the `Validate` property.

The following example will produce an error as `multipleOf` must be a number. If `Validate` is off (default), this error is only returned at the `Compile` step.

```go
sl := gojsonschema.NewSchemaLoader()
@@ -222,8 +222,6 @@ err := sl.AddSchemas(gojsonschema.NewStringLoader(`{
"multipleOf" : true
}`))
```

Errors returned by meta-schema validation are more readable and contain more information, which helps significantly if you are developing a schema.

@@ -237,7 +235,7 @@ The library handles string error codes which you can customize by creating your
gojsonschema.Locale = YourCustomLocale{}
```

However, each error contains additional contextual information.

Newer versions of `gojsonschema` may have new additional errors, so code that uses a custom locale will need to be updated when this happens.

@@ -341,7 +339,7 @@ Not all formats defined in draft-07 are available. Implemented formats are:
* `json-pointer`
* `relative-json-pointer`

`email`, `uri` and `uri-reference` use the same validation code as their unicode counterparts `idn-email`, `iri` and `iri-reference`. If you rely on unicode support you should use the specific
unicode enabled formats for the sake of interoperability as other implementations might not support unicode in the regular formats.

The validation code for `uri`, `idn-email` and their relatives uses mostly standard library code.
@@ -452,13 +450,72 @@ func main() {
}

return result, err

}
```

This is especially useful if you want to add validation beyond what the
JSON Schema drafts can provide, such as business-specific logic.

## Custom regular expression implementation
By default this library uses Go's built-in [regexp](https://golang.org/pkg/regexp/) package, which is based on the
[RE2](https://github.com/google/re2/wiki/Syntax) engine and is not [ECMA262](http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf) compatible.
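
As a quick illustration of the incompatibility, the following minimal sketch uses only the standard library to check whether a pattern is accepted by Go's `regexp` package. The helper name `compilesWithRE2` is made up for this example and is not part of `gojsonschema`.

```go
package main

import (
	"fmt"
	"regexp"
)

// compilesWithRE2 (hypothetical helper) reports whether Go's standard
// regexp package accepts expr.
func compilesWithRE2(expr string) bool {
	_, err := regexp.Compile(expr)
	return err == nil
}

func main() {
	// A plain anchored pattern is valid in both RE2 and ECMA 262.
	fmt.Println(compilesWithRE2(`^a*$`))
	// "(?=foo)" is a lookahead assertion: valid in ECMA 262, but RE2
	// does not support lookaround, so compilation fails.
	fmt.Println(compilesWithRE2(`(?=foo)bar`))
}
```

A schema whose `pattern` relies on lookahead would therefore be rejected at compile time with the default engine.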

The regular expression library can be changed by implementing the [RegexpProvider](regexpProvider.go) interface with your preferred regular expression implementation and setting `SchemaLoader.RegexpProvider`.

```go
import (
	"github.com/dlclark/regexp2"
	"github.com/xeipuuv/gojsonschema"
)

// regexp2Provider compiles patterns with regexp2 instead of Go's regexp.
type regexp2Provider struct{}

type regexp2CompiledRegexp struct {
	compiled *regexp2.Regexp
}

// MatchString treats a matching error as "no match".
func (c regexp2CompiledRegexp) MatchString(s string) bool {
	matched, err := c.compiled.MatchString(s)
	if err != nil {
		return false
	}
	return matched
}

func (regexp2Provider) Compile(expr string) (gojsonschema.CompiledRegexp, error) {
	// 0 selects regexp2's default options; regexp2.ECMAScript is also available.
	compiled, err := regexp2.Compile(expr, 0)
	if err != nil {
		return nil, err
	}
	return regexp2CompiledRegexp{compiled}, nil
}

sl := gojsonschema.NewSchemaLoader()
sl.RegexpProvider = regexp2Provider{}
loader := gojsonschema.NewStringLoader(`{ "type" : "string", "pattern": "(?=foo)bar" }`)
schema, err := sl.Compile(loader)
```

Note that the `regex` `FormatChecker` will still use `RE2` unless it is replaced.
```go
import (
	"github.com/dlclark/regexp2"
	"github.com/xeipuuv/gojsonschema"
)

type Regex2FormatChecker struct{}

// IsFormat checks if input is a correctly formatted regular expression
func (f Regex2FormatChecker) IsFormat(input interface{}) bool {
	asString, ok := input.(string)
	if !ok {
		return true
	}

	if asString == "" {
		return true
	}
	_, err := regexp2.Compile(asString, 0)
	return err == nil
}

// Replace the Go regexp format checker with regexp2
gojsonschema.FormatCheckers.Add("regex", Regex2FormatChecker{})
```

## Uses

gojsonschema uses the following test suite :
32 changes: 32 additions & 0 deletions regexpProvider.go
@@ -0,0 +1,32 @@
+package gojsonschema
+
+import (
+	"regexp"
+)
+
+var (
+	defaultRegexProvider = golangRegexpProvider{}
+)
+
+// RegexpProvider An interface to a regex implementation
+type RegexpProvider interface {
+	// Compile Compiles an expression and returns a CompiledRegexp
+	Compile(expr string) (CompiledRegexp, error)
+}
+
+// CompiledRegexp A compiled expression
+type CompiledRegexp interface {
+	// MatchString Tests if the string matches the compiled expression
+	MatchString(s string) bool
+}
+
+type golangRegexpProvider struct {
+}
+
+func (golangRegexpProvider) Compile(expr string) (CompiledRegexp, error) {
+	return regexp.Compile(expr)
+}
+
+func getDefaultRegexpProvider() RegexpProvider {
+	return defaultRegexProvider
+}
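
Because the two interfaces above are small, providers can be layered. The following is a minimal sketch, not part of this PR, of a provider that memoizes compiled expressions. The interfaces are redeclared locally so the example compiles on its own; `stdProvider`, `cachingProvider` and `newCachingProvider` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// CompiledRegexp and RegexpProvider mirror the interfaces declared in
// regexpProvider.go so that this sketch is self-contained.
type CompiledRegexp interface {
	MatchString(s string) bool
}

type RegexpProvider interface {
	Compile(expr string) (CompiledRegexp, error)
}

// stdProvider is a hypothetical stand-in for the default provider.
// It returns an untyped nil on failure, so the interface value itself is nil.
type stdProvider struct{}

func (stdProvider) Compile(expr string) (CompiledRegexp, error) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	return re, nil
}

// cachingProvider (hypothetical) wraps another provider and reuses
// previously compiled expressions.
type cachingProvider struct {
	mu     sync.Mutex
	inner  RegexpProvider
	cache  map[string]CompiledRegexp
	misses int // number of expressions actually compiled
}

func newCachingProvider(inner RegexpProvider) *cachingProvider {
	return &cachingProvider{inner: inner, cache: make(map[string]CompiledRegexp)}
}

func (c *cachingProvider) Compile(expr string) (CompiledRegexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.cache[expr]; ok {
		return re, nil
	}
	re, err := c.inner.Compile(expr)
	if err != nil {
		return nil, err
	}
	c.misses++
	c.cache[expr] = re
	return re, nil
}

func main() {
	p := newCachingProvider(stdProvider{})
	for i := 0; i < 3; i++ {
		re, err := p.Compile("f.*o")
		if err != nil {
			panic(err)
		}
		fmt.Println(re.MatchString("foo"))
	}
	fmt.Println("compilations:", p.misses)
}
```

Whether caching is worthwhile depends on how often the same expression appears across schemas; this only shows that the interface leaves room for it.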
81 changes: 81 additions & 0 deletions regexpProvider_test.go
@@ -0,0 +1,81 @@
+// Copyright 2018 johandorland ( https://github.com/johandorland )
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package gojsonschema
+
+import (
+	"regexp"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+)
+
+type customCompiled struct {
+	pattern           *regexp.Regexp
+	expr              string
+	matchStringCalled int
+}
+
+func (cc *customCompiled) MatchString(s string) bool {
+	cc.matchStringCalled++
+	return cc.pattern.MatchString(s)
+}
+
+type customRegexpProvider struct {
+	compileCalled  int
+	compiledRegexp map[string]*customCompiled
+}
+
+func (c *customRegexpProvider) Compile(expr string) (CompiledRegexp, error) {
+	c.compileCalled++
+	pattern, err := regexp.Compile(expr)
+	if err != nil {
+		return nil, err
+	}
+	cc := &customCompiled{
+		pattern: pattern,
+		expr:    expr,
+	}
+	if c.compiledRegexp == nil {
+		c.compiledRegexp = make(map[string]*customCompiled)
+	}
+	c.compiledRegexp[expr] = cc
+	return cc, nil
+}
+
+func TestCustomRegexpProvider(t *testing.T) {
+	// Verify that the RegexpProvider is used
+	loader := NewStringLoader(`{
+		"patternProperties": {
+			"f.*o": {"type": "integer"},
+			"b.*r": {"type": "string", "pattern": "^a*$"}
+		}
+	}`)
+
+	sl := NewSchemaLoader()
+	customRegexpProvider := &customRegexpProvider{}
+	sl.RegexpProvider = customRegexpProvider
+	d, err := sl.Compile(loader)
+	assert.Nil(t, err)
+	assert.NotNil(t, d.regexp)
+
+	loader = NewStringLoader(`{"foo": 1, "foooooo" : 2, "bar": "a", "baaaar": "aaaa"}`)
+	r, err := d.Validate(loader)
+	assert.Nil(t, err)
+	assert.Empty(t, r.errors)
+	assert.Equal(t, 3, customRegexpProvider.compileCalled)
+	assert.Equal(t, 4, customRegexpProvider.compiledRegexp["f.*o"].matchStringCalled)
+	assert.Equal(t, 4, customRegexpProvider.compiledRegexp["b.*r"].matchStringCalled)
+	assert.Equal(t, 2, customRegexpProvider.compiledRegexp["^a*$"].matchStringCalled)
+}
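
The call counts asserted in `TestCustomRegexpProvider` (4, 4 and 2) can be reproduced with a simplified model. The sketch below is not the library's validator, which is not part of this diff; it assumes that every property name is tested against every `patternProperties` expression and that the nested `pattern` is applied only to string values whose name matched `b.*r`. `countingRegexp`, `newCounting` and `countMatchCalls` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// countingRegexp is a hypothetical wrapper that counts MatchString calls,
// in the spirit of customCompiled in the test file.
type countingRegexp struct {
	re    *regexp.Regexp
	calls int
}

func newCounting(expr string) *countingRegexp {
	return &countingRegexp{re: regexp.MustCompile(expr)}
}

func (c *countingRegexp) MatchString(s string) bool {
	c.calls++
	return c.re.MatchString(s)
}

// countMatchCalls models the assumed validation order and returns the
// number of MatchString calls per expression.
func countMatchCalls(doc map[string]interface{}) map[string]int {
	exprs := []string{"f.*o", "b.*r"}
	names := map[string]*countingRegexp{}
	for _, e := range exprs {
		names[e] = newCounting(e)
	}
	// Only the "b.*r" subschema in the test carries a "pattern" keyword.
	value := newCounting("^a*$")

	for key, v := range doc {
		for _, e := range exprs {
			if !names[e].MatchString(key) {
				continue
			}
			if s, ok := v.(string); ok && e == "b.*r" {
				value.MatchString(s)
			}
		}
	}
	return map[string]int{
		"f.*o": names["f.*o"].calls,
		"b.*r": names["b.*r"].calls,
		"^a*$": value.calls,
	}
}

func main() {
	doc := map[string]interface{}{"foo": 1, "foooooo": 2, "bar": "a", "baaaar": "aaaa"}
	counts := countMatchCalls(doc)
	fmt.Println(counts["f.*o"], counts["b.*r"], counts["^a*$"])
}
```

Under these assumptions each of the four property names is checked against both name patterns, and the value pattern runs once for each of the two string values.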
10 changes: 5 additions & 5 deletions schema.go
@@ -30,7 +30,6 @@ import (
 	"errors"
 	"math/big"
 	"reflect"
-	"regexp"
 	"text/template"

 	"github.com/xeipuuv/gojsonreference"
@@ -56,6 +55,7 @@ type Schema struct {
 	rootSchema    *subSchema
 	pool          *schemaPool
 	referencePool *schemaReferencePool
+	regexp        RegexpProvider
 }

 func (d *Schema) parse(document interface{}, draft Draft) error {
@@ -320,9 +320,9 @@ func (d *Schema) parseSchema(documentNode interface{}, currentSchema *subSchema)
 		if isKind(m[KEY_PATTERN_PROPERTIES], reflect.Map) {
 			patternPropertiesMap := m[KEY_PATTERN_PROPERTIES].(map[string]interface{})
 			if len(patternPropertiesMap) > 0 {
-				currentSchema.patternProperties = make(map[string]*subSchema)
+				currentSchema.patternProperties = make(map[string]*patternProperties)
 				for k, v := range patternPropertiesMap {
-					_, err := regexp.MatchString(k, "")
+					pattern, err := d.regexp.Compile(k)
 					if err != nil {
 						return errors.New(formatErrorDescription(
 							Locale.RegexPattern(),
@@ -334,7 +334,7 @@ func (d *Schema) parseSchema(documentNode interface{}, currentSchema *subSchema)
 					if err != nil {
 						return errors.New(err.Error())
 					}
-					currentSchema.patternProperties[k] = newSchema
+					currentSchema.patternProperties[k] = &patternProperties{schema: newSchema, pattern: pattern}
 				}
 			}
 		} else {
@@ -652,7 +652,7 @@ func (d *Schema) parseSchema(documentNode interface{}, currentSchema *subSchema)

 	if existsMapKey(m, KEY_PATTERN) {
 		if isKind(m[KEY_PATTERN], reflect.String) {
-			regexpObject, err := regexp.Compile(m[KEY_PATTERN].(string))
+			regexpObject, err := d.regexp.Compile(m[KEY_PATTERN].(string))
 			if err != nil {
 				return errors.New(formatErrorDescription(
 					Locale.MustBeValidRegex(),
12 changes: 8 additions & 4 deletions schemaLoader.go
@@ -23,10 +23,11 @@ import (

 // SchemaLoader is used to load schemas
 type SchemaLoader struct {
-	pool       *schemaPool
-	AutoDetect bool
-	Validate   bool
-	Draft      Draft
+	pool           *schemaPool
+	AutoDetect     bool
+	Validate       bool
+	Draft          Draft
+	RegexpProvider RegexpProvider
 }

 // NewSchemaLoader creates a new NewSchemaLoader
@@ -153,6 +154,9 @@ func (sl *SchemaLoader) Compile(rootSchema JSONLoader) (*Schema, error) {
 	}

 	d := Schema{}
+	if d.regexp = sl.RegexpProvider; d.regexp == nil {
+		d.regexp = getDefaultRegexpProvider()
+	}
 	d.pool = sl.pool
 	d.pool.jsonLoaderFactory = rootSchema.LoaderFactory()
 	d.documentReference = ref
22 changes: 21 additions & 1 deletion schemaLoader_test.go
@@ -15,9 +15,10 @@
 package gojsonschema

 import (
-	"github.com/stretchr/testify/require"
 	"testing"

+	"github.com/stretchr/testify/require"
+
 	"github.com/stretchr/testify/assert"
 )

@@ -174,3 +175,22 @@ func TestParseSchemaURL_NotMap(t *testing.T) {
 	require.Error(t, err)
 	assert.EqualError(t, err, "schema is invalid")
 }
+
+func TestDefaultRegexpProvider(t *testing.T) {
+	// Verify that when no RegexpProvider is set, the default RegexpProvider is used
+	loader := NewStringLoader(`{
+		"patternProperties": {
+			"f.*o": {"type": "integer"},
+			"b.*r": {"type": "string", "pattern": "^a*$"}
+		}
+	}`)
+
+	d, err := NewSchema(loader)
+	assert.Nil(t, err)
+	assert.NotNil(t, d.regexp)
+
+	loader = NewStringLoader(`{"foo": 1, "foooooo" : 2, "bar": "a", "baaaar": "aaaa"}`)
+	r, err := d.Validate(loader)
+	assert.Nil(t, err)
+	assert.Empty(t, r.errors)
+}
13 changes: 9 additions & 4 deletions subSchema.go
@@ -27,9 +27,9 @@
 package gojsonschema

 import (
-	"github.com/xeipuuv/gojsonreference"
 	"math/big"
-	"regexp"
+
+	"github.com/xeipuuv/gojsonreference"
 )

 // Constants
@@ -76,6 +76,11 @@ const (
 	KEY_ELSE = "else"
 )

+type patternProperties struct {
+	schema  *subSchema
+	pattern CompiledRegexp
+}
+
 type subSchema struct {
 	draft *Draft

@@ -113,7 +118,7 @@ type subSchema struct {
 	// validation : string
 	minLength *int
 	maxLength *int
-	pattern   *regexp.Regexp
+	pattern   CompiledRegexp
 	format    string

 	// validation : object
@@ -123,7 +128,7 @@ type subSchema struct {

 	dependencies         map[string]interface{}
 	additionalProperties interface{}
-	patternProperties    map[string]*subSchema
+	patternProperties    map[string]*patternProperties
 	propertyNames        *subSchema

 	// validation : array