The json
, xml
and html
parsers have the most complex source files after the dynamic
parser.
It is a JSON file, and it can ΝΟΤ run only by providing an url, a lot of configuration is needed.
It is a must to know the structure of the response that is going to be parsed.
The identifier of the articles' container in the response.
The selected node will then be iterated for each child element and execute the
options inside article
.
For example, for this JSON response the container will be ["results"]
.
{
"count": 5,
"results": [
// ...
]
}
If the root of the node must be iterated, then set container
as an empty array.
Not only arrays can be iterated, the same applies for JSON nodes:
{
"key1": {
//...
},
"key2": {
//...
},
// ...
}
It will contain all the configuration needed to fill each article field.
The fields and their options are in a key-value relationship.
{
"title": {/* options */},
"content": {/* options */},
// ...
}
The following standard keys will be assigned as the root fields in the article:
title
, content
, pubDate
, link
, attachments
, categories
and thumbnail
.
All the other fields mentioned here will be moved to the extras
field in a key-value
relationship.
How deep to go in the node step by step.
For example, for this JSON node the find
for the title will be ["data", "title"]
.
{
"data": {
"title": "I am the title"
}
}
There are cases where the needed information is displayed in multiple places.
This option will append the current result found at the end of the parent
option.
In case of multiple options having the same parent, they results will be appended
in the order in which they were defined.
This JSON node:
{
"id": "article1",
"data": {
"title": {
"first": "First part of title",
"second": "Second part of title",
"third": "Third part of title"
}
}
}
will have the following options:
{
// ...
"article": {
"title": {
"find": ["data", "title", "first"],
},
"title-second": {
"find": ["data", "title", "second"],
"parent": "title"
},
"title-third": {
"find": ["data", "title", "third"],
"parent": "title"
}
// ...
}
}
The extra options title-second
and title-third
will also be added to the extras
field of the article.
Assign a static string to the specified field. All the other options are omitted.
{
"link": {
"static": "https://example.com"
}
}
or combined with the parent
option:
{
"title": {
"find": ["data", "title", "first"],
},
"separator": {
"static": " ~ ",
"parent": "title"
},
"title-second": {
"find": ["data", "title", "second"],
"parent": "title"
}
}
When it is needed to skip an element inside container
that matches some criteria.
You can skip a node based on:
- the position of that node inside the response
- that node's text
Position always starts from zero.
{
// ...
"skip": [
{
"position": 3 // skip the fourth element
}
]
}
{
// ...
"skip": [
{
"text": "Text to skip",
"type": "exact" // or "contains"
}
]
}
In cases where we need to check a nested item, we can use the option find
,
to locate the desired key inside the iterated node. For this response
{
"results": [
{
"key": {
"test": "I want to be skipped"
},
"title": "...",
// ....
}
]
}
the skip option will include:
{
"skip": [
{
"find": ["key", "test"],
"text": "I want to be skipped",
"type": "exact" // or "contains"
}
]
}