Uruk is the Clojure wrapper for MarkLogic's XML Content Connector for Java (XCC/J). Uruk empowers you to access your Enterprise NoSQL database from Clojure.
With Uruk, you can use MarkLogic's XCC API to:
- evaluate stored XQuery programs
- dynamically construct and evaluate XQuery programs
- manage documents and stream inserts
The name Uruk comes from the ancient Mesopotamian city-state and period in which some of the oldest known writing has been found. One can see Uruk as perhaps the first document database—and it certainly wasn’t organized relationally.
Uruk is used in production and is under active maintenance. This project is sponsored by LambdaWerk. For commercial support inquiries please get in touch at dave.liepmann@gmail.com.
Uruk is part of the XQuery-mode stack for working with XQuery in Emacs.
To install, add the following dependency to your project.clj dependencies: [uruk "0.3.11"]
In your namespace: (:require [uruk.core :as uruk]). (I also like ur as an alias, for brevity. Delightfully, Ur is another ancient city-state with ties to the origins of written documents.)
To run Uruk locally, you need MarkLogic installed on your machine. To run Uruk's tests or examples, see configuring MarkLogic for Uruk below.
Online API docs are available via Codox and autodoc. Uruk documentation is also available on cljdoc.
For some background, see the XCC Developer's Guide and the MarkLogic XCC Javadoc to understand what Uruk is talking to.
For examples of how to use specific types and functions, see test/uruk/core_test.clj. Examples in this README are included for reference in src/uruk/examples/readme.clj.
To run Uruk's tests or evaluate its examples directly in a REPL, you'll need to configure MarkLogic on your machine to match the settings Uruk expects. If you have an existing MarkLogic install, feel free to skip these steps and instead point your REPL at your own database.
- Install and start a local MarkLogic server via the Install Instructions.
- Create a forest named "UrukForest".
- Create a database named "UrukDB". Attach it to UrukForest but otherwise use the default settings.
- Create an XDBC Server named "UrukServer" on port 8383.
- Create role uruk-tester-role with URI privilege view-uri, execute privileges any-uri, xdmp:external-binary, and xdmp:timestamp, and all the default document permissions (node-update, execute, update, insert, and read) for xa (these are all needed for specific tests).
- Create user uruk-tester with password "password" and roles of xa and uruk-tester-role. This will be used to run tests and README examples.
- Finally, add environment variable URUK_TEST_IMG_PATH (e.g. export URUK_TEST_IMG_PATH=/path/to/uruk/resources/ml-favicon.ico) to your Bash profile (.bashrc) and make sure it's available to your environment.
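One way to confirm that the environment variable from the last step is actually visible to the JVM is to ask for it from a REPL. The following is only a small sketch in plain Clojure; the helper name env-var-set? is hypothetical and not part of Uruk.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper (not part of Uruk): true when the named environment
;; variable is visible to this JVM process and is not blank.
(defn env-var-set?
  [var-name]
  (let [v (System/getenv var-name)]
    (boolean (and v (not (str/blank? v))))))

;; Usage: check the variable the tests rely on.
(env-var-set? "URUK_TEST_IMG_PATH")
```

If this returns false in a freshly started REPL, the shell that launched the REPL probably has not picked up the updated profile yet.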
You should now be able to run lein test and, if you start up a REPL, the examples in test/uruk/core_test.clj.
For ease of replication, the examples below are also in src/uruk/examples/readme.clj.
Basic usage takes the form of:
(with-open [session (uruk/create-session {:uri xdbc-uri :content-base database-name
:user database-user :password database-pwd})]
(uruk/execute-xquery session xquery-string))
...of which a concrete example is:
(with-open [session (uruk/create-session {:uri "xdbc://localhost:8383/"
:user "uruk-tester" :password "password"})]
(uruk/execute-xquery session "\"hello world\""))
...which in this case should return ("hello world") (if you provide valid credentials).
Let's def our database information for brevity in the rest of our examples:
(def db {:uri "xdbc://localhost:8383/"
:user "uruk-tester" :password "password"
:content-base "UrukDB"})
Using that database info, let's take an overview of query functionality. Most use cases are handled by passing an optional configuration map to functions execute-xquery or execute-module, like so:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session
"xquery version \"1.0-ml\"; doc('/bigdoc.xml')"
{:types :raw
:options {:cache-result false}
:variables {:a "a"}
:shape :single}))
Each optional key in that configuration map is described below.
Basic type conversion is performed automatically for most XCC types. If for any reason you need access to the raw results, use the :types key in the config map, passing :raw like so:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session "\"hello world\"" {:types :raw}))
=> #object[com.marklogic.xcc.impl.CachedResultSequence 0x2c034c22 "CachedResultSequence: size=1, closed=false, cursor=-1"]
This lets you inspect result types with result->type:
(with-open [session (uruk/create-session db)]
(uruk/result->type (uruk/execute-xquery session "\"hello world\"" {:types :raw})))
=> "xs:string"
Those result types are matched with :xml-name values in the xcc-types look-up table, which contains the :ml->clj function that Uruk uses to transform result items into more manageable Clojure types. (For most types that's as simple as #(.asString %) (for XdmDocuments) or reading the number contained in a string.) But if you need more in-depth handling of results, you can override the default mappings a la carte by passing a map to the aforementioned :types parameter, like so:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session
"xquery version \"1.0-ml\"; doc('/dir/unwieldy.xml')"
{:types {"document-node()" #(custom-function %)}}))
The keys for this map are used to look up :xml-name, and the values replace :ml->clj.
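To illustrate the idea of per-type overrides, here is a minimal sketch in plain Clojure. It is not Uruk's implementation: default-converters and convert-item are hypothetical names, and the two-entry table merely stands in for a look-up table keyed by XML type name.

```clojure
;; Hypothetical stand-in for a look-up table keyed by XML type name.
;; (Not Uruk's xcc-types; just enough to show how overrides could work.)
(def default-converters
  {"xs:string"  identity
   "xs:integer" #(Long/parseLong ^String %)})

(defn convert-item
  "Converts raw-value using the converter registered for xml-name.
  Entries in overrides take precedence over default-converters;
  unknown type names fall back to returning the value unchanged."
  [overrides xml-name raw-value]
  (let [converters (merge default-converters overrides)
        convert    (get converters xml-name identity)]
    (convert raw-value)))

;; Default conversion:
(convert-item {} "xs:integer" "42")
;; Overriding one type while leaving the others alone:
(convert-item {"xs:integer" #(str % "!")} "xs:integer" "42")
```

The point of the merge is that a caller only names the types they care about; everything else keeps its default conversion.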
For convenience, you can mold query results by specifying :shape in the configuration map:
:shape value | Result
---|---
nil | ignore response, returning nil
:single | return just the first element of the response
:single! | if the response is one element, return just that element; if not (i.e. if the response is more than one element) throw an error
anything else | return response as-is
For example, to clean up our simple example from earlier:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session "\"hello world\"" {:shape :single}))
=> "hello world"
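The :shape values in the table above can be illustrated with a small plain-Clojure function. This is a sketch of the described behaviour, not Uruk's own code: shape-result is a hypothetical name, it takes the shape value explicitly (it does not model what happens when the key is omitted), and throwing on an empty response for :single! is an assumption.

```clojure
;; Hypothetical illustration of the :shape table (not Uruk's implementation).
(defn shape-result
  [shape result]
  (cond
    ;; nil: ignore the response entirely
    (nil? shape)        nil
    ;; :single: just the first element
    (= shape :single)   (first result)
    ;; :single!: exactly one element, or an error
    ;; (treating an empty response as an error is an assumption)
    (= shape :single!)  (if (= 1 (count result))
                          (first result)
                          (throw (ex-info "Expected exactly one result element"
                                          {:count (count result)})))
    ;; anything else: response as-is
    :else               result))

(shape-result :single ["hello world"])
```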
Uruk enables you to set Request options on your queries.
Request options are passed as a map to the :options key in the config map. All keys in that inner map must be present in valid-request-options. For example, to retrieve a document as a stream, use the :cache-result request option, which corresponds to MarkLogic's RequestOptions.setCacheResult. (Notice that we also specify no type conversion, because otherwise we would get the document content itself.)
(with-open [sess (uruk/create-session db)]
(uruk/execute-xquery sess "xquery version \"1.0-ml\"; doc('/content-factory/new-doc')"
{:types :raw
:options {:cache-result false}}))
=> #object[com.marklogic.xcc.impl.StreamingResultSequence 0x6d7f6 "StreamingResultSequence: closed=true"]
Uruk empowers you to pass XDM variables to your query, through the :variables key in the configuration map. Variables are most easily passed as a simple mapping from name keys to String values, like so:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session "xquery version \"1.0-ml\";
declare variable $my-variable as xs:string external;
$my-variable"
{:variables {"my-variable" "my-value"}
:shape :single!}))
If you need a non-XS_STRING variable, then use the more nuanced map-of-variables syntax:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session "xquery version \"1.0-ml\";
declare variable $my-variable as xs:integer external;
$my-variable"
{:variables {"my-variable" {:value 1
:type :xs-integer}}
:shape :single!}))
The value for :type should be a keyword corresponding to a key in variable-types, e.g. :document for XML documents (ValueType/DOCUMENT). It defaults to XS_STRING if :type is not specified. For example, the first simple variables map example above could also be described as {"my-variable" {:value "my-value"}}.
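The relationship between the short and long variable forms can be sketched in plain Clojure as follows. This is an illustration only: normalize-variable is a hypothetical name, and the :xs-string keyword is an assumption made by analogy with :xs-integer above, standing in for the XS_STRING default.

```clojure
;; Hypothetical sketch (not Uruk's code): bring both variable syntaxes
;; into the long {:value ... :type ...} form.
;; :xs-string is an assumed keyword for the XS_STRING default.
(defn normalize-variable
  [v]
  (if (map? v)
    ;; long form: fill in the default type only when :type is missing
    (merge {:type :xs-string} v)
    ;; short form: a bare value is treated as a string variable
    {:value v :type :xs-string}))

(normalize-variable "my-value")
(normalize-variable {:value "my-value"})
(normalize-variable {:value 1 :type :xs-integer})
```

Under these assumptions the first two calls produce the same map, which mirrors the equivalence described above.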
Depending on the XdmValue type, conversion of expected Clojure values is automatic, for instance with this booleanNode:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session "xquery version \"1.0-ml\";
declare variable $my-variable as boolean-node() external;
$my-variable"
{:variables {"my-variable" {:value false
:type :boolean-node}}
:shape :single!}))
Of particular interest is that variables that are XML document-nodes or elements can be created by passing either a String representation, a hiccup-style vector, or a clojure.data.xml.node.Element. (Uruk uses clojure.data.xml 0.1.0-beta2 in order to get its namespace support.)
Values are converted according to the :clj->xdm key in xcc-types. If you need to override those conversions, set the :as-is? key to true inside the map describing the variable. This puts the onus of producing the correct object on you. For instance, we could set :as-is? for that booleanNode:
(with-open [session (uruk/create-session db)]
(uruk/execute-xquery session "xquery version \"1.0-ml\";
declare variable $my-variable as boolean-node() external;
$my-variable"
{:variables {"my-variable" {:value (-> (com.fasterxml.jackson.databind.node.JsonNodeFactory/instance)
(.booleanNode false)
ValueFactory/newBooleanNode)
:type :boolean-node
:as-is? true}}
:shape :single!}))
The variables map syntax also accepts a :namespace key.
In addition to the basic create-session function that we've been using thus far, Uruk also supports session creation through all the various ContentSourceFactory methods in MarkLogic. Functions make-uri-content-source, make-hosted-content-source, and make-cp-content-source are used to create ContentSource objects that can be manipulated for more complex session-management processes in your application. Note also that create-default-session lets you create sessions by directly invoking the default login credentials of your content sources.
Multiple database updates that must occur together can take advantage of transactions. To borrow an example from the XCC Developer’s Guide:
The following example demonstrates using multi-statement transactions in Java. The first multi-statement transaction in the session inserts two documents into the database, calling Session.commit to complete the transaction and commit the updates. The second transaction demonstrates the use of Session.rollback. The third transaction demonstrates implicitly rolling back updates by closing the session.
– Programming in XCC > Multi-Statement Transactions
We translate the original Java to Clojure, taking advantage of Clojure's with-open idiom:
;; Open a session and configure it to trigger multi-statement transaction use:
(with-open [session (uruk/create-session db {:auto-commit? false :update-mode true})]
;; The first request (query) starts a new, multi-statement transaction:
(uruk/execute-xquery session "xdmp:document-insert('/docs/mst1.xml', <data><stuff/></data>)")
;; This second request executes in the same transaction as the
;; previous request and sees the results of the previous update:
(uruk/execute-xquery session "xdmp:document-insert('/docs/mst2.xml', fn:doc(\"/docs/mst1.xml\"));")
;; After commit, updates are visible to other transactions. Commit
;; ends the transaction after current statement completes.
(uruk/commit session) ;; <-- Transaction ends; updates are kept
;; Rollback discards changes and ends the transaction. The following
;; document deletion query never occurs, since it is rolled back
;; before calling commit:
(uruk/execute-xquery session "xdmp:document-delete('/docs/mst1.xml')")
(uruk/rollback session) ;; <-- Transaction ends; updates are lost
;; Closing session without calling commit causes a rollback. The
;; following update is lost, since we don't commit before the end of
;; the (with-open) and its implicit `.close`:
(uruk/execute-xquery session "xdmp:document-delete('/docs/mst1.xml')"))
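A common way to package the commit-or-rollback pattern shown above is a small helper that commits when the body succeeds and rolls back when it throws. The following is a sketch under stated assumptions, not part of Uruk: run-in-transaction is a hypothetical name, and the commit and rollback functions are passed in as arguments so the helper can be tried without a running server.

```clojure
;; Hypothetical helper (not part of Uruk): run f against a session,
;; committing on success and rolling back if f throws.
;; commit! and rollback! are supplied by the caller.
(defn run-in-transaction
  [session f {:keys [commit! rollback!]}]
  (try
    (let [result (f session)]
      (commit! session)
      result)
    (catch Exception e
      (rollback! session)
      (throw e))))

;; Trying it out with stand-in functions that only record what happened:
(let [log (atom [])]
  (run-in-transaction :fake-session
                      (fn [_] :done)
                      {:commit!   #(swap! log conj [:commit %])
                       :rollback! #(swap! log conj [:rollback %])})
  @log)
```

With a real session opened for multi-statement transactions as in the example above, one would presumably pass uruk/commit and uruk/rollback as commit! and rollback!.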
You can insert clojure.data.xml.node.Element objects as content:
(with-open [session (uruk/create-session db)]
(uruk/insert-element session
"/content-factory/new-doc" ;; uri to insert at
(clojure.data.xml/element :foo)))
This function takes an optional map describing document metadata, including Content Creation Options to use during the insert. For example:
(with-open [session (uruk/create-session db)]
(uruk/insert-element session
"/content-factory/another-new-doc"
(clojure.data.xml/element :bar)
{:quality 2}))
See uruk.core/valid-content-creation-options, which is a Clojurey version of the possibilities described by ContentCreateOptions.
You can also directly insert text as content, in any of MarkLogic's supported forms (text, binary, JSON, XML):
(with-open [session (uruk/create-session db)]
(uruk/insert-string session
"/content-factory/new-text-doc" ;; uri to insert at
"<abc>def</abc>"))
The insert-string function used here automatically detects string type and inserts the correct type of content. For instance, in this example, the string will be automatically inserted as XML, since clojure.data.xml/parse-str successfully parses it as XML. This function takes options just like insert-element.
Uruk is sturdy and ready for production. However, some aspects of the XCC/J API have not yet been implemented:
- JNDI
- XCC Service Provider Interface -- note the MarkLogic disclaimer that this is for advanced users only, not endorsed for independent use, and "use at your own risk"
- ResultChannelName
- update clojure.data.xml preview dependency--see https://github.com/clojure/data.xml/blob/master/CHANGES.md
- look into possibly using clojure.spec (once Clojure 1.9 is stable)
- (breaking change) consider namespaced keys for various config options
- generative testing (for instance, in as-expected-session-config?)
- ensure insert-element robustly covers needed use cases
- possibly implement REx to automatically parse XQuery for XDM variable types
- possibly implement use-fixtures within tests to create user with appropriate permissions
Copyright © 2016-2018 David Liepmann
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.