add 'json_stringify' and 'jaq' (jq) built-in functions #2557

Open
liquidaty opened this issue Dec 31, 2024 · 5 comments

@liquidaty

So I just did a little experiment by adding two functions to my own just repo: json_stringify() and jaq() (a Rust implementation of jq).

They worked fabulously. I would like to propose incorporating them into the official just.

These alone can fill a large part of the gap that stands in the way of a viable no-shell option (in fact, that's what I used them for), such as #1570 and #2458, as well as related use cases discussed in #528, #537, #2379, #2080, and probably many others.

jq_filter := 'split(" ") | .[] | <whatever else I want to do here>'

jq_input := json_stringify(my_input_str)
jq_output := jaq(jq_input, jq_filter)

recipe:
    blah {{ jq_output }} blah

Granted, using jq is a particular approach that may or may not suit the user, and for any given case it might not be the perfect choice compared with potential external alternatives. But consider:

  • it is extremely versatile and does not require a shell
  • in many cases it provides a built-in solution where no other built-in solution exists and there is no assurance that an external solution exists
  • its implementation is clean and unintrusive (merely add a couple functions totalling 73 LOC in src/function.rs)
  • it does not conflict with any future alternative solution
  • it does not introduce any platform dependencies or conflicts (the added code compiled out-of-the-box even to wasm)
  • it incorporates a well-established and widely-used library (treating jq and jaq as one and the same for this purpose)

Thoughts?

FYI, the changes in function.rs were merely as follows (note: this is a super quick-and-dirty whip-up for illustrative purposes). Happy to submit a PR:

diff --git a/src/function.rs b/src/function.rs
index 66e7c6e..82c40fc 100644
--- a/src/function.rs
+++ b/src/function.rs
@@ -71,6 +71,8 @@ pub(crate) fn get(name: &str) -> Option<Function> {
     "invocation_directory_native" => Nullary(invocation_directory_native),
     "is_dependency" => Nullary(is_dependency),
     "join" => BinaryPlus(join),
+    "jaq" => Binary(jaq),
+    "json_stringify" => UnaryPlus(json_stringify),
     "just_executable" => Nullary(just_executable),
     "just_pid" => Nullary(just_pid),
     "justfile" => Nullary(justfile),
@@ -369,6 +371,60 @@ fn prepend(_context: Context, prefix: &str, s: &str) -> FunctionResult {
   )
 }

+// invalid_data is a helper function for jaq; it probably should be located elsewhere
+fn invalid_data(e: impl std::error::Error + Send + Sync + 'static) -> std::io::Error {
+   use std::io;
+    io::Error::new(io::ErrorKind::InvalidData, e)
+}
+
+fn jaq(_context: Context, input_str: &str, filter_str: &str) -> FunctionResult {
+   use jaq_core::{load, Compiler, Ctx, RcIter};
+   use jaq_json::Val;
+//   println!("input: {}", input_str);
+//   println!("filter: {}", filter_str);
+
+   let json = |s: String| {
+       use hifijson::token::Lex;
+       hifijson::SliceLexer::new(s.as_bytes())
+           .exactly_one(Val::parse)
+           .map_err(invalid_data)
+   };
+
+   let input = json(input_str.to_string());
+   let program = File { code: filter_str, path: () };
+
+   use load::{Arena, File, Loader};
+
+   let loader = Loader::new(jaq_std::defs().chain(jaq_json::defs()));
+   let arena = Arena::default();
+
+   // parse the filter
+   let modules = loader.load(&arena, program).unwrap();
+
+   // compile the filter
+   let filter = Compiler::default()
+     .with_funs(jaq_std::funs().chain(jaq_json::funs()))
+     .compile(modules)
+     .unwrap();
+
+   let inputs = RcIter::new(core::iter::empty());
+
+   // iterator over the output values
+   let mut out = filter.run((Ctx::new([], &inputs), input.unwrap()));
+
+   // collect result values, each on a separate line
+   let mut output_str = String::new();
+   while let Some(value) = out.next() {
+      output_str.push_str(value.unwrap().to_string().as_str());
+      output_str.push('\n');
+   }
+   if output_str.ends_with('\n') {
+     output_str.pop();
+   }
+
+   Ok(output_str)
+}
+
 fn join(_context: Context, base: &str, with: &str, and: &[String]) -> FunctionResult {
   let mut result = Utf8Path::new(base).join(with);
   for arg in and {
@@ -434,6 +490,22 @@ fn justfile_directory(context: Context) -> FunctionResult {
     })
 }

+
+fn json_stringify(_context: Context, first_arg: &str, more_args: &[String]) -> FunctionResult {
+  use serde_json::json;
+    let result = if more_args.is_empty() {
+        // If no additional arguments, return JSON stringified version of the first argument
+        json!(first_arg).to_string()
+    } else {
+        // If additional arguments exist, create a JSON array with the first argument followed by the additional arguments
+        let mut args = vec![first_arg.to_string()];
+        args.extend_from_slice(more_args);
+        json!(args).to_string()
+    };
+
+    Ok(result)
+}
+
 fn kebabcase(_context: Context, s: &str) -> FunctionResult {
   Ok(s.to_kebab_case())
 }
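
The `jaq` function in the diff above calls `unwrap()` on each output value and trims a trailing newline by hand, so a failing filter would panic instead of returning an error to the user. A minimal sketch of an alternative for the output-collection step is shown here, using only the standard library. The helper name `join_outputs` is hypothetical, and the sketch assumes that just's `FunctionResult` is `Result<String, String>` and that the filter's values and errors implement `Display`; neither assumption is confirmed by the diff.

```rust
use std::fmt::Display;

/// Hypothetical helper: joins filter outputs with newlines and stops at the
/// first error instead of panicking. Assumes a `Result<String, String>`
/// return type similar to just's `FunctionResult`.
fn join_outputs<I, T, E>(outputs: I) -> Result<String, String>
where
    I: IntoIterator<Item = Result<T, E>>,
    T: Display,
    E: Display,
{
    // Convert every value and error to a String, then collect into a single
    // Result; collecting short-circuits on the first Err.
    let lines = outputs
        .into_iter()
        .map(|item| item.map(|v| v.to_string()).map_err(|e| e.to_string()))
        .collect::<Result<Vec<String>, String>>()?;

    // `join` places the separator only between elements, so no trailing
    // newline has to be removed afterwards.
    Ok(lines.join("\n"))
}
```

In the diff, this would stand in for the `while let` loop and the `ends_with('\n')` check, with the iterator returned by `filter.run(...)` passed as `outputs`.
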
@casey
Owner

casey commented Jan 3, 2025

Just has a very strong backwards compatibility guarantee, so I don't think we could expose something like jaq to the user, which is likely to change in the future.

We would probably wind up either having to use an unmaintained, older version or having to make a backwards-incompatible change when upgrading to a new version.

That being said, can you give some example concrete use cases? I want to understand what this provides.

@liquidaty
Author

Just has a very strong backwards compatibility guarantee, so I don't think we could expose something like jaq to the user, which is likely to change in the future.

Hm, would that same argument then apply to any external library that exposes a syntax defined by that library? If so, isn't most of the discussion in #2532 also moot?

In any case, wouldn't it still be possible that it is easier/better to incorporate something mature (the original https://github.com/jqlang/jq has been around for a while, I am not aware of any time that either jq or jaq has made a backward-incompatible change, and both have been quite heavily tested), but anyway incorporating a snapshot wouldn't be any different from what Apple did with a bunch of BSD code.

@liquidaty
Author

liquidaty commented Jan 3, 2025

That being said, can you give some example concrete use cases? I want to understand what this provides

Sure, we could look at how this could be used as an alternative to the self-described "not very elegant" each loop solution posted at #1570 (comment).

update-dc service="all":
  #!/usr/bin/env node
  let srv = "{{service}}"
  let services = [
    "anime",
    "homebox",
    "hass",
    "tubesync",
    "photoprism",
    "tvheadend",
    "jellyfin",
    "jdownloader",
    "notes"
  ]
  console.log("==> Service: "+srv)
  function update_one(s) {
    console.log("Updating "+s+"...")
  }
  if(srv == "all") {
    services.forEach(v => update_one(v))
  } else if(services.indexOf(srv) >= 0) {
    update_one(srv)
  } else {
    console.error("Unknown service: "+srv)
    process.exit(1)
  }

With json_stringify and jaq, the result is possibly a bit more elegant, and is fully shell-less (at least for the looping logic):

all_services := json_stringify("anime", "homebox", "hass", "tubesync", "photoprism", \
                               "tvheadend", "jellyfin", "jdownloader", "notes")

update-dc service="all":
  echo {{"==> Service: " + service}}
  {{ if service == 'all' { "" } else if jaq(all_services,"index(" + json_stringify(service) + ")") != "null" { "" } \
     else { "echo 'Unknown service: " + service + "' && exit 1" } }}
  {{ jaq(if service == 'all' { all_services } else { '[' + json_stringify(service) + ']' }, '.[]|"echo Updating "+.+"..."') }}

(Note that the above assumes that the "--raw" output flag is set, which is missing from the diff I posted earlier.)

@casey
Owner

casey commented Jan 10, 2025

Hm, would that same argument then apply to any external library that exposes a syntax defined by that library? If so, isn't most of the discussion in #2532 also moot?

Yes, definitely. For example, we use the dotenvy crate for dotfile parsing. But in the case of dotenvy, it's a relatively small crate, and it probably wouldn't be too bad to have to maintain a fork or rewrite it. jq and jaq are much heavier, and I wouldn't want to have to maintain forks or rewrite them.

In the case of #2532, the draft PR I have open for it adds simple shell parsing that we implement, so no external dependencies. And if there were a very mature Rust Bourne shell implementation, whose goal was POSIX compatibility, I think I would be more comfortable with it, since it feels like less of a moving target than a jq clone.

In any case, wouldn't it still be possible that it is easier/better to incorporate something mature (the original https://github.com/jqlang/jq has been around for a while, I am not aware of any time that either jq or jaq has made a backward-incompatible change, and both have been quite heavily tested), but anyway incorporating a snapshot wouldn't be any different from what Apple did with a bunch of BSD code.

The example you gave is interesting.

I could see adding json_stringify (although with some tweaks, I would have a json_string function that takes one argument and produces an escaped string, and a json_array function that takes any number of arguments and produces a stringified array) and figuring out a way to let people pass it to an external jq, so we don't add a dependency.
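
The split described above (one function for a single escaped string, one for an array of strings) could look roughly like the following. This is only a sketch using the standard library: the names `json_string` and `json_array` come from the comment, but the signatures and the escaping details are assumptions, not an implementation that exists in just.

```rust
/// Hypothetical `json_string`: encodes one string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            // Characters that JSON requires to be escaped inside strings.
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Any other control character is written as a \u00XX escape.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            // Everything else, including non-ASCII text, passes through as-is.
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Hypothetical `json_array`: encodes any number of strings as a compact
/// JSON array of strings.
fn json_array(items: &[&str]) -> String {
    let encoded: Vec<String> = items.iter().map(|s| json_string(s)).collect();
    format!("[{}]", encoded.join(","))
}
```

For example, `json_array(&["anime", "homebox"])` yields `["anime","homebox"]`, which is the kind of value that could then be handed to an external jq.
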

@liquidaty
Author

that same argument then apply to any external library that exposes a syntax defined by that library

Yes, definitely. For example, we use the dotenvy crate for dotfile parsing. But in the case of dotenvy, it's a relatively small crate, and it probably wouldn't be too bad to have to maintain a fork or rewrite it

I suppose you could say that about every library down to compression, SSL etc. At some point it takes more work to build and maintain homegrown code than someone else's code. But I understand that as the maintainer you have to be vigilant in keeping your yard to a manageable size.

An alternative, which I'm guessing is the direction you are going, is to avoid expanding the expression grammar and reduce everything to functions. However, doing so might make it very difficult to support a meaningful level of versatility.

Alternatively, a home-grown expression language isn't very hard, but you then choose between a) asking your users to learn yet-another expression syntax that differs from what they already know, or b) borrowing some pre-existing expression syntax in which case, if you don't re-use the original library that supported that syntax, you just end up duplicating some existing code base, which isn't very different from picking a version of an existing code base and internalizing it.

(As an aside, I hope you know I'm truly not trying to be argumentative here. I'm just hoping to expand the utility of this project (for my selfish purposes) in a manner that minimizes the short- and long-term burden, and to that end I'm describing what I truly believe will be the least burdensome and most valuable approach. But I totally get that I'm not the one maintaining it, I don't have the historical context to be in a credible position to opine, and it's not at all my call anyway.)

I could see adding json_stringify (although with some tweaks, I would have a json_string function that takes one argument and produces an escaped string, and a json_array function that takes any number of arguments and produces a stringified array) and figuring out a way to let people pass it to an external jq, so we don't add a dependency

Sounds good, thanks for considering and lmk if I can do anything to help!
