Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to parse and run ble.sh #653

Open
andychu opened this issue Mar 10, 2020 · 66 comments
Open

Try to parse and run ble.sh #653

andychu opened this issue Mar 10, 2020 · 66 comments

Comments

@andychu
Copy link
Contributor

andychu commented Mar 10, 2020

Why?

  • it's a very substantial program using bash as a programming language, but it's not a batch program
  • it's probably one of the hardest scripts to run (probably won't get it to run in totality, but maybe some parts will)
  • It could be faster. maybe that will help optimize Oil.
    • I want Oil to be fast at char-by-char processing
  • Maybe it will help us figure out what kind of API we need to let other people build other interactive UIs
    • $LINES and $COLUMNS, etc.

/~https://github.com/akinomyoga/ble.sh

@andychu
Copy link
Contributor Author

andychu commented Mar 10, 2020

#620 was the original issue, already addressed return 2>&1 etc.

@andychu
Copy link
Contributor Author

andychu commented Mar 11, 2020

Two issues blocking parsing:

~/git/wild/ble.sh$ for name in */*.sh; do echo $name; ~/git/oilshell/oil/bin/osh -n $name | wc -l; done

...

memo/D1259.readbind.sh
84
out/ble.sh
            ((_ble_hook_c_$name=0))
                               ^
out/ble.sh:1644: Left-hand side of this assignment is invalid
0
src/benchmark.sh
2342
src/canvas.sh
    ((${_prefix}x=_pos[0]))
                 ^
src/canvas.sh:1521: Left-hand side of this assignment is invalid
0
src/color.sh
    (($_var=_ret))
           ^
src/color.sh:670: Left-hand side of this assignment is invalid
0
src/decode.sh
22515
src/def.sh
133
src/edit.sh
45409
src/history.sh
        ((_ble_builtin_history_rskip_dict[\$file]+=$2))
                                          ^
src/history.sh:701: Unexpected token in arithmetic context
0
src/util.sh
            ((_ble_hook_c_$name=0))
                               ^
src/util.sh:917: Left-hand side of this assignment is invalid
0

@andychu
Copy link
Contributor Author

andychu commented Mar 12, 2020

Note: can parse lib/core-syntax.sh (6651 lines, one of the biggest) by commenting out 5 instances of dynamic LHS in arith

  • src/history.sh (1671 lines) parses with a few edits

@andychu
Copy link
Contributor Author

andychu commented Mar 12, 2020

  • src/util.sh needs many patches (or relaxing parsing). This pattern is used all over the place.

Rules:

  • arith_expr.Word:

    • statically a var name -> treat as VarRef -- but do this at evaluation time?
    • otherwise evaluate as word, and then treat that as an integer constant
  • or just add Word on LHS as valid? And that must be an integer constant on RHS.

    • or on LHS it can be a var name?

@andychu
Copy link
Contributor Author

andychu commented Mar 12, 2020

@akinomygoa

On #640 I added a temporary hack shopt -s parse_unimplemented that parses dynamic arithmetic LHS but doesn't implement it. So at least this idiom won't prevent you from using osh -n. Now I can do:

for name in */*.sh; do echo $name; ~/git/oilshell/oil/bin/osh -O parse_unimplemented -n $name | wc -l; done

and there's only one remaining error:

src/history.sh
        ((_ble_builtin_history_rskip_dict[\$file]+=$2))
                                          ^
src/history.sh:701: Unexpected token in arithmetic context

If you want to use osh -n, maybe rewrite this without (( )) or with eval? I think this is particularly confusing because as we discussed \$file in this context does not mean '$file' -- it means something closer to "$file", correct?


And it should take 30 seconds to build osh from head on most Linux machines, although you can also wait until the next release:

/~https://github.com/oilshell/oil/wiki/Contributing

@akinomyoga
Copy link
Collaborator

Oh, I didn't notice your comment (but my user name is akinomyoga). I tried the latest commit which worked nicely! In testing I replaced the \$file line with eval '((...))', but I think I'll leave it as it is in the master branch for now because ble.sh is currently unlikely to run on Oil, so ble.sh is still only targeting Bash. Also, probably I'm too used to Bash syntaxes, but the \$file trick is natural to me. Nevertheless, I'll add some code comments above the line.

@akinomyoga
Copy link
Collaborator

akinomyoga commented Mar 14, 2020

I'm currently trying to source ble.sh in osh. ble.sh can be loaded without activation by passing the option --attach=none. Even if it doesn't fully work, I think it may be possible to just source ble.sh with minimal initialization. I'm currently using the following script to load ble.sh (with some modifications such as \$file)

# ble.osh
shopt -s parse_unimplemented
BASH_VERSION='5.0.11(1)-release'
BASH_VERSINFO=(5 0 11 1 release x86_64-redhat-linux-gnu)
set -o emacs
source "$(dirname "${BASH_SOURCE[0]}")/ble.sh" --attach=none

1. No line-editing mode by default ([[ -o emacs/vi ]])

The first one that I encountered is the line-editing mode detection. In Bash, the line editing mode can be tested with [[ -o emacs ]] and [[ -o vi ]]. When both of the tests fail, it implies that the line editing is turned off in Bash. ble.sh rejects the session with the line-editing turned off. I tested Oil and found that it implements the option, but both emacs and vi option is unset by default even though the twos cannot be unset at the same time by set -o/+o emacs/vi. In Oil, set +o vi implies set -o emacs, and set +o emacs implies set -o vi, and vice versa. What is the conceptual model for the state of the line-editing mode in Oil? I created a test for this.

test1.sh
#!/usr/bin/env bash

function print-line-editing-mode {
  local emacs=disabled vi=disabled
  [[ -o emacs ]] && emacs=enabled
  [[ -o vi ]] && vi=enabled
  echo "emacs $emacs, vi $vi"
}

echo -n "[default] "
print-line-editing-mode

echo -n "[set +o emacs] "
set +o emacs
print-line-editing-mode

echo -n "[set -o emacs] "
set -o emacs
print-line-editing-mode

echo -n "[set -o vi] "
set -o vi
print-line-editing-mode

echo -n "[set +o vi] "
set +o vi
print-line-editing-mode

echo -n "[set +o emacs] "
set +o emacs
print-line-editing-mode

set -o emacs
bash$ source test1.sh
[default] emacs enabled, vi disabled
[set +o emacs] emacs disabled, vi disabled
[set -o emacs] emacs enabled, vi disabled
[set -o vi] emacs disabled, vi enabled
[set +o vi] emacs disabled, vi disabled
[set +o emacs] emacs disabled, vi disabled

osh$ source test1.sh
[default] emacs disabled, vi disabled
[set +o emacs] emacs disabled, vi enabled
[set -o emacs] emacs enabled, vi disabled
[set -o vi] emacs disabled, vi enabled
[set +o vi] emacs enabled, vi disabled
[set +o emacs] emacs disabled, vi enabled

2. Comment: Array/scalar $FUNCNAME

The second one was $FUNCNAME which is actually the first array element ${FUNCNAME[0]} intentionally unsupported in Oil. I think the current Oil design is reasonable, but let me make a comment. The variable FUNCNAME was initially a scalar variable but extended to be an array in Bash 3.0, so it is a scalar variable and an array at the same time in Bash. This is the historical reason why the name is not FUNCNAMES but FUNCNAME even though it is an array. [ Note: Even for pure array variables, ble.sh uses this kind of parameter expansions in many places where the first element of an array contains the main result and the others contain additional information. ]

3. BUG: Start of ${@:offset} and ${*:offset} is shifted

The third one is that the results of ${@:offset} and ${*:offset} is shifted in Oil. I think this is a bug.

test2.sh

#!/usr/bin/env bash

set X Y Z W
echo "\$1=$1 \$2=$2 \$3=$3 \$4=$4"
echo "\${*:2}=(${*:2}) \${@:2}=(${@:2})"
osh$ source test2.sh
$1=X $2=Y $3=Z $4=W
${*:2}=(Z W) ${@:2}=(Z W)

bash$ source test2.sh
$1=X $2=Y $3=Z $4=W
${*:2}=(Y Z W) ${@:2}=(Y Z W)

zsh% source test2.sh
$1=X $2=Y $3=Z $4=W
${*:2}=(Y Z W) ${@:2}=(Y Z W)

ksh$ source ./test2.sh
$1=X $2=Y $3=Z $4=W
${*:2}=(Y Z W) ${@:2}=(Y Z W)

4. BUG: ${!prefix_@} in a function fails in the second invocation

See the following test case. Even though it succeeds for the first time, it fails for the second time.

test3.sh

#!/usr/bin/env bash

abc_hello=1
function hello1 { echo "${!abc_@}"; }
hello1
hello1
$ osh test3.sh
abc_hello
  function hello1 { echo "${!abc_@}"; }
                             ^~~~
test3.sh:4: fatal: Bad indirect expansion: ''

@andychu
Copy link
Contributor Author

andychu commented Mar 15, 2020

This is great, thanks for all the patches! About the first one, I just changed something there, and I agree the behavior is odd, but I see bash and OSH agree (at HEAD):

$ bin/osh --rcfile /dev/null -c 'test -o emacs; echo $?; test -o vi; echo $?'
1
1
$ bash  --rcfile /dev/null -c 'test -o emacs; echo $?; test -o vi; echo $?'
1
1

$ bash --version
GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)

Do you get a different behavior?

I agree they are mutually exclusive but they both start off OFF. bash has a weird conceptual model but I think OSH copies it.

@akinomyoga
Copy link
Collaborator

Thank you for your comment!

I agree they are mutually exclusive but they both start off OFF. bash has a weird conceptual model but I think OSH copies it.

The default value in Bash depends on whether Bash is started in a non-interactive session or an interactive session. When you use the form sh -c '...', the shell starts in a non-interactive session while ble.sh works in an interactive session. In non-interactive sessions, test -o emacs/vi neither succeeds in Bash, which reflects the fact that the line editing is not activated in non-interactive sessions. This is reasonable. While, in interactive sessions, the emacs line-editing is activated by default, so it is reasonable that test -o emacs succeeds.

On the other hand, test -o emacs/vi neither succeeds in Oil by default even in the interactive session where the line-editing is enabled. The behavior of Oil in which test -o emacs/vi fails in both interactive and non-interactive sessions is strange.

By the way, the line-editing can be turned off in interactive sessions of Bash. The setting is set +o emacs. This is the reason why set +o emacs doesn't imply set -o vi in Bash unlike in Oil.

@andychu
Copy link
Contributor Author

andychu commented Mar 15, 2020

Ah OK I reproduced this with -i, see #659. If this is blocking more testing let me know

@akinomyoga
Copy link
Collaborator

Good news. I think I could finally source ble.sh without fatal errors (with many fixes). But of course, I still get many warning messages. Also, what has been done so far is just source without starting the ble.sh session, so ble.sh is still unlikely to run on Oil, but now it is possible to test some parts of ble.sh. I list up the problems that I encountered so far as follows:

5. INCONSISTENCY: Cannot create an empty associative array

Oil rejects declare -A dict=() with an error while it accepts declare -A dict; dict=(), which appears to be inconsistent. Note that declare -A dict and declare -A dict=() have different results in Bash. We want to have consistency. It is inconvenient that one cannot initialize an empty associative array

$ osh -c 'declare -A dict; dict=()'
$ osh -c 'declare -A dict=()'
  declare -A dict=()
             ^~~~~
[ -c flag ]:1: Got -A but RHS isn't an associative array

6. NYI (not yet implemented): exec <> a.txt fails without error messages

Even if it is not implemented, we want at least one error message (including the line number, etc.) instead of a stack trace. ble.sh uses this on named pipes (e.g. mkfifo a.fifo; exec <> a.fifo).

$ osh -c 'exec <> a.txt'
Traceback (most recent call last):
  File "/home/murase/prog/ext-github/oilshell.oil/bin/oil.py", line 962, in <module>
    sys.exit(main(sys.argv))
  File "/home/murase/prog/ext-github/oilshell.oil/bin/oil.py", line 908, in main
    return AppBundleMain(argv)
  File "/home/murase/prog/ext-github/oilshell.oil/bin/oil.py", line 881, in AppBundleMain
    status = ShellMain('osh', argv0, main_argv, login_shell)
  File "/home/murase/prog/ext-github/oilshell.oil/bin/oil.py", line 697, in ShellMain
    status = main_loop.Batch(ex, c_parser, arena, nodes_out=nodes_out)
  File "/home/murase/prog/ext-github/oilshell.oil/core/main_loop.py", line 166, in Batch
    is_return, is_fatal = ex.ExecuteAndCatch(node)
  File "/home/murase/prog/ext-github/oilshell.oil/osh/cmd_exec.py", line 1851, in ExecuteAndCatch
    status = self._Execute(node, fork_external=fork_external)
  File "/home/murase/prog/ext-github/oilshell.oil/osh/cmd_exec.py", line 1786, in _Execute
    if self.fd_state.Push(redirects, self.waiter):
  File "/home/murase/prog/ext-github/oilshell.oil/core/process.py", line 372, in Push
    if not self._ApplyRedirect(r, waiter):
  File "/home/murase/prog/ext-github/oilshell.oil/core/process.py", line 255, in _ApplyRedirect
    raise NotImplementedError(r.op_id)
NotImplementedError: 159

7. NYI: Missing feature exec {fd}> a.txt

This is a Bash-4.1 feature. ble.sh uses this for Bash 4.1 and later.

$ bash -c 'exec {fd}> a.txt; echo hello >&$fd; cat a.txt'
hello
$ osh -c 'exec {fd}> a.txt; echo hello >&$fd; cat a.txt'
  exec {fd}> a.txt; echo hello >&$fd; cat a.txt
       ^
[ -c flag ]:1: exec: '{fd}' not found

8. NYI: Missing feature closing file descriptors exec >&-

Without this, one cannot close file descriptors.

$ osh -c 'exec 5> a.txt; exec 5>&-'
  exec 5> a.txt; exec 5>&-
                         ^
[ -c flag ]:1: Redirect descriptor should look like an integer, got (value.Str s:-)

9. NYI: Missing feature of read delimiters read -d ''

$ osh -c 'read -r -d "" < a.txt; read -r -d "," < a.txt'
  read -r -d "" < a.txt; read -r -d "," < a.txt
          ^~
[ -c flag ]:1: 'read' doesn't accept flag -d
  read -r -d "" < a.txt; read -r -d "," < a.txt
                                 ^~
[ -c flag ]:1: 'read' doesn't accept flag -d

10. Error message of builtin read is not friendly

It says 'builtin' doesn't accept instead of 'read' doesn't accept.

$ osh -c 'builtin read -d ""'
  builtin read -d ""
               ^~
[ -c flag ]:1: 'builtin' doesn't accept flag -d

11. BUG: trap doesn't ignore --

I can find similar behavior for eval though I haven't checked the other builtins. In Bash, when one needs to pass an arbitrary value to trap / eval, one needs to specify -- because, for example, command=--help; eval "$command" will not be interpreted as expected.

$ osh -c 'trap "echo hello" INT'
$ osh -c 'trap -- "echo hello" INT'
  trap -- "echo hello" INT
          ^
[ -c flag ]:1: Invalid signal or hook 'echo hello'
$ osh -c 'eval -- echo hello'
  -- echo hello
  ^~
[ eval at line 1 of -c flag ]:1: '--' not found

12. NYI: Missing support of %()T in printf (unimplemented)

$ osh -c 'printf "%(%s)T\n"'
  %(%s)T\n
   ^
(source.ArgvWord word_spid:2):1: Invalid printf format character

13. BUG: ${var=x} is local

Even though an assignment of a shell variable var=x in a function scope takes an effect in the outer scope variable, the effect of ${var=x} is restricted in local. ble.sh uses this in some functions. Or do you have a rationale for this? Though it can be easily worked around, I don't see the reason for the design.

$ osh -c 'f() { : "${hello:=x}"; echo $hello; }; f; echo $hello'
x

$ osh -c 'f() { hello=x; }; f; echo $hello'
x

14. Bash/Oil differences

There are other warnings related to the features which I think are out of the scope of Oil.

14.1 Recursive arithmetic evaluation

The most warnings are caused by missing recursive evaluation described in #648.

14.2 Empty variables in arithmetic expressions

The second most type of warnings is this.

$ osh -c 'a=; echo $((a))'
[??? no location ???] warning: Invalid integer constant ''

14.3 Missing support of bind

It is used for setting up readline variables.

      ((_ble_bash>=40100)) && builtin bind 'set skip-completed-text on'
                                      ^~~~
./ble.sh:16300: 'bind' isn't a shell builtin
      ((_ble_bash>=40300)) && builtin bind 'set colored-stats on'
                                      ^~~~
./ble.sh:16301: 'bind' isn't a shell builtin
      ((_ble_bash>=40400)) && builtin bind 'set colored-completion-prefix on'
                                      ^~~~
./ble.sh:16302: 'bind' isn't a shell builtin

14.4 Output format of declare -p

This problem described in #647 already causes many warnings and strange behavior while source ble.sh --attach=none.

andychu pushed a commit that referenced this issue Mar 15, 2020
@andychu
Copy link
Contributor Author

andychu commented Mar 15, 2020

Thanks this is very useful. I just fixed point 13 -- it was an unintentional divergence.

read -d is #356

I'll respond to the rest a bit later. More bugs/testing is welcome!

@andychu
Copy link
Contributor Author

andychu commented Mar 15, 2020

Also what's the difference between declare -A dict and declare -A dict=() in bash?

By the way I will merge failing spec tests even if there's no fix yet. Showing the difference is often 50% of the work.

The way I find the relevant tests is to do grep foo spec/*.test.sh

@akinomyoga
Copy link
Collaborator

Also what's the difference between declare -A dict and declare -A dict=() in bash?

There are three different states for variables in Bash. In addition, the "set" state can be categorized into two sub-states:

  1. Not declared (slot is not allocated)
  2. Declared but unset (slot is allocated, but no value is set)
  3. Set (slot is allocated, and a value is set)
    1. Empty
    2. Non-empty

See the following test case for the behavior differences:

# test5.sh

function is-declared() { declare -p "$1" &>/dev/null; }
function is-enumerated-by-declare() { declare | grep -q "^$1="; }
function is-enumerated-by-varname-expansion() { eval "printf '%s\n' \"\${!$1@}\"" | grep -q "^$1\$"; }
function is-enumerated-by-compgen () { compgen -v | grep -q "^$1\$"; }
function is-enumerated-by-compgen-arrayvar () { compgen -A arrayvar | grep -q "^$1\$"; }
function show-status {
  local -a msg=(yes)
  local f
  for f in $(compgen -A function -- is-); do
    "$f" "$1"
    echo "$f: ${msg[$?]:-no}"
  done
}

echo "[undeclared]"
show-status dict
echo "[declared]"
declare -A dict
show-status dict
echo "[set]"
dict=()
show-status dict
$ bash test5.sh
[undeclared]
is-declared: no
is-enumerated-by-compgen: no
is-enumerated-by-compgen-arrayvar: no
is-enumerated-by-declare: no
is-enumerated-by-varname-expansion: no
[declared]
is-declared: yes
is-enumerated-by-compgen: no
is-enumerated-by-compgen-arrayvar: no
is-enumerated-by-declare: no
is-enumerated-by-varname-expansion: no
[set]
is-declared: yes
is-enumerated-by-compgen: yes
is-enumerated-by-compgen-arrayvar: yes
is-enumerated-by-declare: yes
is-enumerated-by-varname-expansion: yes

Related to this, in the help-bash mailing list, I have heard about some script library that gives a default array content when the definition is not given by the application. I'm not sure if it is a good design, but in that library, the empty array is one of the possible configurations, so one needs to distinguish the unset state (declare -A dict) from the empty state (declare -A dict=()). Also, if the array is associative, the library wants to declare the associative array attribute in advance. Such a script can be achieved in the following way:

# library
declare -A dict
is-array-set() { compgen -A arrayvar -X "!$1" -- "$1" &>/dev/null; }

function initialize-lib {
  if ! is-array-set dict; then
    # default dictionary
    dict=([left]=right [up]=down)
  fi
}


# application

# ... set up dict here ...

initialize-lib

@akinomyoga
Copy link
Collaborator

I'm reorganizing the patch set of ble.sh for source in Oil.

Although I have listed many points above, the highest priority for me is No. 5 declare -A dict=() which scatters around several files. If this is supported by Oil, the most changes can be confined into only ble.pp and util.sh.

Also, I have three additional items as follows:

15. Q: How can I get OSH version of the currently running instance in scripts?

There are many differences between Bash and Oil, and also many differences among different versions of Oil. In order to provide the best implementation for each version, I need to get the version information of Oil akin to BASH_VERSION and BASH_VERSINFO. I observed the output set but I could not find related variables. In particular, an array similar to BASH_VERSINFO is useful because one cannot rely on any fancy string manipulations to parse version strings without knowing the versions.

FYI: ble.sh exposes two variables BLE_VERSION and BLE_VERSINFO.

$ declare -p BLE_VERSION BLE_VERSINFO
declare -- BLE_VERSION="0.4.0-devel2+420c933"
declare -a BLE_VERSINFO=([0]="0" [1]="4" [2]="0" [3]="420c933" [4]="devel2" [5]="noarch")

16. NYI: read -t timeout (or builtin sleep)

ble.sh uses read -t timeout to implement sleep less than a second. The POSIX sleep does not require to support sleep with the resolution finer than one second. The sleep command of GNU coreutils supports milliseconds resolution of sleep, but this is GNU extension. ble.sh uses builtin sleep when the Bash loadable builtin sleep is available and uses read -t timeout when it supports fractional seconds of timeout (Bash 4.0 and later). Oil doesn't support either.

Note: read -t timeout behaves in a special matter when timeout is 0, i.e., read -t 0 and read -t timeout are different features in Bash.

17. BUG: ${arr[0]=1} destroys the array

Thanks this is very useful. I just fixed point 13 -- it was an unintentional divergence.

Thank you for the fix! But the fix c0f9c2f revealed another bug. See the following test case.

$ osh -c 'arr=(); echo ${#arr[@]}; : ${arr[0]=1}; echo ${#arr[@]}'
0
  arr=(); echo ${#arr[@]}; : ${arr[0]=1}; echo ${#arr[@]}
                                               ^~
[ -c flag ]:1: fatal: Can't index string with @

@akinomyoga
Copy link
Collaborator

akinomyoga commented Mar 15, 2020

I think I have reported too many. Not everything is so important for ble.sh. I summarize the current status and which one I think is useful for ble.sh.

Edit: The list has been moved to Running ble.sh With Oil - Wiki.

Current status (★ = `ble.sh` will be happy)

@andychu
Copy link
Contributor Author

andychu commented Mar 15, 2020

note: marked #537 as implemented above, will be out with next release

@andychu
Copy link
Contributor Author

andychu commented Apr 6, 2020

OK I just fixed #660 ${arr[0]=x}. And I filed issues for the other ones, we can discuss there.

I think we probably don't have compgen -A arrayvar. It might be preferable to implement that first, since if it's older, more scripts will be using it. I only implementd ${var@P} because it was useful for unit testing the prompt.

@akinomyoga
Copy link
Collaborator

Thank you for continually fixing the problems!

And after that I'm still interested in #663, e.g. providing better APIs. I think one thing you mentioned could be

eval -g -- 'echo $x'   # evaluate in global scope, not local

?? It might make it easier for others to write a line editor.

ble.sh also restores and saves many other shell settings before and after the execution of user commands (including stty, set -exvu -o posix -o emacs -o vi, shopt -s nocasematach, IGNOREEOF, IFS, BASH_REMATCH, FUNCNEST, $?, $_, etc.). Maybe it is better to provide command execution in an independent environment but not in a global context shared with the line editor (like JavaScript object trees in different pages in a browser).

Also, I think we may consider the separation of the execution environment of the line editor and the user commands. ble.sh is completely implemented in a user space, so it provides complete flexibility allowing users to change any part of ble.sh by shell scripts. But this flexibility is a double-edged sword. It is possible to completely break ble.sh session by overwriting the shell functions of ble.sh. Also, ble.sh pollutes the shell variable namespace by defining a bunch of shell variables named _ble_*. Maybe it is useful to provide some mechanism to protect/separate several execution environments as well as a special way of defining shell variables/functions in an environment from another environment.

Note: But ble.sh is not implemented in that way, so the execution environment separation would be only useful for implementing entirely new line editors.


I think anything that has automated tests is a good candidate, i.e. so we can get a PASS/FAIL without having to interactively reproduce it.

There's no rush... I will be busy doing C++ translation for awhile. But I'm interested in more bug reports, etc.

As I have already mentioned a little in #687, I'm now adding tests in lib/test-util.sh. But I noticed that there are many NYI (not yet implemented) features that ble.sh relies on whether or not if they are inevitable in implementing the interactive feature. So, the report from the automated test is likely to become mainly a list of NYIs rather than a list of bugs.

Do you think a long list of NYIs useful for Oil? I have read #701 (comment) saying "Any spec tests that reveal a difference between Oil and other shells are accepted", but I'm hesitating because most of them are unlikely to be supported in Oil in the future. Or, maybe it doesn't matter if we confine such spec tests in ble-idioms?

@andychu
Copy link
Contributor Author

andychu commented Apr 8, 2020

Maybe it is better to provide command execution in an independent environment but not in a global context shared with the line editor (like JavaScript object trees in different pages in a browser).

Yes! Absolutely. Oil can do this, and that's one of the main differences vs. other shells. The interpreter is re-entrant and you could even run multiple isolated interpreters in multiple threads, and remove the I/O from some of them.

We will have think of a good API for spawning another interpreter. Because ble.sh stlil has to access some of the original interpreters settings I think -- it's not completely isolated


How about putting all the NYI features here? I'm curious what is left. I think bind is a big one that I'm not sure about, and traps are a big deal, but otherwise I think we handled most of them pretty easily.

/~https://github.com/oilshell/oil/wiki/Running-ble.sh-With-Oil

I think it ble.sh uses a bash feature, we should try to either implement it or provide some alternative. It might take a long time to do it, but maybe we will get some help along the way too from other people interested.

@andychu
Copy link
Contributor Author

andychu commented Apr 8, 2020

Also feel free to start spec/ble-features if you want as well. We can move them into other files later.

As I said I don't mind if they are failing -- it's better to know about it than not to know about it. It might not be done for many months but that's OK. I hope other people will be interested and help.

@andychu
Copy link
Contributor Author

andychu commented Apr 8, 2020

"Any spec tests that reveal a difference between Oil and other shells are accepted", but I'm hesitating because most of them are unlikely to be supported in Oil in the future. Or, maybe it doesn't matter if we confine such spec tests in ble-idioms?

About this point, I guess I would say there are different categories:

  1. Oil implements a feature, but it behaves differently than bash or other shells. This is either:
    • a bug tagged "divergence" or
    • a known difference that should be documented.
  2. Oil doesn't implement something that a program like ble.sh uses. Either:
    • It should be Oil. Spec tests help decide what the exact behavior is, which is often not clear.
      • note: we should have spec tests even if we don't have time to implement it right now. The tests help in a lot of ways including design.
    • It shouldn't be in Oil, maybe because the feature is broken in bash (although it's hard to think of that many of these -- usually we implement something roughly compatible with usage.)
      • Oil should provide some alternative.
      • Or hide it behind an option like shopt -s eval_unsafe_arith.

So I guess if you make a short list of what's NYI (e.g. on the wiki or here), I can probably give you an idea of whether I think Oil should implement it. I think most things Oil should implement, even though we may not have time to do it right now.

I guess let is one category of thing where I think there's a trivial rewrite. But that just means it's lower priority, not necessarily that it will never be done.

A feature definitely can't be done until there's a spec test since I won't know how it behaves. In the case of let I'm pretty sure I know how it works though, but I could be convinced otherwise. I can't think of any case where let can't be replaced by eval easily. (And to me the code is clearer that way too.)

@akinomyoga
Copy link
Collaborator

akinomyoga commented Apr 10, 2020

Thank you! I tested about 40% of src/util.sh. I wrote that the list is mainly NYI, but it turned out there are still many bugs. First I list up NYI and bugs here, and then I think I summarize them in Wiki after some discussion if any.

26. NYI: Dynamic unset

When unset is used for the variables not defined in the current (function) scope, unset becomes dynamical, i.e., unset removes the "cell" found first in the call stack rather than set the Undef value. Note that unset becomes static when used for the locally defined variables, i.e., unset set Undef.

$ cat test.sh
function unlocal { unset "$@"; }
function check4 {
  hello=global

  local hello=local
  echo $hello

  unlocal hello
  echo $hello
}
check4

$ bash test.sh
local
global
$
$ osh test.sh
local

$

27. NYI: flags rx in ${var@a}

28. NYI: declare -iluc and flags iluc in ${var@a}

29. BUG: Cannot parse ${arr[@]::}

[ Note: omitted number means 0 in Bash, so ${arr[@]::} is equivalent to ${arr[@]:0:0} ]

$ bash -c 'a=(1 2 3); echo ${a[@]::}'

$ osh -c 'a=(1 2 3); echo ${a[@]::}'
  a=(1 2 3); echo ${a[@]::}
                          ^
[ -c flag ]:1: Token can't be used in prefix position

30. BUG: ${arr[@]::0} prints all the elements

$ bash -c 'a=(1 2 3); echo ${a[@]::0}'

$ osh -c 'a=(1 2 3); echo ${a[@]::0}'
1 2 3

31. COMPAT: shopt for $ARRAY? ★

There are many places in ble.sh where the first element of an array ${ARRAY[0]} is accessed through $ARRAY. In particular, such usage is common in the case in which the first element of the array contains a main value, and the other elements contain additional information just like the case of $FUNCNAME, $BASH_LINENO and $BASH_SOURCE. Also, there is a bug in Bash 3.1 that ${#ARRAY[0]} counts the number of bytes rather than characters, so ble.sh uses ${#ARRAY} as a workaround.

32. NYI: LC_CTYPE (or Binary manipulations)

Oil does not support character encoding specified by LC_CTYPE.
In particular, ble.sh uses LC_CTYPE=C for binary manipulations.

$ bash -c 'v=α; echo "nchar=${#v}"; LC_CTYPE=C; echo "nbyte=${#v}"'
nchar=1
nbyte=2
$ osh -c 'v=α; echo "nchar=${#v}"; LC_CTYPE=C; echo "nbyte=${#v}"'
nchar=1
nbyte=1

33. BUG: Redirection of 2 is persistent after : 2>/dev/null >&30

$ bash -c ': 2>/dev/null >&30; echo hello >&2'
hello
$ osh -c ': 2>/dev/null >&30; echo hello >&2'
$ osh -c ': 2>/dev/null >&30; ls -la /proc/$$/fd/2'
l-wx------. 1 murase murase 64 2020-04-10 05:55:18 /proc/29290/fd/2 -> /dev/null
$

34. NYI: $BASHPID

ble.sh wants an ID for subshells.

35. COMPAT: Oil session closed by exec non-existent-command

The behavior diverges among shells. bash, yash, fish, tcsh and csh are not closed by exec failure. zsh, mksh, ksh, dash, busybox, posh are closed by exec failure.

bash$ exec 123
bash: exec: 123: not found
bash$
<!-- Still in the interactive session of Bash -->
bash$ echo $?
127

osh$ exec 123
  exec 123
       ^~~
[ interactive ]:1: exec: '123' not found
<!-- Here the interactive session of Oil closes -->

36. BUG: read fails on empty lines ★★

This is serious. The widely-used idiom while read -r line; do ... done cannot process files that have empty lines.

$ bash -c 'echo | read; echo $?'
0
$ osh -c 'echo | read; echo $?'
1

37. BUG: {fd}>&- does not close the fd

This bug is introduced by 536f350.

osh$ exec {fd}>/dev/null
osh$ ls -la /proc/$$/fd/$fd
l-wx------. 1 murase murase 64 2020-04-10 10:31:52 /proc/7702/fd/100 -> /dev/null
osh$ exec {fd}>&-
osh$ ls -la /proc/$$/fd/$fd
l-wx------. 1 murase murase 64 2020-04-10 10:31:52 /proc/7702/fd/100 -> /dev/null

@andychu
Copy link
Contributor Author

andychu commented Apr 10, 2020

Great thanks for the testing. I will look at all of them but here are some quick comments:

  1. rx in ${var@a} That should be fixed now with 14548dc
  2. I think this is bug ${@:0:1} evaluates to ${@:0} #688 which I just fixed ?
  3. A compatibility option is possible but there are some other bugs in this area I want to fix first, and that code is getting messy, so some refactoring like consider an IR for words and ${} #604 may help

@andychu
Copy link
Contributor Author

andychu commented Apr 14, 2020

OK I fixed #705, thanks for the reports :)

@andychu
Copy link
Contributor Author

andychu commented Apr 14, 2020

Summarizing.

Redirect bugs

  • 33. Redirection of 2 is persistent after : 2>/dev/null >&30
  • 36. {fd}>&- does not close the fd

NYI

Comment on 35, I believe this is POSIX behavior for "special builtin". bash may implement under set -o posix. set is also a special builtin, so see:

$ bash -c 'set -z; echo $?'
bash: line 0: set: -z: invalid option
set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...]
2

$ bash -o posix -c 'set -z; echo $?'
bash: line 0: set: -z: invalid option
set: usage: set [-abefhkmnptuvxBCHP] [-o option-name] [--] [arg ...]
<does NOT continue and print exit code>

@andychu
Copy link
Contributor Author

andychu commented Apr 14, 2020

For #26, the unset behavior, is ble.sh relying on this? There are two different behavior in shells:

  • bash, mksh, yash behave the same. They remove the "cell"
  • dash/ash, zsh, and currently osh. The cell is set to Undef.

I'm not sure if POSIX says anything about which behavior is right...

@akinomyoga
Copy link
Collaborator

akinomyoga commented Apr 15, 2020

Thank you for the comments! I moved the list to the Wiki page.

Re: 27. NYI: flags rx in ${var@a}

  1. rx in ${var@a} That should be fixed now with 14548dc

Thank you for the fix! It actually solved the problem. I'm sorry that I forgot to re-check it with the latest commit.

Re: 30. BUG: ${arr[@]::0} prints all the elements

  1. I think this is bug ${@:0:1} evaluates to ${@:0} #688 which I just fixed ?

I guess this is the comment for 29 and 30. It doesn't fix the problem of either 29 or 30. I confirmed again with the latest commit e401363.

Re: 35. COMPAT: exec non-existent-command

Comment on 35, I believe this is POSIX behavior for "special builtin".

Thank you for the information. I found that bash exits with exec failure in non-interactive shells while it does not in interactive shells. I tried -o posix options, but it did not change the behavior.

# Non-interactive shells: Bash exits

$ bash -c 'exec 1234; echo hello'
bash: line 0: exec: 1234: not found
$ bash -o posix -c 'exec 1234; echo hello'
bash: line 0: exec: 1234: not found

# Interactive shells: Bash doesn't exit

$ bash --norc
$ exec 1234; echo hello
bash: exec: 1234: not found
hello
$ set -o posix
$ exec 1234; echo hello
bash: exec: 1234: not found
hello

Also, yash aiming to be compatible with POSIX does not exit with exec failure in interactive shells. I looked into the POSIX, but it appears that there are two conflicting descriptions. Anyway, it is annoying that a tiny typo can cause the exit of interactive sessions, which I think is the reason why the interactive shells doesn't exit with errors of special builtins.

POSIX XCU 2.8.1 Consequences of Shell Errors (I extracted related lines of the table. Boldfaces are by me)

Certain errors shall cause the shell to write a diagnostic message to standard error and exit as shown in the following table:

Error Non-InteractiveShell Interactive Shell Shell DiagnosticMessage Required
Special built-in utility error shall exit shall not exit no1
Other utility (not a specialbuilt-in) error shall not exit shall not exit no2
Command not found may exit shall not exit yes

POSIX XCU 2.14 Special Built-In Utilities - exec - EXIT STATUS (Boldface is by me)

EXIT STATUS

If command is specified, exec shall not return to the shell; rather, the exit status of the process shall be the exit status of the program implementing command, which overlaid the shell. If command is not found, the exit status shall be 127. If command is found, but it is not an executable utility, the exit status shall be 126. If a redirection error occurs (see Consequences of Shell Errors), the shell shall exit with a value in the range 1-125. Otherwise, exec shall return a zero exit status.

Re: 26. Dynamic unset

For #26, the unset behavior, is ble.sh relying on this?

Yes. It uses this in several places. Also, the Bash idiom Freddy Vulto's upvar trick relies on this behavior. bash-completion uses this trick.

There are two different behavior in shells:

Actually, the situation is more complicated. The behavior of Bash depends on whether unset is applied to local-scope variables (variables defined in the same function) or dynamic-scope variables (variables defined in the contexts of callers). Dynamic in the following table means to remove the cell, and static to set Undef.

Shell Local-scope variables Dynamic-scope variables
mksh, yash Dynamic Dynamic
bash Static Dynamic
dash/ash, zsh, bash (shopt -s localvar_unset) Static Static
  • Note: shopt -s localvar_unset is available from Bash 5.0.

I'm not sure if POSIX says anything about which behavior is right...

POSIX doesn't define local variables, so it doesn't define the behavior of unset for local variables.

andychu pushed a commit that referenced this issue Apr 15, 2020
Implements a feature mentioned in #653 (ble.sh).
andychu pushed a commit that referenced this issue Apr 15, 2020
It matches bash, mksh, and yash now.  Instead of dash/ash or zsh.

Addresses an issue in #653 (ble.sh).

However this makes some tests in test/spec.sh assign fail.  See the "if
0".
@andychu
Copy link
Contributor Author

andychu commented Apr 15, 2020

  • I implemented $BASHPID.
  • I tried out the different behavior for unset in the last commit. See the if 0 line.
    • Right now Oil sets the cell to Undef.
    • I tried out the behavior of deleting the cell -- just a one line change -- and it makes Oil behave like bash and zsh for test case 24 in spec/builtin-vars.

However this also makes some other tests fail -- ones that I added based on a report by @mgree last year. It fixed a bug with "temp bindings":

FOO=bar func   # This creates another frame and you can unset it?

I don't remember the details right now though... But it is even more complicated than local scope and dynamic scope, because you also have these "temp bindings" to worry about. Can they be unset and can they shadow vars higher in the stack?

Dynamic in the following table means to remove the cell, and static to set Undef.

This statement and the table below doesn't exactly match what I observe... maybe you can add some test cases below #24 in spec/builtin-vars.test.sh ?

Is the bash 5.0 behavior with localvar_unset OK? I would rather keep it simple rather than have too many special cases, but still compatible. It's hard to write documentation when there are too many special cases.

I will look at it more later but having a test case for exactly what's used in ble.sh will help. Does the behavior as of the last commit make it work? If so then I just have to remember what the other test cases were for.

I think I made it match dash because it was a simpler rule to implement, and it wasn't for running any real code.

@andychu
Copy link
Contributor Author

andychu commented Apr 15, 2020

This is the issue from Michael Greenberg, who is working on POSIX shell (Smoosh, the executable formal semantics). So even though POSIX doesn't have locals, it has temp bindings, and this also affects unset!

#329

I didn't refresh my memory of this fully yet, but I think we can work it out between these two issues ... this is why we have the test cases test/spec.sh assign -r 24-27, noted in the commit

aac8712


Looking at case 24 only, I think I went with that behavior because only dash and zsh agree. Every other shell disagrees to an extent.

Since people rely on the bash idiom, it's probably better to change it to be closer to that, but I'm not sure exactly how. So yeah having the exact test cases for ble.sh will help, e.g. in spec/ble-idioms.tset.sh.

@andychu
Copy link
Contributor Author

andychu commented Apr 17, 2020

BTW the case from ble.sh is test/spec.sh builtin-vars -r 24

The other ones are test/spec.sh assign -r 24-27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants