Historically, automatic analysis techniques for parallel programs focused on the analysis of models of systems.
A programmer wishing to use such a tool would either start by creating a model
of the system (in the specification step of the development), and then provide
an executable implementation for this model or, if they already had a working
product, they would have to create a model to analyse it.
Such tools include for example SPIN~\mcite{Holzmann1997}, older versions of DIVINE~\mcite{BBCS05}, LTSmin~\mcite{Kant2015} and BMC~\mcite{Biere1999}.
The model-based approach requires an extra investment in the modelling phase
and, even if analysis of the model concludes it is correct, it does not prove
that the final product is indeed correct.
Later, with improvements in analysis techniques, as well as overall
improvements in available computing power, analysis tools for programs written
in mainstream programming languages became available.
Early examples of such tools are Java Pathfinder~\mcite{Visser2003} (an explicit-state model checker for Java) and CBMC~\mcite{Clarke2004} (a SAT-based bounded model checker for C).
Since 2012, the Software Verification Competition (SV-COMP) \mcite{Beyer2020svc}
has showcased tools that support direct verification of software written in
C and, lately, also in Java.
While it mostly features sequential programs, SV-COMP also includes a
subcategory for parallel C programs.
We will focus in more detail on automatic techniques for the verification of
parallel programs.
We will not consider program analysis techniques which require substantial manual effort (e.g., proof-assistant-based techniques), or techniques which are not applicable to realistic programs (e.g., techniques which use a modelling language).
We will also mostly disregard techniques with no support for parallel programs.
Finally, we are primarily concerned with techniques that were implemented and evaluated -- the existence of a tool that can handle a programming language (as opposed to a modelling language) can be seen as a witness of maturity of a technique.
\section{Explicit-State Model Checking}
Explicit-state model checking is based on an exhaustive exploration of the
state-space graph.
It checks that a given (finite-state) system satisfies a given property.
The property is often provided by an LTL formula and the automata-based approach
to LTL verification is used (i.e., the problem is reduced to the problem of
repeated reachability of a state in the state-space graph)~\mcite[\S
5.2]{Baier2008}.
In the particular case of safety properties, it is sufficient to search the
state-space graph for a state which violates the safety property, using
depth-first search or any other graph-search algorithm.
The advantage of explicit-state model checking is that it is conceptually easy
to apply it to verification of parallel programs (under sequential
consistency), as the interleaving semantics of threads naturally gives rise to
the state space graph.
Furthermore, explicit-state model checking does not require the program to terminate; it is sufficient that its state space is finite.\mnote{A finite state space can contain cyclical infinite behaviour -- a loop in the state space.
\medskip
\begin{tikzpicture}[>=latex,>=stealth',auto,node distance=2cm,semithick,initial text=]
\node[state,initial] (t) {$x$};
\node[state] (f) [right of = t] {$\lnot x$};
\path[->, shorten >=1pt]
(t) edge[bend left] (f)
(f) edge[bend left] (t)
;
\end{tikzpicture}
}
\paragraph{State Space Explosion \& Reduction Techniques}
In practice, explicit-state model checking is prone to the \emph{state-space
explosion} problem: the number of states in the state-space graph of a
reasonably sized system can easily be so large that it is impossible to store
the state space in available memory or to explore it in a reasonable time.
Since the algorithm which explores the state-space graph needs to detect which
states were already seen to ensure termination (and to check for LTL
properties), available memory is often the limiting factor, at least without
advanced state space reduction techniques.
Several state-space reduction techniques were introduced to mitigate the state-space explosion.
Using these techniques, it is possible to explore only some of the states of
the state space in such a way that the property holds for these states if and
only if it holds for the entire state space.
One of these techniques is \emph{Partial Order Reduction} (POR) which
can eliminate some states by exploring independent events only in one
particular order \mcite{Peled1993,Godefroid1996partial}.
Another wide family of reductions are techniques that can coalesce a path in a
state space into a single edge and hide all intermediate states.
Lipton introduced an early example of this idea in \mcite{Lipton1975}.
He used the notion of right movers (``resource acquire operations'') and
left movers (``resource release operations'') to identify statements in a
program which can be executed atomically.
However, in the analysis of realistic parallel programs, a notion of instruction (or action) visibility is often used.
A group of instructions from one thread can
be executed atomically, provided that at most one of them is observable by the
other threads, and that this grouping does not interfere with checking of the
verified property or termination of the search.
Both static and dynamic reduction methods were proposed, under many names,
including \emph{D-reduction} \mcite{Lipton1975}, \emph{path reduction}
\mcite{Yorav2004}, \emph{$\tau$ reduction} \mcite{BBR2012} for the static
variants and \emph{$\tau+$ reduction} \mcite{RBB13} and \mcite[Section
6]{RSCB2018} for the dynamic ones.
These reductions are also often used without naming the technique, for example in
Java Pathfinder~\mcite{Visser2003}.
Quite naturally, the dynamic methods are better suited for the complex control-
and data-flow of realistic programs.
Both partial order reductions and path reductions are also often used in tools
based on other principles than explicit-state exploration.
Additionally, there are variants of \emph{symmetry reduction} that reduce the
state space by coalescing states which differ only in properties not relevant
for the program analysis \mcite{Clarke1998}.
For example, using the \emph{heap-symmetry} reduction, two states that differ
only in the order in which memory objects were allocated can be considered
equal \mcite{RBB13}, \mcite[Chapter 6]{RSCB2018}.
A dead-variable elimination \mcite{Yorav2004} can also be seen as an instance
of symmetry reduction.
These state-space reduction techniques can reduce the number of states by
several orders of magnitude \mcite{RBB13}, and therefore enable verification of
realistic (but still relatively small) programs.
Interestingly, the proposed reduction techniques which aim to preserve verified
properties are not always correct. For example, the original version of the
$\tau+$ reduction considered only stores to be visible, but it was later shown
that repeated loads also have to be visible (shown by me in \mcite{S2016},
fixed in \mcite[Section 6]{RSCB2018}).\begin{marginnote}%
Consider a program with threads \texttt{t0} and \texttt{t1}:
\medskip
\begin{cppcode}
int x = 0;
void t0() {
x = 1;
}
void t1() {
int a = x;
assert(a == x);
}
\end{cppcode}
\smallskip
If the two loads in \texttt{t1} are not considered visible, they are coalesced,
and the assertion violation is missed.
\end{marginnote}
%
A similar problem was present in \mcite{Cordeiro2011}, where ESBMC could
perform two conditional jumps that read the same shared variable with no
context switch between them (Section 3.1, rule R3).
Interestingly, the authors noticed the possibility of missing context switches
but only introduced an option to fix it, leaving the problematic behaviour as
the default.
To further improve the capabilities of explicit-state model checking, several
techniques for memory-efficient representation of the set of visited states
were introduced.
These techniques include \emph{hash compaction} and \emph{bitstate
hashing} \mcite{Holzmann1998}, which are incomplete techniques that store
hashes of states instead of the entire states (and can therefore omit
some parts of the state space if a hash collision occurs),
\emph{conditional and external storage of states} \mcite{Hammer2007}, and
\emph{lossless compression techniques} \mcite{RSB15TC,Laarman2019}.
\paragraph{Data Nondeterminism}
While explicit-state model checking can easily represent control-flow
nondeterminism, it is not well suited for data nondeterminism, as it is not
practical (or even possible) to explicitly enumerate all possible values of
data domains.
Therefore, if data nondeterminism is required, explicit-state model checking
needs to be combined with some technique for symbolic or abstract data
representation \mcite{Pasareanu2013,MBLB2016}.
\paragraph{Tools}
The pioneering tool is the SPIN LTL model checker
\mcite{Holzmann1997,Holzmann2004}.
SPIN has very limited support for the analysis of realistic programs.
It targets the parallel modelling language PROMELA, which can embed C
code to define atomic steps.
In~\mcite{Zaks2008}, SPIN was extended with partial support for C.
Even so, this version still needs a test driver written in PROMELA.
Java Pathfinder (JPF)~\mcite{Visser2003,Artho2019} is an explicit-state model
checker (with symbolic extensions) for Java and other JVM-based\mnote{Java does
not execute directly on the target hardware; instead, it is compiled to an
intermediate language which is executed by the JVM.} languages (e.g., Scala, Kotlin).
From its beginning, JPF has targeted parallel Java programs.
It can check for safety properties, namely uncaught exceptions, which in Java also subsume assertion and bounds checking.
To reduce the state space, JPF uses hash-compaction (and therefore under-approximates all possible behaviours), symmetry reduction (with respect to class loading order and heap symmetry), optionally predicate abstraction (with user-provided predicates), partial order reduction, and groups instructions with local effects.
Interestingly, the instruction grouping in JPF uses a heuristic which can cause it to miss some behaviours.
JPF also has models of parts of the Java standard library (including limited IO
support) and limited support for execution of native code from Java
code.\mnote{Java programs can use the Java Native Interface (JNI) to interface
with native code, for example, to access operating system primitives
directly.}
In addition to scheduling nondeterminism, JPF can use explicit choice as an
additional source of nondeterminism.
The symbolic extension, Symbolic Pathfinder (SPF) \mcite{Pasareanu2013,Noller2019} adds
support for symbolic data representation using symbolic execution.
It supports test generation and detection of assertions and errors related to parallel execution.
It primarily targets unit tests and sub-system level testing and can use unit
preconditions and combine symbolic and explicit execution.
To explore different possible interleavings, SPF uses the explicit JPF without
state comparison and with depth bound to ensure termination (the state
comparison can be enabled, but it disregards the symbolic data and therefore
can miss many behaviours).
SPF can handle some instances of dynamically allocated
linked symbolic data by \emph{lazy initialisation} -- when a symbolic pointer
is accessed, it is expanded to either a null pointer or a new node of
symbolic data \mcite{Khurshid2003}.
SPF has support for symbolic arrays \mcite{Fromherz2017} and symbolic strings \mcite{Bang2016}.
An unnamed explicit-state model checker for C\# is presented in \mcite{Huynh2006}.
It targets a subset of .NET bytecode (which is an intermediate representation into which C\# and other .NET languages are translated), and it analyses parallel programs running under the .NET relaxed memory model.
It can be used to detect behaviour that differs from any possible sequentially consistent behaviour and to insert memory barriers to recover sequential consistency.
% The implementation of the exploration algorithm uses a list of delayed instructions to implement instruction reordering.
DIVINE \mcite{DIVINEToolPaper2017} is an explicit-state model checker developed
by our research group.
Historically, it targeted several modelling languages for parallel systems and
LTL verification using parallel and distributed algorithms, but later it
shifted towards the analysis of C and C++ using the LLVM intermediate representation.
While it now aims primarily at verification of safety properties (assertion
violations, memory access safety, detection of use of undefined variables,
detection of memory leaks, numeric manipulation errors, and deadlock-freedom for
the POSIX mutexes), it also has limited support for LTL and an
extension for detection of nontermination in parallel programs
(\autoref{chap:lnterm}, \mcite{SB2019}).
To tackle realistic programs, DIVINE uses a dynamic detection of invisible
actions ($\tau+$ reduction) and an efficient representation of program memory
which facilitates heap-symmetry reduction and state-space
compression \mcite{RSCB2018}.
DIVINE also supports symbolic and abstract data representation using program
transformations \mcite{LRB2018}.
Currently, it supports symbolic bitvector manipulations for both integral and floating-point data types and symbolic string representation \mcite{CLOR2019}.
With the symbolic data representation using bitvectors, DIVINE uses SMT solvers
to check for feasibility of traces and to compare symbolic states -- this way
it retains the ability to join states that are semantically equivalent even if
the symbolic data are represented by a different formula.
The state comparison allows DIVINE with symbolic data to ensure termination for
programs with finite symbolic state space, and it also enables checking LTL
properties.
One of the main goals of DIVINE is to support verification of C and C++
programs which use existing libraries.
To this end, DIVINE has almost complete standard C and C++ libraries (as of
C++17), the POSIX thread library (\texttt{pthreads}), and it also supports C++
exceptions (\autoref{chap:lang}, \mcite{SRB2017}).
To reflect the behaviour of parallel programs on contemporary hardware, DIVINE
has support to analyse programs with respect to the \xtso memory
model (\autoref{chap:mm}, \mcite{SB2018x86tso}).
DIVINE also has an in-built compiler based on LLVM's clang and supports
significant parts of POSIX file system and process APIs \mcite{RBMKB2019}.
SymDIVINE \mcite{MBLB2016} was another explicit-state model checker from our research group.
It combines explicit control flow handling with a symbolic representation of data
using bitvectors (with symbolic state equality).
SymDIVINE targets safety and LTL properties in C programs.
This tool is now discontinued in favour of the aforementioned symbolic data support
in DIVINE.
% LTSmin~\cite{Kant2015}
% - probabilistic, timed systems
% - multi-core LTL model checker, partial order reduction
% - multi-core symbolic checking for $\mu$-calculus
% - multiple modelling formalism, but no realistic programming languages
% - state space compression
% - paralell LTL search
\section{Stateless Model Checking}\label{sec:stateoftheart:smc}
Compared to explicit-state model checking, \emph{stateless model checking}
(SMC) avoids storing the set of visited states and therefore has decreased
memory consumption.
Furthermore, since the state representation is not required to be as compact as
possible, a stateless model checker can have a simpler representation of states.
Stateless model checking was introduced in \mcite{Godefroid1997}, and it aims
at safety analysis of terminating realistic parallel programs.
A stateless model checker usually explores the state space in a depth-first
manner, and it can explore some parts of the state space multiple times (since
it does not store the set of visited states).
Therefore, the requirement that the input program terminates is necessary to
ensure the analysis terminates.
In practice, this requirement is often ensured by imposing loop iteration
bounds.
If the program under test requires more iterations of a loop, the loop bound
can be increased, or the analysis can be terminated as inconclusive.
Stateless model checking is also sometimes presented under names like \emph{systematic concurrency testing} \mcite{Christakis2013}.
\paragraph{State Space Reductions}
Without additional state space reductions, SMC would lead to redundant
explorations of many parts of the state space.
Indeed, in parallel programs, it is common that two or more actions of
different threads are independent, and regardless of their order, they lead to
the same end state.
In this case, a stateless model checker would explore a state as many times as
there are paths from the initial state to this state (in the worst case, the
number of paths to a given state can be exponential in its distance from
the initial state).
To mitigate this problem, \emph{dynamic partial order reduction}
(\emph{DPOR}) \mcite{Flanagan2005dpor} is often employed with SMC.
DPOR is a version of partial order reduction that tightly integrates with
the SMC exploration algorithm and keeps track of parts of the state space which
still need to be explored.
Using DPOR, SMC can avoid redundant exploration of equivalent paths in the
state space.
Many works are concerned with the design of efficient DPOR methods both for
parallel programs running under sequential consistency (interleaving
semantics) and for various relaxed memory models.
This is usually accomplished by a combination of two ingredients: an equivalence
of traces and an exploration algorithm which ensures that at least one trace
(in the optimal case, exactly one trace) from each equivalence class is explored.
The trace equivalence has to be designed in such a way that for each of its
classes either all traces contain only safe states, or all traces contain an
unsafe state -- i.e., the equivalence preserves safety properties.
Over the years, multiple DPOR techniques were introduced
\mcite{Flanagan2005dpor,Sen2007,Tasharofi2012,Saarikivi2012,Abdulla2014,Zhang2015,Chalupa2017,Abdulla2018,Aronis2018,Abdulla2019}.
Among these techniques, \mcite{Abdulla2014} is interesting since it provides an optimal algorithm for equivalence based on \emph{Mazurkiewicz traces} \mcite{Mazurkiewicz1987} (where traces are considered equivalent if one of them can be obtained from the other by swapping adjacent non-conflicting execution steps).
The optimal DPOR presented by \mcite{Abdulla2014} is optimal for the
given trace equivalence (i.e., it explores exactly one execution in each equivalence class of the used trace equivalence).
In recent years, several works have explored coarser equivalences and optimal algorithms for them.
In \mcite{Chalupa2017}, the authors use a notion of observation of write operations and consider traces to be equivalent if the corresponding reads observe the same writes in both traces.
However, their algorithm is optimal only for programs in which the graph of inter-thread communication is acyclic.
A similar notion of observation is used in \mcite{Aronis2018}, in this case to only consider write events as interfering if at least one of them can be observed later.
In \mcite{Abdulla2018} the authors use a notion of reads-from equivalence (i.e., trace equivalence based on program order and reads-from relation which connects reads to writes which produced the read value) for analysis of a fragment of the C11 memory model that contains only release and acquire memory orders.
The advantage of this trace equivalence is that it does not distinguish traces which differ in the order of unobserved writes.
The authors argue that working with this equivalence is easier for release-acquire ordering than for sequential consistency due to the complexity of checking if a reads-from relation can correspond to a run of a program.
The same trace equivalence is used in \mcite{Abdulla2019} for sequentially-consistent programs.
To avoid the need for expensive (NP-complete) checks for consistency of reads-from relations, the authors first use two incomplete but polynomial algorithms -- one which can show that a given relation is consistent and one which can show that it is inconsistent -- before resorting to the expensive check, which is exponential in the number of threads.
A somewhat different modification of DPOR is presented in \mcite{Albert2017cspor}.
It uses a notion of context-sensitive independence to improve on previous DPOR techniques.
In essence, it considers two actions $a$ and $b$ independent in a given state if they can be executed in both orders, $ab$ and $ba$, and both executions lead to the same final state.
This technique uses a (local) state comparison and therefore is not entirely stateless.
This approach was later combined with observers to achieve further reduction improvements \mcite{Albert2019}.
Another reduction approach based on observation of read and written values is
\emph{Maximal Causality Reduction} (\emph{MCR}) \mcite{Huang2015}, which can be
seen as an alternative to DPOR.
MCR employs an SMT solver to find new traces to explore, which allows it to
explore fewer interleavings than Mazurkiewicz-trace-based DPOR techniques.
Each time a new trace is found, it is guaranteed that at least one read will read a different value than on the already-observed traces.
An advantage of MCR is that it can be easily modified for parallel
exploration (DPOR cannot be easily executed in parallel).
MCR was also applied to the TSO and PSO memory models \mcite{Huang2016}.
\paragraph{Data Nondeterminism}
Most works concerning stateless model checking with DPOR expect that thread
scheduling is the only source of nondeterminism in the system \mcite{Flanagan2005dpor}.
However, there are combinations of SMC with other techniques that can handle
nondeterministic data.
In \mcite{Saarikivi2012}, a combination of DPOR with concolic execution is presented.
Another combination of concolic execution with SMC is presented in \mcite{Sen2007}; this one uses a different approach to DPOR than \mcite{Flanagan2005dpor}.
\paragraph{Tools and Techniques}
VeriSoft \mcite{Godefroid1997,Godefroid2005verisoft} is the pioneering tool of stateless model checking.
It aims at safety verification of realistic parallel programs, primarily in C and C++, but also in other programming languages -- VeriSoft works with compiled executable programs and uses a custom scheduler to explore (bounded) runs of these programs.
To limit the re-exploration of states and state-space size, VeriSoft allows scheduling only on visible operations and uses partial order reduction.
jCUTE \mcite{Sen2007} is a tool that combines concolic execution and stateless model checking for Java programs, and therefore can handle both parallelism and data nondeterminism.
To explore all possible behaviours of parallel programs, jCUTE detects data races (concurrent access of two threads to the same location, at least one of which is a write or a lock operation) and rearranges schedules which led to them.
% - assumes locks are disjoint from R/W memory locations
CHESS \mcite{Musuvathi2008} uses a custom scheduler to drive execution of compiled executable programs written mainly in C, C++ and .NET languages such as C\#.
It uses binary instrumentation to control programs' scheduling and record order of events.
To limit state space explosion, it explores runs with fewer context switches first and uses a cache of happens-before graphs of events to avoid redundant explorations (and therefore is not entirely stateless).
To handle the environment of the operating system better, CHESS can record and relay environmental values such as current time, process identifiers and output of the platform's random number generators.
However, the user must ensure that the other parts of the environment (e.g., file system, network) are used in such a way that the program executes the same way if the same thread scheduling is repeated.
The authors argue that this approach is practical for most unit tests.
LCT \mcite{Saarikivi2012,Kahkonen2013} is another tool which combines concolic execution and SMC to handle parallel Java programs with data nondeterminism.
It uses program instrumentation to add symbolic operations where needed and allows scheduling to happen only on visible actions.
To speed up the analysis, LCT can run multiple executions of the program with different inputs in parallel and distribute them over the network.
SATCheck \mcite{Demsky2015} is a stateless model checker for C programs with support for sequential consistency and TSO.
It records dependencies between program events and uses a SAT solver to reorder the events of a concrete run to obtain new interleavings with new behaviour.
rInspect \mcite{Zhang2015} is an LLVM-based stateless model checker for C
programs running under the TSO and PSO relaxed memory models.
It uses store buffers and shadow threads that flush values from store buffers
to the main memory.
Store buffers can be optionally bounded.
The shadow thread approach allows it to use existing DPOR techniques designed
for sequential consistency.
The maximal causality reduction (MCR) is implemented in an unnamed tool for analysis of Java programs \mcite{Huang2015,Huang2016}.
It supports both sequential consistency and the TSO and PSO relaxed memory models.
CDSChecker \mcite{Norris2016} is a stateless model checker for C and C++ with support for the C11/C++11 memory model (with the exception of release-consume synchronisation and out-of-thin-air values, which the C11/C++11 memory model allows but the standard discourages).
It can simulate load speculation and delays by forwarding stored values to previous loads and validating this speculation.
CDSChecker primarily aims to help with unit testing concurrent data structures.
Concuerror \mcite{Christakis2013,Abdulla2014,Aronis2018} is an SMC for Erlang programs.
In Erlang, the primary way of communication between processes is message passing.
Concuerror uses context bounding and contains an implementation of the optimal DPOR for Mazurkiewicz traces and optimal DPOR based on observers.
Nidhugg \mcite{Abdulla2016,Abdulla2017tso,Aronis2018,Abdulla2019}, is a
stateless model checker that focuses on C programs running under relaxed memory
models.
It has support for sequential consistency, TSO, PSO, POWER, and partially ARM
memory models.
For TSO and PSO, Nidhugg uses the optimal DPOR algorithm from \mcite{Abdulla2014} to explore exactly one execution from each equivalence class given by \emph{chronological traces} which capture dependencies between memory operations.
The combination of chronological traces and optimal DPOR algorithm means that
for programs that do not exhibit relaxed behaviour under TSO/PSO, Nidhugg
explores the same number of traces under TSO/PSO as it would explore under
sequential consistency.
For POWER, Nidhugg uses a different algorithm which can perform a redundant exploration of incomplete traces but does not generate redundant complete traces (for the trace equivalence given by Shasha-Snir traces \mcite{Shasha1988}).
Nidhugg also has an option to use a \emph{reads-from}-based trace equivalence for sequential consistency, together with an optimal exploration algorithm for this equivalence.
This equivalence does not distinguish traces that differ only in the order of conflicting writes and can therefore be exponentially coarser.
The corresponding algorithm needs to decide if a given read-from relation is consistent, which is an NP-complete problem.
However, the authors show that the expensive check can be avoided in most cases by use of faster but incomplete checks.
Nidhugg uses LLVM to avoid the cost of direct analysis of C programs.
% \mcite{Abdulla2019}
% - coarser equivalence for SC
% - efficient in practice
% - uses equivalence based on seqeunces of events for each thread and reads-from relation between them
% - checking consistency of a given reads-from relation is NP-complete -- the exploration algorithm first uses two incomplete but faster checks (to detect inconsistency and consistency)
% - on all programs they tesed the incomplete checks were sufficient
RCMC \mcite{Kokologiannakis2017} is a stateless model checker for C and C++ programs running under the RC11 memory model \mcite{Lahav2017} (which is an attempt to formalise and fix the C11 memory model).
Instead of exploring interleavings, the tool uses execution graphs which represent visible program actions and dependencies between them.
Tracer \mcite{Abdulla2018} is a tool for analysis of the release-acquire fragment of the C11/C++11 relaxed memory model.
It can work with C and C++ programs that do not use any other atomic orderings than release and acquire, i.e., it has no support for sequentially consistent or relaxed atomics.
Tracer does not explore all possible orderings of conflicting writes; instead, the equivalence relation at the core of its DPOR implementation is given only by the program order and the reads-from relation, which avoids redundancy.
The proposed exploration algorithm is optimal for traces defined on this equivalence.
In practice, this means that nondeterministic decisions take place at read operations and writes are only recorded, and backtracking is used for cases when a previous read might read from a later executed write.
% - C/C++11 -- replaces implementations of atomic, threads, mutex to drive execution
\section{Symbolic and Concolic Execution}
Symbolic execution was introduced in \mcite{King1976}.
It executes the program with symbolic values instead of concrete ones and tracks the relations between these symbolic values.
When a symbolic value is used in a conditional branch, the symbolic executor can follow one or both of the branches based on the concrete values which are permitted by the symbolic values.
If both of the branches are followed, the corresponding branching condition (or its negation) is added to the \emph{path condition}.
Therefore, the path condition collects the constraints required to get to the given code location.
Symbolic executors usually use SAT or SMT solvers to decide which paths can be taken.\mnote{SAT solvers are used for propositional formulas while SMT solvers
can solve first-order formulas that use some theory, for example, the theory of
bitvectors which can be used to precisely model computer integers. The
advantage of SMT solvers is that the encoding is simpler and the solver
can make use of the additional information encoded in the richer theory.}
Unlike previously discussed techniques, symbolic execution mostly targets sequential programs with inputs, i.e., it deals primarily with data nondeterminism and not scheduling nondeterminism.
For this thesis, symbolic execution is notable because symbolic executors often have good support for real-world programs, including programs that use libraries and interact with their environment.
The state space explored by a symbolic executor forms a tree with branches at conditions.
This \emph{symbolic execution tree} can be quite large even for small sequential programs with loops or recursion -- this is the problem known as \emph{path explosion}.\begin{marginnote}%
Consider this code:
\medskip
\begin{cppcode}
unsigned x = input();
while (x > 0) {
do_work(x);
x--;
}
\end{cppcode}
\smallskip
Its symbolic execution tree has as many branches as there are values in the \cpp{unsigned} data type:
\begin{tikzpicture}[node distance=2em,
diag/.style={node distance=3.5em},
al/.style={above left = -0.5em},
ar/.style={above right = -0.5em},
]
\node (i) {$\texttt{x} = \alpha$};
\node[below of = i] (d) {\cpp{if (x > 0)}};
\node[below left of = d, diag] (dl) {$\square$};
\node[below = -0.2em of dl] {\tiny $[\alpha = 0]$};
\node[below right of = d, diag] (dr) {\texttt{do\_work(x);}};
\node[below of = dr] (drd) {\texttt{x-{}-;}};
\node[below of = drd] (drdd) {\cpp{if (x > 0)}};
\node[below left of = drdd, diag] (drddl) {$\square$};
\node[below = -0.2em of drddl] {\tiny $[\alpha = 1]$};
\node[below right of = drdd, diag] (drddr) {\texttt{do\_work(x);}};
\node[below of = drddr] (drddrd) {\texttt{x-{}-;}};
\node[below of = drddrd] (drddrdd) {\cpp{if (x > 0)}};
\node[below left of = drddrdd, diag] (drddrddl) {$\square$};
\node[below = -0.2em of drddrddl] {\tiny $[\alpha = 2]$};
\node[below right of = drddrdd, diag] (drddrddr) {$\vdots$};
\path
(i) edge (d)
(d) edge node[al]{\tiny F} (dl)
(d) edge node[ar]{\tiny T} (dr)
(dr) edge (drd)
(drd) edge (drdd)
(drdd) edge node[al]{\tiny F} (drddl)
(drdd) edge node[ar]{\tiny T} (drddr)
(drddr) edge (drddrd)
(drddrd) edge (drddrdd)
(drddrdd) edge node[al]{\tiny F} (drddrddl)
(drddrdd) edge node[ar]{\tiny T} (drddrddr)
;
\end{tikzpicture}
\end{marginnote}
The symbolic execution tree can be explored in different manners (e.g., depth-first search, random search) depending on the objective of the analysis, for example, if it is desirable to explore all the behaviours of the program, or if some representative sample should be analysed for the purpose of testing.
\paragraph{Test Generation}
Symbolic execution can be used to generate concrete test cases for a program (or a function) -- each time the executor finishes a path (either the program terminates, or an error is found), it can use a solver to generate concrete inputs that reproduce this path in a test case \mcite{Cadar2008}.
This way, it is possible to generate tests which cover the program well and do not execute the same path repeatedly.
Even if the symbolic execution tree is large or infinite, the test generation can use random search or some other heuristics to provide good coverage of the code and therefore improve on manually-written tests.
Furthermore, tests generated from symbolic execution can be very useful for bug fixing, as the programmer can use their debugging tools of choice on the generated test and does not need to learn a verifier-specific way to analyse the counterexample.
\paragraph{Concolic Execution}
\emph{Concolic Execution}, also known as \emph{Concolic Testing}, \emph{Dynamic Symbolic Execution} or \emph{Directed Automated Random Testing}, is a modification of symbolic execution which can eliminate some of the expensive solver queries issued during exploration \mcite{Godefroid2005,Sen2005}.
With concolic execution, the program is executed with concrete values and follows a single run, but the concolic executor records the path condition as in conventional symbolic execution.
When the run finishes, the executor uses the path condition and a solver to find values of input variables that will lead the program to follow a different path.
This is repeated until no more paths can be discovered, or until the given coverage criteria are met.
\paragraph{Tools}
CUTE \mcite{Sen2005,Majumdar2007} is a concolic test generator for sequential C programs that specializes in handling dynamic memory.
It can generate input values for pointers (e.g., linked list nodes) as well as scalars.
CUTE uses approximate constraints for pointers to avoid undecidable theories (pointer constraints may contain pointer (in)equality and checks for \texttt{NULL} pointers).
It detects assertion violations, memory access errors, and certain cases of infinite loops.
Furthermore, CUTE has support for \emph{hybrid concolic testing}, which combines concolic and random testing.
It targets reactive programs and primarily uses randomly generated inputs.
In this mode, concolic input generation is used to avoid getting stuck (i.e., the tool switches to concolic input generation if a certain number of randomized tests fails to improve coverage).
KLEE \mcite{Cadar2008} is a symbolic executor and test generator that targets realistic C and C++ programs that communicate with the environment.
Its goal is to work with unmodified programs (the source code must be available, as the program needs to be compiled to LLVM).
KLEE can check for memory safety and assertion violations.
It uses array theory with a separate array for each memory object to facilitate analysis of programs with dynamic memory.
In KLEE, symbolic pointers are handled by explicit enumeration of all possibilities when the pointer is dereferenced.
To achieve good coverage even when time and the size of the symbolic execution tree do not allow for full exploration, KLEE uses path selection heuristics.
KLEE is notable for its support of realistic programs that communicate with their environment.
It has models of some of the POSIX system calls (mostly related to the file system) and these models can contain symbolic values (e.g., in files).
The system call models are written in C and can be modified without modification of KLEE itself.
Furthermore, the program running in KLEE can also access the real file system, and KLEE can perform fault injection into system calls.
When an error is encountered, KLEE generates a test harness which creates the corresponding files and fills them with values which reproduce the problem.
% (and simulate system failures by replacing appropriate system calls)
To support the C standard library, KLEE uses the uClibc standard library implementation, which is linked to the program under analysis.
Symbiotic~\mcite{Chalupa2018,Chalupa2020} is a tool for analysis of sequential C programs which uses KLEE symbolic executor as a backend.
It can detect assertion violations, memory safety errors, memory leaks, and integral overflows.
Symbiotic uses instrumentation to add pointer checks where needed and uses various analysis techniques (e.g., pointer analysis and shape analysis based on Predator \mcite{Dudka2013}) to limit the number of inserted checks.
% - pointer analysis has to account for deallocation
Furthermore, Symbiotic uses program slicing to remove code irrelevant for the property.
Symbiotic has basic support for the detection of termination and nontermination.
It can detect simple cycles in the program's state space and therefore in some cases decide the program does not terminate (for non-nested loops that preserve values of all variables).
It can also conclude the program terminates if all paths were explored.
Pinaka \mcite{Chaudhary2019} is a symbolic executor for sequential C programs which uses an incremental solver to check path feasibility.
It can run either in a fully incremental mode, where it reuses one solver instance for the whole program at the cost of large formulas, or in a partially incremental mode, which creates a new solver instance on backtracking.
Pinaka may not terminate on nonterminating programs, and it offers an option to limit loop unwinding.
It can be used to show termination -- if a program is found to be safe without any unwinding limit, then all its paths terminate.
JDart \mcite{Mues2020,Luckow2016} is a concolic executor and test generator for sequential Java programs built on Java Pathfinder.
It can detect assertion violations and uncaught exceptions.
JDart attempts to find small values for symbolic variables, which can shorten loops whose iteration counts depend on symbolic values.
Java Ranger \mcite{Sharma2020} is a symbolic executor for Java based on Symbolic Pathfinder.
It can merge paths using summarization of regions with multiple paths (e.g., branches of an \texttt{if} statement). %(\TODO{veritesting})
COASTAL \mcite{Visser2020} combines concolic execution with fuzz testing for Java programs.
% - uses instrumentation
It uses fuzz testing for fast exploration; the fuzzing mode only tracks the direction of branches.
Concolic execution is then used to reach parts of the program's state space that are harder to cover by fuzzing.
COASTAL can explore different parts of the symbolic execution tree in parallel (with different fuzzer or concolic executor instances).
To help with the analysis of realistic programs, COASTAL has models for some of the standard Java classes. % but seems to be able to handle original java bytecode libraries (?)
% - no symbolic arrays
Map2Check \mcite{Rocha2020} is a tool which combines fuzzing, symbolic execution, and inductive invariants for safety checking of sequential C programs.
It is LLVM based and uses KLEE as a backend symbolic executor.
The idea of this combination is that fuzzing is used to find shallow bugs, while symbolic execution finds deep bugs.
Map2Check runs multiple fuzzers in parallel.
\section{Bounded Model Checking \& Other Symbolic Techniques}\label{sec:stateoftheart:bmc}
Bounded model checking (BMC) was introduced in \mcite{Biere1999} as an alternative to
symbolic model checking techniques which used binary decision diagrams.
BMC encodes program runs of fixed length into propositional formulas, or to
formulas over some first-order logic.
It then relies on SAT or SMT solvers to decide whether the formula is satisfiable
or not.
One of the advantages of BMC is that it naturally handles data nondeterminism and can cope with scheduling nondeterminism too.
The original paper introducing BMC presented a verification procedure for LTL
(which detected loops in the program runs by checking equality of the last
state of a bounded run to some of its previous states).
However, for realistic programs, the focus is mainly on safety properties, and
more recently also on (non)termination.\mnote{For tools which use approximation
or heuristics, checking for termination is not the same as checking for
nontermination as the tool might simply return \emph{unknown} as a result if it
fails to prove its objective (i.e., termination or nontermination).}
BMC tools usually build the formula from an SSA form\mnote{Static single assignment; each variable is assigned only at one place in the source code.} of the program with a bounded number $n$ of loop iterations and recursion depth \mcite{Clarke2004}.
The loops and recursive functions are unwound -- repeated $n$ times with a guard before each repetition which allows the program to skip the repetition if the number of iterations is lower than $n$.
Furthermore, \emph{unwinding assertions} are inserted after the last repetition to ensure the unwinding was sufficient.
The result is that all jumps lead forward, which simplifies the creation of the logic formula that describes the transition function of the program.
\paragraph{Parallelism}
We have categorized the existing approaches to parallelism in BMC into three broad categories.
The conceptually simplest is to explicitly enumerate symbolic context-switch
bounded runs of the program and then encode them to resolve symbolic data.
This approach is used by ESBMC \mcite{Cordeiro2011} and by
\mcite{Rabinovitz2005} for an early concurrent extension of CBMC.
Another approach is to encode the control flow of threads separately and add scheduling constraints which specify inter-thread ordering.
This approach is used by CBMC \mcite{Alglave2013po} and also proposed in \mcite{Ganai2008}.
The last approach is sequentialization, which makes use of a BMC for sequential programs and a preprocessing step that converts a parallel program into a nondeterministic sequential program.
Many sequentialization schemes were proposed, differing both in the size of the encoding and the way they limit thread interaction (e.g., limit to the number of context switches, to the number of shared-variable interactions) \mcite{Qadeer2004,Lal2009,Inverso2015,Tomasco2016,Tomasco2017}.
\paragraph{Path Bounding}
The main limitation of BMC is that it explores only runs up to some length
bound $k$.
Therefore, BMC alone cannot prove correctness unless it shows that the loop
bound was sufficient (for example, by showing unwinding assertions are not
violated).
To mitigate this limitation, BMC can be combined with other techniques which
aim to prove correctness of the program, such as $k$-induction or
IC3.
\subparagraph{$k$-Induction}
$k$-induction \mcite{Donaldson2011,Gadelha2017} is an extension of
bounded model checking which allows it to find bugs faster and to prove
correctness of programs without unwinding loops fully.
Multiple $k$-induction schemes exist; in the one proposed in~\mcite{Gadelha2017}, a verifier with $k$-induction repeatedly performs the following three steps with an increasing bound $k$.
\begin{itemize}
\item \emph{Base Case} checks if an error is reachable in $k$ steps.
\item \emph{Forward Condition} checks if all of the program's paths terminate in
      $k$ steps (i.e., the tool has already explored all states, and therefore
      the verification is done).
\item \emph{Inductive Step} checks a formula corresponding to the following
proposition: if the property holds for the first $k$ steps, it also
holds for $k+1$ steps.
This step is checked in a modified program in which all loop variables
are initialised to nondeterministic values before the loops, and the
loops are required to execute to completion.\mnote{This can be achieved
by inserting \texttt{assume} statements that ensure that the loop body
is entered and that after the given number of iterations the condition
for loop termination holds.}
\end{itemize}
One of the problems with $k$-induction is that the induction argument may not be strong enough to prove the inductive step.
For this reason, the induction can be strengthened by invariants derived from the original program (i.e., the invariant constrains the nondeterministic values of loop variables) \mcite{Rocha2017}.
\subparagraph{IC3/PDR}
Another symbolic technique that uses solvers is \emph{IC3} (\emph{Incremental Construction of Inductive Clauses for Indubitable Correctness}), also known as \emph{Property Directed Reachability} (\emph{PDR}).
It verifies a system without unrolling its transition relation and issues many relatively small solver queries (while BMC usually encodes the whole program into one big query).
IC3 iteratively builds approximations of reachable safe state space and refines them once it finds a step that leads from the set of already discovered states to an unsafe state.
The refinement works backwards from the predecessor of the unsafe state.
If the refinement reaches the initial state, the analysed system is incorrect.
Solver queries in IC3 concern separate steps in the state space (not entire paths).
IC3 was initially introduced in the context of hardware (circuit) verification \mcite{Bradley2011} and refined later as \emph{property directed reachability} (\emph{PDR}) \mcite{Een2011}.
It was later extended to software using SMT solvers \mcite{Cimatti2012}.
The SMT-based IC3 can also handle software with infinite state space.
Software IC3 was also combined with predicate abstraction in multiple ways including \emph{Implicit Predicate Abstraction} \mcite{Cimatti2014} and \emph{Counterexample to Induction-Guided Abstraction-Refinement} (\emph{CTIGAR}) \mcite{Birgmeier2014}.
IC3 was also combined with $k$-induction to strengthen the inductive step of $k$-induction \mcite{Beyer2020pdr}.
\paragraph{Tools}
The original bounded model checker BMC \mcite{Biere1999} was an LTL bounded
model checker that targeted a modelling language.
F-Soft \mcite{Ivancic2005} is a tool for safety analysis of sequential C programs.
It supports both BMC and BDD-based model checking.\mnote{BDDs, or Binary Decision Diagrams, are used in symbolic model checking methods. However, for software analysis these methods have been superseded by bounded model checking and are rarely used these days.}
It uses program slicing and predicate abstraction to improve analysis performance.
An interesting property of F-Soft is that it can generate executable counterexamples.
CBMC \mcite{Clarke2004,Kroening2014} is a widely used SAT-based bounded
model checker for sequential and parallel C programs.
It uses bit-precise encoding and therefore preserves the semantics of computer
integers.
CBMC can detect safety errors, and it uses instrumentation for detection of
errors other than assertions.
For efficient analysis of parallel programs, CBMC uses encoding based on
partial orders described in \mcite{Alglave2013po} -- it builds the formula from
the SSA form of each thread and ordering constraints for shared variables.
CBMC is notable for its support of relaxed memory models, including \xtso,
PSO and partially the POWER memory model, as well as sequential consistency
\mcite{Alglave2013po}.
CBMC uses a custom parser for C.
ESBMC \mcite{Gadelha2018,Gadelha2019} is another widely used BMC for C
programs.
It can detect assertion violations, memory errors, overflows and mutex-caused
deadlocks.
ESBMC was derived from CBMC, but it uses SMT for the encoding of the program and has
support for $k$-induction with invariant generation (based on interval analysis
of integral variables).
The invariant generation is currently unsound for programs with pointers.
The support for concurrency in ESBMC is described in \mcite{Cordeiro2011}.
ESBMC explores possible program interleavings explicitly and
collects constraints on nondeterministic variables, which are then
used to generate a formula.
The formula can be generated and solved either for each run separately (the
lazy approach, which has the advantage of being incremental), or all
interleavings can be encoded into one formula that uses guards to encode the
scheduling of each interleaving (schedule recording).
There is also an encoding based on under-approximation and widening that
attempts to produce more compact encoding.
The approximation uses proofs of unsatisfiability of the formula for refinement.
According to \mcite{Cordeiro2011}, the lazy approach seems to work best.
Regardless of the formula encoding, ESBMC uses a bound on the number of context
switches.
To limit the number of runs which need to be explored, ESBMC allows
context switches only at visible instructions (i.e., instructions which access
shared memory).
The handling of parallelism in ESBMC makes it closely related to explicit-state
model checkers with symbolic extensions, like Symbolic Pathfinder or DIVINE.
Similarly to CBMC, ESBMC originally used a custom C parser, but it later
switched to using the clang compiler for parsing C, and there is ongoing work
on using this parser for C++.
Unlike other tools which use clang, ESBMC does not use the LLVM intermediate
representation but instead builds its internal representation based on the
C/C++ AST produced by clang.
Furthermore, ESBMC has derivatives which aim at C++ verification.
ESBMC++ \mcite{Ramalho2013} uses a model of parts of the standard C++ library
and has support for C++ exceptions and inheritance.
$\text{ESBMC}^{\textit{Qt}OM}$ \mcite{Sousa2015,Garcia2016} adds a model
of parts of the Qt framework for C++ programs.
In both cases, these models approximate behaviour of the respective libraries
based on the standard or documentation.
JBMC \mcite{Cordeiro2019,Cordeiro2018} is a bounded model checker for
Java built on the same basis as CBMC.
It uses a combination of SAT and SMT solving with a dedicated solver for string
operations.
JBMC has support for exceptions (by lowering them to jumps in its intermediate
representation) and has an exact verification-friendly model of most common
parts of the Java standard libraries.
It can detect assertion violations, memory errors, integral overflows and type
casting errors.
JBMC currently has no support for threads, and it also lacks support for the
Java Native Interface (JNI) and reflection.
LLBMC \mcite{Merz2012,Falke2013} is an SMT-based, bit-precise BMC for
sequential C and C++ programs that is notable for its use of the LLVM
intermediate representation which allows it to reuse existing C and C++
compilers.
It is also one of the few tools which claim support of C++.
However, the C++ support has some limitations, most notably, there is no
support for exceptions and run-time type support.
Furthermore, LLBMC has no support for threads or floating-point arithmetic in
either C or C++.
It can detect assertion violations, integer overflows, invalid shifts, memory
errors and memory leaks.
% \TODO{unmaintained}
Another LLVM-based tool is LLVMVF \mcite{Sousa2013}, a generic verification framework on which a BMC for parallel programs is built.
Concurrency support in LLVMVF is incomplete; for example, it lacks support for condition variables.
% \TODO{unmaintained}
An older version of Map2Check \mcite{Rocha2015} was a generator of memory-safety and leak-detection unit tests for C programs.
It used ESBMC to extract verification conditions (conditions of memory safety violations).
Then it instrumented the program to convert memory safety into assertions about pointers and ran the tests with concrete data.
Yogar-CBMC \mcite{Yin2018,Yin2019} is a derivative of CBMC that uses
abstraction and refinement for the encoding of scheduling constraints in
parallel programs to make the formulas smaller.
Unlike CBMC, it has no support for relaxed memory.
It can run multiple counterexample-guided refinement loops in parallel, and
they share the learned scheduling constraints to analyse the program faster.
Yogar-CBMC was able to solve all concurrency benchmarks in SV-COMP 2019 and significantly outperformed CBMC.
CBMC-Path~\mcite{Khazem2019} is another derivative of CBMC.
It encodes paths in
the state space one by one instead of building a formula for the whole
(bounded) state space of a parallel program.
While it was designed to facilitate faster bug discovery for SV-COMP, it performed significantly worse than CBMC.
DepthK \mcite{Rocha2017,Rocha2017svc} is a tool based on ESBMC with the
addition of polyhedral invariants which aim to strengthen the inductive step of
$k$-induction.
It was presented before ESBMC gained the ability to compute invariants itself, and it uses an external invariant generator.
DepthK can analyse concurrent C programs.
Dartagnan~\mcite{PonceDeLeon2020,Gavrilenko2019} is a BMC for analysis of parallel C programs under relaxed memory models.
In this case, the relaxed memory model is specified on input, not encoded in the verifier.
To achieve this, Dartagnan uses symbolic encoding of relations between the program's actions (data dependencies, reads-from, program order, …) and the memory model restricts the possible runs based on these relations.
It should be possible to use Dartagnan with a wide range of memory models including \xtso, ARMv8, POWER, C/C++11, and the Linux kernel memory model.
To speed up the analysis, it uses relation analysis that reduces the size of the encoding -- it determines which pairs of events may influence constraints specified by the memory model.
Dartagnan internally uses the Boogie intermediate language and translates C to it using SMACK \mcite{Rakamaric2014}; it lacks support for pointer arithmetic.
Lazy-CSeq~\mcite{Inverso2015,Nguyen2017,Inverso2020} is a se\-quen\-tial\-isa\-tion-based
tool for analysis of concurrent C programs which uses bounded model checker as
a backend (CBMC by default).
It performs source-to-source transformation from a concurrent C program to a
sequential C program.
The sequential program simulates a bounded number of round-robin scheduling
rounds (i.e., the number of context switches is limited, and the threads are
required to execute in a fixed order, with the possibility to perform no steps
in a given round).
This way, Lazy-CSeq can simulate a bounded number of interleavings with small
memory overhead while limiting added nondeterminism.
It can detect errors detectable by the used backend and POSIX mutex deadlocks.
To speed up analysis, Lazy-CSeq uses Frama-C \mcite{Canet2009} to
infer bounds of integral variables and shrink the corresponding bitvector
formulas.
Lazy-CSeq performs \emph{lazy sequentialisation}, i.e., it explores only the reachable state space of the program, unlike some older works on sequentialisation like \mcite{Lal2009}.
Lately, Lazy-CSeq gained support for parallelisation by partitioning the set of execution traces into independent instances.
% - the formula size if O(|orig prog.| * |n threads| * |n rounds|)
MU-CSeq~\mcite{Tomasco2015,Tomasco2016} and IMU-CSeq~\mcite{Tomasco2017}
are se\-quen\-tial\-isa\-tion-based tools for analysis of concurrent C programs
that use CBMC as a backend.
They work by nondeterministically guessing a bounded sequence of shared memory
writes and then simulating the program so that its runs match this sequence.
Optionally, they can also execute unobserved memory writes which are not part of the
sequence.
The difference between the tools is that IMU-CSeq uses a separate sequence for
each memory location, while MU-CSeq has a single sequence of visible memory
operations.
IMU-CSeq also has support for relaxed memory (TSO, PSO) using \emph{Shared
Memory Abstraction}, which defines the behaviour of memory operations
and synchronisation primitives of the given memory model.
These tools perform \emph{eager sequentialisation}, i.e., threads are executed separately, with nondeterministic values for shared memory reads which connect them.
BLITZ \mcite{Cho2013} is a BMC for C programs that decomposes the verification instance using approximations of preconditions of property violation.
These approximations are gradually refined based on the proofs of unsatisfiability for the under-approximated instances.
It aims to detect assertion violations and memory safety errors in larger programs (around 100 kLOC).
BLITZ appears to have no support for parallelism.
TCBMC \mcite{Rabinovitz2005} is an early extension of CBMC to parallel programs.
It uses context switch bounding, and the authors proposed mutex deadlock detection and race detection for it.
However, the implementation supported only two threads without deadlock detection.
CheckFence~\mcite{Burckhardt2007} is a tool for checking implementations of concurrent data structures in C against a relaxed memory model which is an over-approximation of many hardware memory models (i.e., safety in CheckFence should imply safety under these memory models).
It uses a test program as a specification -- it checks if the set of all executions under the given memory model is a subset of a set of \emph{serial executions} which are sequentially consistent executions in which each operation with the data structure is atomic.
Therefore, CheckFence does not detect properties such as assertion violation or memory safety but considers the program unsafe if it can exhibit new behaviour under its relaxed memory model.
CheckFence encodes threads and scheduling separately.
NBIS \mcite{Gunther2014} is an incremental BMC for sequential C programs.
In this case, incremental means that it can change bounds without throwing away the formula and re-running the solver from scratch (provided the solver supports incremental solving, which is common).
NBIS uses LLVM IR to facilitate analysis of C programs.
% - not maintained
VVT \mcite{Gunther2016,Gunther2017} is a successor to NBIS which combines bounded model checking for fast bug discovery with CTIGAR, an SMT-based version of IC3 with counterexample refinement.
It aims at the analysis of parallel C software and uses LLVM IR internally.
VVT can analyse infinite-state systems (with a finite number of threads and with arrays with statically known bounds).
To improve performance with parallel programs, VVT uses dynamic state-space reduction based on conditionally atomic blocks of code (which are instrumented into the code).
VVT is not bit-precise: it uses mathematical integer arithmetic instead of arithmetic with overflows and therefore cannot find problems caused by the bounded bit width of computer integers.
% - not maintained
power2sc \mcite{Abdulla2017} is a tool for analysis of C programs under the POWER memory model which uses CBMC as a backend.
It uses program transformation to build a new program which, when executed under sequential consistency, simulates context-switch-bounded runs of the original program under POWER.
The transformed program relies on nondeterminism to guess the results of interactions between threads and then checks the validity of these guesses.
While the presented tool uses CBMC, the authors claim that their program transformation can work with any verifier capable of safety-checking parallel programs under sequential consistency.
% \mcite{Donaldson2011}: \textsc{K-Inductor}, \textsc{K-Boogie}
% <- CBMC based
% -> extension of Boogie
% - \TODO{...}
\section{Other Approaches to Program Analysis}
There are many program analysis techniques relevant to realistic parallel programs which were not covered in the previous sections.
These techniques are often very different, and categorising them further would bring little benefit.
Therefore, we now outline them as they are represented by notable tools that implement them.
\paragraph{Tools}
Ultimate Automizer \mcite{Heizmann2017,Heizmann2013} is a program analysis tool for sequential and parallel C programs.
It uses an \emph{automata-theoretic verification approach} which, in essence, converts a program to an automaton (starting with an automaton derived from the control-flow graph), and then checks reachability of error states in this automaton, possibly refining it.
The automata are constructed in such a way that the set of traces of the automaton over-approximates program traces.
Therefore, if the automaton cannot reach an error location, the program is correct.
If the automaton can reach an error location, the feasibility of the error trace is checked, and if it is not feasible, the automaton is refined (otherwise an error was found).
Therefore, automata-theoretic verification is an instance of CEGAR.
The refinement is performed by constructing an automaton of spurious traces -- a Floyd-Hoare automaton -- which is built on demand using an SMT solver.
Furthermore, Ultimate Automizer uses nested-word automata for interprocedural analysis.
It also uses Büchi automata and ranking functions for termination and nontermination analysis.
Ultimate Automizer can detect assertion violations and memory safety errors, check for integer overflows, and analyse termination.
Ultimate Taipan \mcite{Dietsch2020,Greitschus2017} is a close relative of Ultimate Automizer that uses refinement based on \emph{path programs}.
When a possible error is found, it projects the original program onto the error trace to obtain a corresponding path program, which contains only statements found on the trace but may still contain loops.
It then attempts to show trace infeasibility by proving unreachability of the error location in the path program using invariants.
% - newer version uses SMT formulas to represent program states
Ultimate Kojak \mcite{Nutz2015} is another related tool which uses a different algorithm for the refinement step.
This algorithm is based on proofs of unsatisfiability provided by an SMT solver used to check the feasibility of error paths.
Skink \mcite{Cassez2017} is an LLVM-based tool for assertion checking of sequential and concurrent C programs.
It uses the automata-theoretic verification approach similar to Ultimate Automizer (with automata-based refinement).
However, Skink is limited to programs which can be fully inlined, i.e., it does not support function calls.
Furthermore, it uses mathematical integers and therefore is not bit-precise.
To improve efficiency on parallel programs, Skink uses a reduction based on the source-DPOR algorithm, which is more often used in stateless model checking \mcite{Abdulla2014}.
CPAchecker \mcite{Dangl2015,Beyer2011,Beyer2020pdr} is a program analysis tool for C programs and a modular framework which aims at easy integration of new components and research of new verification ideas.
It uses a combination of abstract domains and has support for various techniques such as $k$-induction, predicate abstraction, and property directed reachability (PDR/IC3).
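As a brief reminder of the $k$-induction principle mentioned above (in one common formulation, with initial states $I$, transition relation $T$, and property $P$), proving $P$ invariant requires a bounded base case and a strengthened induction step:

```latex
\begin{align*}
  & I(s_0) \land \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1})
    \implies \bigwedge_{i=0}^{k} P(s_i)
  && \text{(base case)} \\
  & \bigwedge_{i=0}^{k-1} \bigl( P(s_i) \land T(s_i, s_{i+1}) \bigr)
    \implies P(s_k)
  && \text{(induction step)}
\end{align*}
```

For $k = 1$ this degenerates to ordinary inductive invariance; larger values of $k$ can prove properties that are not inductive on their own.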
To avoid false counterexamples, CPAchecker performs bit-precise validation of counterexamples.
% - memory graphs for memory safety as of~\mcite{Dangl2015} (but later in SV-COMP yes)
Various tools built on the CPAchecker framework participate in SV-COMP.
CPAchecker itself, without additional extensions, appears to participate in SV-COMP under the name CPA-Seq.
CPA-BAM-BnB \mcite{Andrianov2017} is a CPAchecker-based tool that uses Block Abstraction Memoization, which means it divides programs into blocks (often a block is a function) and analyses blocks separately.
It uses a cache of existing block abstractions.
Furthermore, it uses value analysis and predicate analysis with refinement on spurious counterexamples.
It also uses BnB, which is a model of memory that uses predicate analysis and uninterpreted functions to map memory locations to memory values.
In this model, memory is split into regions, and each region has its own mapping function.
CPALockator \mcite{Andrianov2018} is a version of CPAchecker which specialises in parallel programs and data race detection.
CPALockator aims to provide a lightweight analysis at the cost of possibly missing errors.
To do this, it tracks locks and threads that are active at the given time.
CPALockator uses predicate analysis and CEGAR to exclude spurious race conditions.
PeSCo \mcite{Richter2019,Czech2017} is a tool built on top of CPAchecker that tries to predict an efficient verification approach for a given task using machine learning.
The learning uses a graph encoding of programs and decides which of six predefined configurations of CPAchecker should run and in which order.
Corral \mcite{Lal2012} is a verifier for the Boogie modelling language accompanied by language frontends for C and .NET bytecode.
It performs bounded reachability and uses abstraction and refinement.
Abstracted programs track values of only a selected set of variables (which is initially empty).
On top of that, functions are inlined on-demand if an over-approximation of a function's effects causes a spurious error.
Corral supports parallel programs using sequentialisation.
SMACK \mcite{Rakamaric2014,Carter2016,Garzella2020} is a bug-finder for various programming languages that can be translated to LLVM.
It translates LLVM IR into the Boogie intermediate verification language and then uses Boogie, Corral (the default), or other Boogie-based verifiers as a backend.
This approach is intended to allow easy prototyping of backend verifiers which do not need to handle some of the complexities of programming languages.
SMACK can use either bitvectors (i.e., be bit-precise) or arbitrary precision integers.
Since Boogie has no support for dynamic memory, SMACK partitions memory into disjoint regions that are represented by maps in Boogie.
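As a hypothetical sketch of this memory splitting (illustrative names, not SMACK's generated Boogie), two allocation sites that can never alias are modelled by two independent maps, so a store through one cannot affect loads through the other:

```c
#include <stdbool.h>
#include <stdint.h>

#define REGION_SIZE 8

/* Each disjoint region gets its own map; in SMACK's Boogie output
 * these would be separate map variables rather than C arrays. */
static int32_t region_a[REGION_SIZE];
static int32_t region_b[REGION_SIZE];

static void    store_a(int off, int32_t v) { region_a[off] = v; }
static int32_t load_a(int off)             { return region_a[off]; }
static void    store_b(int off, int32_t v) { region_b[off] = v; }
static int32_t load_b(int off)             { return region_b[off]; }

/* Writing through region B cannot disturb region A, so the backend
 * solver never has to reason about aliasing between the two maps. */
static bool regions_independent(void) {
    store_a(0, 1);
    store_b(0, 2);
    return load_a(0) == 1 && load_b(0) == 2;
}
```

Keeping the maps separate shrinks the SMT encoding, because frame conditions only range over one region at a time.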
While SMACK initially targeted C, it was later extended to (partially) support various other programming languages that can be translated to LLVM: C++, Rust, Objective-C, D, Fortran, Swift, and Kotlin.
In this effort, SMACK relies heavily on LLVM-based language frontends to provide basic language support.
On top of that, SMACK has basic support for standard libraries (allocation, C string and math operations, POSIX threads, models of most common parts of the libraries).
However, the authors claim these models must be written manually, which they consider a major undertaking.
This is shown to be a limiting factor, especially for languages with heavy runtime components, such as Swift, Objective-C and Kotlin.
% - shows limits of SMACK's handling of LLVM
% - pratialy reuse of existing library (Kotlin arithmetics)
Furthermore, thanks to compilation to a common intermediate representation (LLVM), SMACK has support for cross-language program analysis.
With a concurrency-enabled backend (e.g., Corral), SMACK also supports parallel programs.
JayHorn \mcite{Kahsai2019} is a tool for analysis of sequential Java programs.
It encodes programs into Horn clauses and can handle some instances of unbounded heap data structures and unbounded execution using invariants.
JayHorn is not bit-precise (it uses unbounded integers) and has only a basic model of the Java standard library (it mostly assumes library calls return arbitrary values).
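As a hypothetical illustration of such an encoding (not JayHorn's actual output), consider the loop \texttt{x = 0; while (x < 5) x = x + 1; assert(x == 5);}. Its safety can be expressed as constrained Horn clauses over an unknown loop invariant $\mathit{Inv}$:

```latex
\begin{align*}
  x = 0 &\implies \mathit{Inv}(x) \\
  \mathit{Inv}(x) \land x < 5 &\implies \mathit{Inv}(x + 1) \\
  \mathit{Inv}(x) \land x \ge 5 \land x \neq 5 &\implies \mathit{false}
\end{align*}
```

A Horn solver discharges the clauses by synthesising a model such as $\mathit{Inv}(x) \equiv 0 \le x \le 5$; if no model exists, a derivation of $\mathit{false}$ corresponds to an error path.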
Threader \mcite{Gupta2011} is a tool for compositional verification of parallel C programs.
It uses thread-modular proofs based on proof rules from \mcite{Jones1983,Owicki1976}.
The analysis is based on abstract reachability with refinement and uses a Horn clause solver.
Threader can check both safety and termination properties and is not limited to specific synchronisation primitives.
Predator and PredatorHP \mcite{Peringer2020,Dudka2013} are tools for checking memory safety of sequential C programs.
Predator uses shape analysis; i.e., it describes the shape of heap memory by an abstract domain (\emph{symbolic memory graphs}).
It targets memory errors and assertions and can handle unbounded lists (both doubly- and singly-linked, circular, and nested lists).
Predator has only a limited ability to handle non-list types.
% - uses function summaries, recursion depth bound
To avoid parsing C directly, Predator uses an intermediate representation of the GCC compiler.
PredatorHP is a tool which runs several configurations of Predator in parallel (a verifier which can prove correctness but over-approximates and several bug hunters which can show incorrectness).
SatAbs \mcite{Clarke2005} is a tool for bit-precise analysis of (possibly parallel) C programs.
It uses predicate abstraction with counterexample guided refinement to decrease the size of the analysed program -- the program is abstracted into a boolean program which is analysed by a backend model checker.
If the backend model checker concludes the abstracted program is correct, then the original program is correct.
Otherwise, the trace of the abstract program is validated using SAT-queries that simulate the corresponding path of the original program.
If the counterexample is spurious, SatAbs refines the abstraction.
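As a minimal, hypothetical sketch of predicate abstraction (not SatAbs's actual output; the names \texttt{concrete} and \texttt{abstracted} are illustrative), the concrete function below can be abstracted into a boolean program that tracks only the predicate $p \equiv (x > 0)$:

```c
#include <stdbool.h>

/* Concrete program: increments positive inputs (small values only,
 * so overflow is not a concern here) and reports the sign. */
bool concrete(int x) {
    if (x > 0)
        x = x + 1;       /* preserves the predicate x > 0 */
    return x > 0;
}

/* Boolean abstraction: the single variable p stands for x > 0.
 * Each statement is replaced by its abstract transformer on p. */
bool abstracted(bool p) {
    if (p)
        p = true;        /* x = x + 1 keeps x > 0 when x > 0 held */
    return p;
}
```

Here the abstraction happens to be exact; in general the boolean program over-approximates the original, and spurious error traces trigger refinement with additional predicates.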
SatAbs uses bitvectors (bit-blasted to SAT formulas\mnote{Bit-blasting is a technique which rewrites bitvector variables into sequences of boolean variables and implements bitvector operations as the corresponding boolean circuits.
It allows solving (existentially-quantified) bitvector formulas with a SAT solver.}) and is therefore bit-precise.
It can perform reachability checking, as well as equivalence checking of a C program against a corresponding Verilog circuit.
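As a minimal sketch of the bit-blasting idea (hypothetical code, not SatAbs's encoding), a 4-bit addition can be rewritten into boolean variables and full-adder gates, which is exactly the shape a SAT encoding of bitvector addition takes:

```c
#include <stdbool.h>
#include <stdint.h>

/* Adds two 4-bit values via a ripple-carry circuit: every bit becomes
 * a boolean variable and the + operator becomes XOR/AND/OR gates. */
static uint8_t add4_bitblasted(uint8_t a, uint8_t b) {
    bool carry = false;
    uint8_t result = 0;
    for (int i = 0; i < 4; ++i) {
        bool ai = (a >> i) & 1;                     /* bit i of a */
        bool bi = (b >> i) & 1;                     /* bit i of b */
        bool sum = ai ^ bi ^ carry;                 /* full-adder sum */
        carry = (ai && bi) || (carry && (ai ^ bi)); /* carry-out */
        result |= (uint8_t)sum << i;
    }
    return result;  /* equals (a + b) mod 16, i.e., 4-bit wrap-around */
}
```

A SAT solver sees only the boolean variables and gate clauses, which is what makes the analysis bit-precise, including wrap-around on overflow.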
An unnamed tool that encodes the behaviour of relaxed memory models in a program is presented in \mcite{Alglave2013}.
Given a C program which is supposed to run under one of the supported relaxed memory models (including \xtso and a fragment of POWER), the tool transforms it into an equivalent program which can be analysed by an analyser for sequentially consistent parallel programs.
The authors evaluate their approach with various backend analysers, including CBMC, ESBMC, and SatAbs.
The encoding of relaxed behaviour uses an \emph{abstract event graph} that encodes events of the program and dependencies between them.
% GACAL \mcite{Quiring2020} is a tool which analyses C programs by generating potential invariants from traces and verifies them using the ACL2 theorem prover.
% It supports only a subset of sequential C (it has no support for arrays, only 32 bit integers, …). \TODO{remove?}
% VIAP~\mcite{Rajkhowa2019,Rajkhowa2017}
% - aims at programs with loops, does not need invariants
% - C programs without dynamic memory, floats, concurrency
% - uses integral arithmetic
% - translates to quantified first-order formulas over integers
% - non-bounded loops
% - can use preconditions and postconditions, or assertions
% - \TODO{remove?}
% VeriAbs~\mcite{Afzal2019}
% - portfolio verifier -- uses different independent techniques for program analysis
% - targets sequential C programs with loops
% - lightweight analysis to infer which technique should be used, based on the type of loops and data used in them
% - techniques include: random fuzz testing, array abstractions, explicit-state model checking, loop invariant generation, CEGAR, loop abstraction, loop summarization, BMC, k-induction
% - \TODO{remove?}
% VeriFuzz~\mcite{Basak2019}
% - fuzz testing with lightweight analaysis and instrumentation to extract useful infromation about the program
% - coverage driven
% - uses evolutionary algorithm to generate test inputs and learning algorithm based on decision trees to decide which techniques to use for fuzzing
% - uses loop unrolling (with increasing bounds)
% - \TODO{remove?}
\iffalse
\section{XXX}
\section{??? Termination and Temporal Logics}
- sequential LTL (inf systems, inc. termination analysis): \cite{Dietsch2015}
- other liveness in sequential C (inf sys, Terminator): \cite{Cook2007} (drivers)
- termination (seq, ptr, inf, \textsc{Terminator}): \cite{Cook2006}
- termination (inc. heap): \cite{Berdine2006}
- termination+nontermination (C, Java, Haskell, Prolog, recurstion, ptr, bitvectors (bit-precise); seq no structs): AProVe \cite{Hensel2017, Giesl2017}
T2~\cite{Brockschmidt2016}
- open source release of the Terminator
- liveness + safety