The “solver” Module#

Note

This page is under construction.

The ISLaSolver Class#

class isla.solver.ISLaSolver[source]#

The solver class for ISLa formulas/constraints. Its top-level methods are

solve()

Use to generate solutions for an ISLa constraint.

check()

Use to check if an ISLa constraint is satisfied for a given input.

parse()

Use to parse and validate an input.

repair()

Use to repair an input such that it satisfies a constraint.

mutate()

Use to mutate an input such that the result satisfies a constraint.

__init__(grammar, formula=None, structural_predicates=frozenset({StructuralPredicate(name='direct_child', arity=2, eval_fun=<function is_direct_child>), StructuralPredicate(name='consecutive', arity=2, eval_fun=<function consecutive>), StructuralPredicate(name='after', arity=2, eval_fun=<function is_after>), StructuralPredicate(name='before', arity=2, eval_fun=<function is_before>), StructuralPredicate(name='same_position', arity=2, eval_fun=<function is_same_position>), StructuralPredicate(name='nth', arity=3, eval_fun=<function is_nth>), StructuralPredicate(name='different_position', arity=2, eval_fun=<function is_different_position>), StructuralPredicate(name='inside', arity=2, eval_fun=<function in_tree>), StructuralPredicate(name='level', arity=4, eval_fun=<function level_check>)}), semantic_predicates=frozenset({SemanticPredicate(count, 3)}), max_number_free_instantiations=10, max_number_smt_instantiations=10, max_number_tree_insertion_results=5, enforce_unique_trees_in_queue=False, debug=False, cost_computer=None, timeout_seconds=None, global_fuzzer=False, predicates_unique_in_int_arg=(SemanticPredicate(count, 3), ), fuzzer_factory=<function SolverDefaults.<lambda>>, tree_insertion_methods=None, activate_unsat_support=False, grammar_unwinding_threshold=4, initial_tree=<Nothing>, enable_optimized_z3_queries=True, start_symbol=None)[source]#

The constructor of ISLaSolver accepts a large number of parameters. However, all but the first one, grammar, are optional.

The simplest way to construct an ISLa solver is by only providing it with a grammar only; it then works like a grammar fuzzer.

>>> import random
>>> random.seed(1)
>>> import string
>>> LANG_GRAMMAR = {
...     "<start>":
...         ["<stmt>"],
...     "<stmt>":
...         ["<assgn> ; <stmt>", "<assgn>"],
...     "<assgn>":
...         ["<var> := <rhs>"],
...     "<rhs>":
...         ["<var>", "<digit>"],
...     "<var>": list(string.ascii_lowercase),
...     "<digit>": list(string.digits)
... }
>>>
>>> from isla.solver import ISLaSolver
>>> solver = ISLaSolver(LANG_GRAMMAR)
>>>
>>> str(solver.solve())
'd := 9'
>>> str(solver.solve())
'v := n ; s := r'
Parameters:
  • grammar (Mapping[str, Sequence[str]] | str) – The underlying grammar; either, as a “Fuzzing Book” dictionary or in BNF syntax.

  • formula (Formula | str | None) – The formula to solve; either a string or a readily parsed formula. If no formula is given, a default true constraint is assumed, and the solver falls back to a grammar fuzzer. The number of produced solutions will then be bound by max_number_free_instantiations.

  • structural_predicates (Set[StructuralPredicate]) – Structural predicates to use when parsing a formula.

  • semantic_predicates (Set[SemanticPredicate]) – Semantic predicates to use when parsing a formula.

  • max_number_free_instantiations (int) – Number of times that nonterminals that are not bound by any formula should be expanded by a coverage-based fuzzer.

  • max_number_smt_instantiations (int) – Number of solutions of SMT formulas that should be produced.

  • max_number_tree_insertion_results (int) – The maximum number of results when solving existential quantifiers by tree insertion.

  • enforce_unique_trees_in_queue (bool) – If true, states with the same tree as an already existing tree in the queue are discarded, irrespectively of the constraint.

  • debug (bool) – If true, debug information about the evolution of states is collected, notably in the field state_tree. The root of the tree is in the field state_tree_root. The field costs stores the computed cost values for all new nodes.

  • cost_computer (CostComputer | None) – The CostComputer class for computing the cost relevant to placing states in ISLa’s queue.

  • timeout_seconds (int | None) – Number of seconds after which the solver will terminate.

  • global_fuzzer (bool) – If set to True, only one coverage-guided grammar fuzzer object is used to finish off unconstrained open derivation trees throughout the whole generation time. This may be beneficial for some targets; e.g., we experienced that CSV works significantly faster. However, the achieved k-path coverage can be lower with that setting.

  • predicates_unique_in_int_arg (Tuple[SemanticPredicate, ...]) – This is needed in certain cases for instantiating universal integer quantifiers. The supplied predicates should have exactly one integer argument, and hold for exactly one integer value once all other parameters are fixed.

  • fuzzer_factory (Callable[[Mapping[str, Sequence[str]]], GrammarFuzzer]) – Constructor of the fuzzer to use for instantiating “free” nonterminals.

  • tree_insertion_methods (int | None) – Combination of methods to use for existential quantifier elimination by tree insertion. Full selection: DIRECT_EMBEDDING & SELF_EMBEDDING & CONTEXT_ADDITION.

  • activate_unsat_support (bool) – Set to True if you assume that a formula might be unsatisfiable. This triggers additional tests for unsatisfiability that reduce input generation performance, but might ensure termination (with a negative solver result) for unsatisfiable problems for which the solver could otherwise diverge.

  • grammar_unwinding_threshold (int) – When querying the SMT solver, ISLa passes a regular expression for the syntax of the involved nonterminals. If this syntax is not regular, we unwind the respective part in the reference grammar up to a depth of grammar_unwinding_threshold. If this is too shallow, it can happen that an equation etc. cannot be solved; if it is too deep, it can negatively impact performance (and quite tremendously so).

  • initial_tree (Maybe[DerivationTree]) – An initial input tree for the queue, if the solver shall not start from the tree (<start>, None).

  • enable_optimized_z3_queries (bool) – Enables preprocessing of Z3 queries (mainly numeric problems concerning things like length). This can improve performance significantly; however, it might happen that certain problems cannot be solved anymore. In that case, this option can/should be deactivated.

  • start_symbol (str | None) – This is an alternative to initial_tree for starting with a start symbol different form <start>. If start_symbol is provided, a tree consisting of a single root node with the value of start_symbol is chosen as initial tree.

__new__(**kwargs)#

Main Functions#

ISLaSolver.solve()[source]#

Attempts to compute a solution to the given ISLa formula. Returns that solution, if any. This function can be called repeatedly to obtain more solutions until one of two exception types is raised: A StopIteration indicates that no more solution can be found; a TimeoutError is raised if a timeout occurred. After that, an exception will be raised every time.

The timeout can be controlled by the timeout_seconds constructor parameter.

Returns:

A solution for the ISLa formula passed to the isla.solver.ISLaSolver.

Return type:

DerivationTree

Below, we show the source code of solve(). The code contains sufficient inline documentation. Essentially, it realizes a transition system between Constrained Derivation Trees, i.e., pairs of constraints and (potentially open) derivation trees. In the implementation, these structures are called states. The solver takes the state on top of a queue (a Fibonacci heap) and processes it with the first applicable “transition rule,” called “elimination function” in the code. The function loops until at least one solution has been found, and only as long as there are still elements in the queue.

if self.timeout_seconds is not None and self.start_time is None:
    self.start_time = int(time.time())

while self.queue:
    self.step_cnt += 1

    # import dill as pickle
    # state_hash = 9107154106757938105
    # out_file = "/tmp/saved_debug_state"
    # if hash(self.queue[0][1]) == state_hash:
    #     with open(out_file, 'wb') as debug_state_file:
    #         pickle.dump(self, debug_state_file)
    #     print(f"Dumping state to {out_file}")
    #     exit()

    if self.timeout_seconds is not None:
        if int(time.time()) - self.start_time > self.timeout_seconds:
            self.logger.debug("TIMEOUT")
            raise TimeoutError(self.timeout_seconds)

    if self.solutions:
        solution = self.solutions.pop(0)
        self.logger.debug('Found solution "%s"', solution)
        return solution

    cost: int
    state: SolutionState
    cost, state = heapq.heappop(self.queue)

    self.current_level = state.level
    self.tree_hashes_in_queue.discard(state.tree.structural_hash())
    self.state_hashes_in_queue.discard(hash(state))

    if self.debug:
        self.current_state = state
        self.state_tree.setdefault(state, [])
    self.logger.debug(
        "Polling new state (%s, %s) (hash %d, cost %f)",
        state.constraint,
        state.tree.to_string(show_open_leaves=True, show_ids=True),
        hash(state),
        cost,
    )
    self.logger.debug("Queue length: %s", len(self.queue))

    assert not isinstance(state.constraint, language.DisjunctiveFormula)

    # Instantiate all top-level structural predicate formulas.
    state = self.instantiate_structural_predicates(state)

    # Apply the first elimination function that is applicable.
    # The later ones are ignored.
    def process_and_extend_solutions(
        result_states: List[SolutionState],
    ) -> Nothing:
        assert result_states is not None
        self.solutions.extend(self.process_new_states(result_states))
        return Nothing

    flow(
        Nothing,
        *map(
            compose(lambda f: (lambda _: f(state)), lash),
            [
                self.noop_on_false_constraint,
                self.eliminate_existential_integer_quantifiers,
                self.instantiate_universal_integer_quantifiers,
                self.match_all_universal_formulas,
                self.expand_to_match_quantifiers,
                self.eliminate_all_semantic_formulas,
                self.eliminate_all_ready_semantic_predicate_formulas,
                self.eliminate_and_match_first_existential_formula_and_expand,
                self.assert_remaining_formulas_are_lazy_binding_semantic,
                self.finish_unconstrained_trees,
                self.expand,
            ],
        ),
    ).bind(process_and_extend_solutions)

if self.solutions:
    solution = self.solutions.pop(0)
    self.logger.debug('Found solution "%s"', solution)
    return solution
else:
    self.logger.debug("UNSAT")
    raise StopIteration()
ISLaSolver.check(inp)[source]#

Evaluates whether the given derivation tree satisfies the constraint passed to the solver. Raises an UnknownResultError if this could not be evaluated (e.g., because of a solver timeout or a semantic predicate that cannot be evaluated).

Parameters:

inp (DerivationTree | str) – The input to evaluate, either readily parsed or as a string.

Returns:

A truth value.

Return type:

bool

ISLaSolver.parse(inp, nonterminal='<start>', skip_check=False, silent=False)[source]#

Parses the given input inp. Raises a SyntaxError if the input does not satisfy the grammar, a SemanticError if it does not satisfy the constraint (this is only checked if nonterminal is “<start>”), and returns the parsed DerivationTree otherwise.

Parameters:
  • inp (str) – The input to parse.

  • nonterminal (str) – The nonterminal to start parsing with, if a string corresponding to a sub-grammar shall be parsed. We don’t check semantic correctness in that case.

  • skip_check (bool) – If True, the semantic check is left out.

  • silent (bool) – If True, no error is sent to the log stream in case of a failed parse.

Returns:

A parsed DerivationTree.

Return type:

DerivationTree

ISLaSolver.repair(inp, fix_timeout_seconds=3)[source]#

Attempts to repair the given input. The input must not violate syntactic (grammar) constraints. If semantic constraints are violated, this method gradually abstracts the input and tries to turn it into a valid one. Note that intensive structural manipulations are not performed; we merely try to satisfy SMT-LIB and semantic predicate constraints.

Parameters:
  • inp (DerivationTree | str) – The input to fix.

  • fix_timeout_seconds (float) – A timeout used when calling the solver for an abstracted input. Usually, a low timeout suffices.

Returns:

A fixed input (or the original, if it was not broken) or nothing.

Return type:

Maybe[DerivationTree]

ISLaSolver.mutate(inp, min_mutations=2, max_mutations=5, fix_timeout_seconds=1)[source]#

Mutates inp such that the result satisfies the constraint.

Parameters:
  • inp (DerivationTree | str) – The input to mutate.

  • min_mutations (int) – The minimum number of mutation steps to perform.

  • max_mutations (int) – The maximum number of mutation steps to perform.

  • fix_timeout_seconds (float) – A timeout used when calling the solver for fixing an abstracted input. Usually, a low timeout suffices.

Returns:

A mutated input.

Return type:

DerivationTree