BOSPHORUS: Bridging ANF and CNF Solvers

Davin Choo*, Mate Soos†, Kian Ming A. Chai*, and Kuldeep S. Meel†
* Information Division, DSO National Laboratories, Singapore
† School of Computing, National University of Singapore

Abstract: Algebraic Normal Form (ANF) and Conjunctive Normal Form (CNF) are commonly used to encode problems in Boolean algebra. ANFs are typically solved via Gröbner basis algorithms, often using more memory than is feasible, while CNFs are solved using SAT solvers, which cannot exploit the algebra of polynomials naturally. We propose a paradigm that bridges ANF and CNF solving techniques: the techniques are applied in an iterative manner to learn facts that augment the original problems. Experiments on over 1,100 benchmarks arising from four different application domains demonstrate that learnt facts can significantly improve runtime and enable more benchmarks to be solved.

I. INTRODUCTION

Algebraic Normal Form (ANF) and Conjunctive Normal Form (CNF) are two commonly used normal forms in Boolean algebra. Both ANF and CNF reason about Boolean variables $x_1, \ldots, x_n$ but with different Boolean operators.

ANF is a system of polynomial equations in GF(2), i.e., the Galois field of two elements, or $\mathbb{Z}_2$. Each polynomial is a sum of monomials, where a monomial is a product of zero or more variables. Cryptologists prefer ANF because it naturally encodes definitions such as AES [1] and hash functions [2]. One approach to solving ANF is to compute the Gröbner basis of the system using Buchberger's algorithm [3] or its variants [4], [5]. Efficient implementations include M4GB [6], FGb [7] and Magma [8]. In certain systems, methods such as XL/XSL [9], [10] and ElimLin [11], [12] have also been shown to be effective. Unfortunately, ANF solvers on huge polynomial systems tend to require more memory than is feasible on most computing platforms [13].

In comparison, CNF is a conjunction of clauses. Each clause is a disjunction of literals, where a literal is either a Boolean variable or its negation. As Boolean circuits are naturally described in logical connectives, hardware verification problems are often described in CNFs [14]. Some other domains using CNFs are software verification, industrial planning, scheduling and recreational mathematical puzzle solving.

CNFs are typically solved by SAT solvers, which use significantly less memory than the methods for ANF. This is primarily due to the depth-first search nature of CDCL [15] that most modern SAT solvers are based on. Many solvers build upon the small code base of MiniSat [16], which includes the standard CDCL, variable and clause elimination [17], watched literals data structures [18] and the like.

ANF and CNF solving algorithms exploit different properties of the problem encoding. For instance, Gauss-Jordan elimination (GJE) is a natural procedure in ANF but not in CNF, while conflict learning prunes the search tree in SAT solvers but we are unaware of such learning for ANF. Despite the recent successes of GJE-enabled SAT solvers in counting problems [19], [20], the use of GJE-enabled solvers is not prevalent. In this context, we ask: is there an alternative and easier way to combine ANF and CNF solving?

The primary contribution of this paper is an affirmative answer to the above question. We demonstrate a paradigm that bridges ANF and CNF solving techniques. The techniques are applied in an iterative manner to learn facts that augment the original problems. This approach is attractive when the conversion time between ANF and CNF encodings is negligible relative to the overall solving time. Our experiments demonstrate that our iterative approach can help us to solve more instances while spending less time.

As a consequence of this bridge, problems can be encoded in their most natural and comprehensible manner, either in ANF or CNF, and yet draw on solving techniques from both to achieve reasonable solving performance; this is our second contribution. We call our tool BOSPHORUS, the namesake of the Bosphorus bridge connecting Europe and Asia.

In the next section, we describe the various techniques for solving ANFs and CNFs. Section III describes how BOSPHORUS uses these techniques. Results on three classes of ANF problems and the SAT Competition 2017 benchmarks are in Section IV. For notation, we use $\oplus$ for exclusive-OR (XOR) and addition in GF(2), $\neg$ for negation, $\land$ for conjunction and $\lor$ for disjunction. We use the term polynomial to mean a polynomial equation equated to zero, and we will also write such equations by just stating the polynomial.

The open-source tool is available at [Link] bosphorus
978-3-9819263-2-3/DATE19/©2019 EDAA

II. LEARNING FACTS

Our approach iteratively extracts two types of learnt facts: (1) linear equations $x_{i_1} + x_{i_2} + \cdots + x_{i_p} + c$ where $c$ is either zero or one; and (2) polynomials of the form $x_{i_1} x_{i_2} \cdots x_{i_p} \oplus 1$. The former keeps the degree of the system low while the latter allows immediate deduction that $x_{i_1} = x_{i_2} = \cdots = x_{i_p} = 1$. The rest of this section explains how BOSPHORUS obtains and uses these facts in various phases.

A. ANF propagation

For each variable, we attempt to assign a value (0 or 1) or an equivalent literal by examining the polynomials involving the
variable. A value assignment can occur in two cases. First, for polynomial $x$ or $x \oplus 1$, we set $x$ to the constant 0 or 1 respectively. Second, for polynomial $x_{i_1} x_{i_2} \cdots x_{i_p} \oplus 1$, we set $x_{i_1} = x_{i_2} = \cdots = x_{i_p} = 1$. An equivalence assignment happens if the polynomial is $x \oplus y$ or $x \oplus y \oplus 1$, in which case we set $x = y$ or $x = \neg y$ respectively. These assignments are applied iteratively until a fixed point is reached.

B. eXtended Linearization (XL)

Gauss-Jordan elimination (GJE) solves a system of linear equations by elementary row operations. For polynomials, one can apply GJE by treating each monomial as an independent variable; this is known as linearization. Dependence between the monomials can be re-introduced by generating more polynomial equations, a process known as eXtended Linearization (XL) [9]. We describe XL and how it is used.

Given a polynomial system $S$ with $n$ variables and $m$ equations, we expand $S$ incrementally to obtain an expanded system $S'$. The expansion process selects each equation in $S$ in ascending degree order and multiplies the equation with all possible monomials up to a chosen degree $D$. In the case where we manage to expand $S$ fully, the expanded system will have $m \sum_{j=0}^{D} \binom{n}{j}$ polynomials. GJE is then applied on $S'$.

Table I shows an example of applying XL on the ANF $\{x_1 x_2 \oplus x_1 \oplus 1,\ x_2 x_3 \oplus x_3\}$, expanding up to degree $D = 1$ monomials. The last three rows of Table Ib are the facts $\{x_1 \oplus 1,\ x_2,\ x_3\}$ that BOSPHORUS will retain.

Applying XL on the entire ANF often requires considerable memory and time. To avoid this, we uniformly subsample the polynomials from the ANF to obtain an $m'$-by-$n'$ linearized system $S$ such that $m' n' \approx 2^M$, for a fixed parameter $M$. Moreover, $S$ is incrementally expanded only until the system size is approximately $2^{M + \delta M}$, for a parameter $\delta M$.

We employ XL in this manner because our primary purpose is not to solve the system but to learn facts to augment it. We also employ ElimLin and the SAT solver in the same spirit.

C. ElimLin

ElimLin [11] is an algorithm that iterates through the following three steps until a fixed point: (1) apply GJE on the linearization of the polynomial system $S$; (2) gather linear equations and remove them from $S$, yielding $S'$; and (3) for each linear equation $\ell$, pick, say, a variable from $\ell$ that occurs in the fewest equations in $S'$, and eliminate that variable from $S'$ using $\ell$. The resultant system $S''$ is free of linear equations. The process is repeated from step (1) using $S''$ as $S$ until there are no more linear equations after applying GJE.

Consider the ANF $\{x_1 \oplus x_2 \oplus x_3,\ x_1 x_2 \oplus x_2 x_3 \oplus 1\}$. As step (1) does not affect the system, $x_1 \oplus x_2 \oplus x_3$ remains the only linear equation in step (2). If we choose to substitute $x_1$ by $x_2 \oplus x_3$ in step (3), the ANF becomes the single equation $(x_2 \oplus x_3) x_2 \oplus x_2 x_3 \oplus 1$. By right-distributing the first conjunction over the first XOR and then replacing the XOR of $x_2 x_3$ with itself by zero, this equation simplifies to $x_2 \oplus 1$. Assigning $x_2 = 1$ and performing ANF propagation on the original ANF, $x_1 x_2 \oplus x_2 x_3 \oplus 1$ becomes $x_1 \oplus x_3 \oplus 1$, and the ANF propagation can deduce the equivalence $x_1 = \neg x_3$.

Similar to XL, we apply ElimLin on a random subset of polynomials that has a linearized size of approximately $2^M$.

D. Conflict-bounded SAT solving

With a CNF equivalent of the ANF, we call a SAT solver that has conflict-driven clause learning [15]. The solver is allowed up to a pre-determined number $C$ of conflicts to solve the system. We bound the solver using a conflict budget instead of a time budget for replicability of experiments.

Due to this budget, the solver will surely terminate with one of these three cases: (1) unsatisfiable; (2) satisfiable, giving an assignment; or (3) undecided within the limit. In case (1), BOSPHORUS appends the contradictory equation $1 = 0$ to the system; this is the fact learnt by the SAT solver. In cases (2) and (3), BOSPHORUS extracts linear equations from learnt clauses. Of particular interest are linear equations from the unit and binary clauses because they immediately yield value and equivalence assignments.

E. Example

Consider the ANF

$$x_1 x_2 \oplus x_3 \oplus x_4 \oplus 1, \quad x_1 x_2 x_3 \oplus x_1 \oplus x_3 \oplus 1, \quad x_1 x_3 \oplus x_3 x_4 x_5 \oplus x_3, \quad x_2 x_3 \oplus x_3 x_5 \oplus 1, \quad x_2 x_3 \oplus x_5 \oplus 1. \tag{1}$$

XL with $D = 1$ on this system learns the facts $x_2 x_3 x_4 \oplus 1$, $x_1 x_3 x_4 \oplus 1$, $x_1 \oplus x_5 \oplus 1$, $x_1 \oplus x_4$, $x_3 \oplus 1$, and $x_1 \oplus x_2$. For ElimLin, its initial GJE, step (1) in Section II-C, gives four distinct linear equations: $x_1 \oplus x_5 \oplus 1$; $x_1 \oplus x_4$; $x_3 \oplus 1$; and $x_1 \oplus x_2$. After substituting $x_5$ by $x_1 \oplus 1$, $x_4$ by $x_1$, $x_3$ by 1 and $x_2$ by $x_1$, ElimLin learns $x_1 \oplus 1$. Converting to CNF using the Karnaugh map (Section III-C) creates one auxiliary variable for $x_1 x_2$. Boolean constraint propagation in the SAT solver then gives $x_2 \oplus 1$, $x_4 \oplus 1$, $x_5$, and $x_1 x_2 \oplus 1$.

ANF propagation using the above facts obtained from XL, ElimLin and the SAT solver simplifies the system into

$$x_1 \oplus 1, \quad x_2 \oplus 1, \quad x_3 \oplus 1, \quad x_4 \oplus 1, \quad x_5. \tag{2}$$

This effectively solves the system to its unique satisfying assignment $x_1 = x_2 = x_3 = x_4 = 1$ and $x_5 = 0$.

Observe that ANF propagation after the XL step would have led to (2) without the need for either ElimLin or the SAT solver. Nevertheless, the above example illustrates that each can derive different learnt facts: XL gives the value assignment for $x_3$, ElimLin gives that for $x_1$, and the SAT solver learns the remaining assignments. To make full use of these different learnt facts, BOSPHORUS is designed to perform ANF propagation whenever learnt facts are produced after each step.

III. BOSPHORUS

This section details the workflow and the data structures of BOSPHORUS, and the approaches to convert between ANFs and CNFs. The source code is available at [Link] meelgroup/bosphorus.

Design, Automation And Test in Europe (DATE 2019) 469
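The XL example of Section II-B (the ANF $\{x_1 x_2 \oplus x_1 \oplus 1,\ x_2 x_3 \oplus x_3\}$ expanded with $D = 1$, as in Table I) can be reproduced with a small sketch. It assumes the same set-of-monomials representation as a convenience; the function names `times`, `gje`, `xl` and `is_fact` are hypothetical, and BOSPHORUS itself relies on PolyBoRi and M4RI for these steps rather than on code like this.

```python
from itertools import combinations

ONE = frozenset()  # the constant monomial 1


def poly(*monomials):
    """Polynomial over GF(2) as a set of monomials; repeated monomials cancel."""
    p = set()
    for m in monomials:
        p ^= {frozenset(m)}
    return frozenset(p)


def times(p, m):
    """Multiply polynomial p by monomial m, using x*x = x."""
    out = set()
    for t in p:
        out ^= {t | m}
    return frozenset(out)


def gje(rows, columns):
    """Gauss-Jordan elimination over GF(2); returns the nonzero reduced rows."""
    rows = [set(r) for r in rows]
    reduced = []
    for col in columns:
        pivot = next((r for r in rows if col in r), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        rows = [r ^ pivot if col in r else r for r in rows]
        reduced = [r ^ pivot if col in r else r for r in reduced]
        reduced.append(pivot)
    return [frozenset(r) for r in reduced]


def xl(system, variables, degree):
    """Expand by all monomials up to `degree`, linearize, and run GJE."""
    multipliers = [frozenset(c) for d in range(degree + 1)
                   for c in combinations(variables, d)]
    expanded = [times(p, m) for p in system for m in multipliers]
    # Highest-degree monomials come first, so linear rows are produced last.
    columns = sorted({t for p in expanded for t in p},
                     key=lambda t: (-len(t), sorted(t)))
    return expanded, gje(expanded, columns)


def is_fact(p):
    """Retained facts: linear polynomials, or a single monomial plus 1."""
    return all(len(t) <= 1 for t in p) or (len(p) == 2 and ONE in p)


system = [poly((1, 2), (1,), ()), poly((2, 3), (3,))]
expanded, reduced = xl(system, variables=[1, 2, 3], degree=1)
facts = [p for p in reduced if is_fact(p)]
```

With $m = 2$, $n = 3$ and $D = 1$, `expanded` holds $2 \cdot (1 + 3) = 8$ polynomials (one of them zero), and `facts` contains exactly $x_1 \oplus 1$, $x_2$ and $x_3$, the three facts named in the text. Ordering the columns by descending degree is a design choice of this sketch that makes low-degree consequences surface in the last rows.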


TABLE I: An example of applying eXtended Linearization (XL). Zero coefficients in the matrices are suppressed, and rows corresponding to zero polynomials are omitted. The last three rows of (b) are the facts that will be retained.

(a) Expansion by degree 1 monomials: expanded linearized system

| Polynomial | Multiplier | $x_1x_2x_3$ | $x_2x_3$ | $x_1x_3$ | $x_1x_2$ | $x_3$ | $x_2$ | $x_1$ | $1$ |
|---|---|---|---|---|---|---|---|---|---|
| $x_1x_2 \oplus x_1 \oplus 1$ | $1$ | | | | 1 | | | 1 | 1 |
| | $x_1$ | | | | 1 | | | | |
| | $x_2$ | | | | | | 1 | | |
| | $x_3$ | 1 | | 1 | | 1 | | | |
| $x_2x_3 \oplus x_3$ | $1$ | | 1 | | | 1 | | | |
| | $x_1$ | 1 | | 1 | | | | | |
| | $x_3$ | | 1 | | | 1 | | | |

(b) Gauss-Jordan Elimination: linearized system after GJE

| $x_1x_2x_3$ | $x_2x_3$ | $x_1x_3$ | $x_1x_2$ | $x_3$ | $x_2$ | $x_1$ | $1$ |
|---|---|---|---|---|---|---|---|
| 1 | | 1 | | | | | |
| | 1 | | | | | | |
| | | | 1 | | | | |
| | | | | 1 | | | |
| | | | | | 1 | | |
| | | | | | | 1 | 1 |

[Figure 1: flowchart with the nodes "Problem description", "Convert to ANF", "XL", "ElimLin", "SAT Solver", "Fixed Point" (yes/no) and "Processed ANF and CNF".]
Fig. 1: BOSPHORUS's flow. A dashed arrow means ANF propagation is applied.

A. Workflow

BOSPHORUS takes a problem encoded in ANF and produces a processed ANF and CNF after performing an XL–ElimLin–SAT-solver fact-learning loop until the fixed point when no further learnt facts are produced. ANF propagation is performed on the input ANF and whenever learnt facts are produced. Fig. 1 shows the overall workflow.

Internally within BOSPHORUS, the problem is represented as an ANF polynomial system, and only ANF propagation modifies and replaces this master copy. Each of the other techniques (XL, ElimLin and SAT solver) operates on a copy of the ANF, and learnt facts are extracted and then added onto the master copy if not already there.

If the equation $1 = 0$ is detected, BOSPHORUS terminates and returns UNSAT. If the SAT solver finds a satisfying solution, BOSPHORUS stores the solution. This solution is not used to simplify the ANF because it may not be unique.

B. Data structures

BOSPHORUS stores the system of equations in the ANF description as a list of Boolean polynomials. For each variable, we track (i) its value, as either 0, 1, or undetermined; (ii) its equivalence literal; and (iii) its occurrence list.

The default equivalence literal for each variable is the variable itself and may change as BOSPHORUS proceeds. For example, the equivalence literal of $x_i$ may be switched to $\neg x_j$ to encode $x_i = \neg x_j$.

The occurrence list is an optimization technique from the SAT literature [18], [21]. Here, BOSPHORUS tracks the list of polynomials that each variable occurs in. For example, updates to $x_1$ in (1) do not involve processing the last two equations. The time saved can be significant for large polynomial systems.

C. ANF to CNF conversion

CNF is used by the SAT solver within BOSPHORUS, and it is also an output. To convert ANF to CNF, we introduce an auxiliary CNF variable on-the-fly for each ANF monomial, and we maintain a bi-directional map for such variables.

BOSPHORUS handles determined variables, equivalences, and polynomials differently in the conversion. Determined variables are added as unit clauses, while an equivalence such as $x_i = \neg x_j$ is represented in CNF by $(x_i \lor x_j) \land (\neg x_i \lor \neg x_j)$.

A polynomial is first re-expressed as shorter ones by introducing auxiliary variables. The number of terms in the shorter polynomials is parameterized by an XOR-cutting length $L$. Then, each of these shorter polynomials is converted to CNF using either of the following two approaches:
1) If the polynomial involves at most $K$ variables, we use the Karnaugh map to yield the minimal clause representation while reducing the number of auxiliary variables used. Because computing the Karnaugh map scales exponentially with the number of variables, the Karnaugh parameter $K$ is kept low to ensure reasonable conversion time.
2) If the polynomial involves more than $K$ variables, we apply a transformation à la Tseitin encoding [22]. Each polynomial of length $l \le L$ is treated as an XOR clause of independent terms and converted to CNF clauses by enumerating through all possible $2^l$ terms.

Although the Karnaugh map approach is less flexible, it can yield a more compact conversion than the Tseitin-based approach. Consider the polynomial equation $x_1 x_3 \oplus x_1 \oplus x_2 \oplus x_4 \oplus 1 = 0$. Fig. 2 shows possible CNF representations via both approaches. Using the Karnaugh map shown in Fig. 3, one can derive a more compact CNF system that directly deals with the variables involved. In comparison, the Tseitin-based approach creates a new CNF variable $x_5$ and encodes $x_5 = x_1 x_3$ using three CNF clauses.

At present, any auxiliary variable introduced in the conversion process does not participate in the learnt facts.

D. CNF to ANF conversion

BOSPHORUS can be used as a CNF preprocessor, though its main use-case is that of solving problems represented in ANF.
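The two ANF-to-CNF encodings of $x_1 x_3 \oplus x_1 \oplus x_2 \oplus x_4 \oplus 1$ discussed in Section III-C (the 6-clause Karnaugh form and the 11-clause Tseitin-based form with auxiliary variable $x_5$, as listed in Fig. 2) can be checked by brute force. The sketch below is an independent sanity check written for this purpose, not part of BOSPHORUS; clauses are written as DIMACS-style signed integers, and the names `poly_is_zero` and `satisfies` are hypothetical.

```python
from itertools import product


def poly_is_zero(x1, x2, x3, x4):
    """True when x1*x3 + x1 + x2 + x4 + 1 evaluates to 0 over GF(2)."""
    return (x1 & x3) ^ x1 ^ x2 ^ x4 ^ 1 == 0


# Karnaugh map conversion: 6 clauses over x1..x4.
KARNAUGH = [[1, 2, 4], [-1, -2, 3, 4], [2, -3, 4],
            [-1, 2, 3, -4], [1, -2, -4], [-2, -3, -4]]

# Tseitin-based conversion: x5 = x1 AND x3 (3 clauses),
# then x1 + x2 + x4 + x5 = 1 as an XOR constraint (8 clauses).
TSEITIN = [[1, -5], [3, -5], [-1, -3, 5],
           [1, 2, 4, 5], [-1, -2, 4, 5], [-1, 2, -4, 5], [1, -2, -4, 5],
           [-1, 2, 4, -5], [1, -2, 4, -5], [1, 2, -4, -5], [-1, -2, -4, -5]]


def satisfies(clauses, bits):
    """bits[i - 1] is the value of x_i; a literal -i means NOT x_i."""
    return all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)


solutions = {b for b in product((0, 1), repeat=4) if poly_is_zero(*b)}
karnaugh_models = {b for b in product((0, 1), repeat=4)
                   if satisfies(KARNAUGH, b)}
# Project the Tseitin models onto x1..x4 by discarding the auxiliary x5.
tseitin_models = {b[:4] for b in product((0, 1), repeat=5)
                  if satisfies(TSEITIN, b)}
```

All three sets coincide, so both encodings describe the same solutions over $x_1, \ldots, x_4$; the Karnaugh form does so with fewer clauses and without the extra variable.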



| (Left) Karnaugh map conversion | (Right) Tseitin-based conversion |
|---|---|
| $x_1 \lor x_2 \lor x_4$ | $x_1 \lor \neg x_5$ |
| $\neg x_1 \lor \neg x_2 \lor x_3 \lor x_4$ | $x_3 \lor \neg x_5$ |
| $x_2 \lor \neg x_3 \lor x_4$ | $\neg x_1 \lor \neg x_3 \lor x_5$ |
| $\neg x_1 \lor x_2 \lor x_3 \lor \neg x_4$ | $x_1 \lor x_2 \lor x_4 \lor x_5$ |
| $x_1 \lor \neg x_2 \lor \neg x_4$ | $\neg x_1 \lor \neg x_2 \lor x_4 \lor x_5$ |
| $\neg x_2 \lor \neg x_3 \lor \neg x_4$ | $\neg x_1 \lor x_2 \lor \neg x_4 \lor x_5$ |
| | $x_1 \lor \neg x_2 \lor \neg x_4 \lor x_5$ |
| | $\neg x_1 \lor x_2 \lor x_4 \lor \neg x_5$ |
| | $x_1 \lor \neg x_2 \lor x_4 \lor \neg x_5$ |
| | $x_1 \lor x_2 \lor \neg x_4 \lor \neg x_5$ |
| | $\neg x_1 \lor \neg x_2 \lor \neg x_4 \lor \neg x_5$ |

Fig. 2: ANF-to-CNF conversions of polynomial $x_1 x_3 \oplus x_1 \oplus x_2 \oplus x_4 \oplus 1$. (Left) Karnaugh map conversion (6 CNF clauses); (Right) Tseitin-based conversion (11 CNF clauses).

| | $\bar{x}_3 \bar{x}_4$ | $\bar{x}_3 x_4$ | $x_3 x_4$ | $x_3 \bar{x}_4$ |
|---|---|---|---|---|
| $\bar{x}_1 \bar{x}_2$ | 1 | 0 | 0 | 1 |
| $\bar{x}_1 x_2$ | 0 | 1 | 1 | 0 |
| $x_1 x_2$ | 1 | 0 | 1 | 0 |
| $x_1 \bar{x}_2$ | 0 | 1 | 0 | 1 |

Fig. 3: Karnaugh map of polynomial $x_1 x_3 \oplus x_1 \oplus x_2 \oplus x_4 \oplus 1$.

When used as a CNF preprocessor, BOSPHORUS obtains an equivalent ANF in the following manner [23]:
1) Each CNF variable is assigned a unique ANF variable;
2) Each clause is converted to a polynomial via the product of its negated literals.

For instance, the polynomial for the clause $\neg x_1 \lor x_2$ is $(x_1)(x_2 \oplus 1) = x_1 x_2 \oplus x_1$. The resultant polynomial degree is the number of literals in each clause. More importantly, if a clause has $n$ positive literals, there will be $2^n$ terms in the polynomial. To prevent such cases, we re-express the clause as a set of shorter clauses by introducing auxiliary variables, à la converting a $k$-SAT to 3-SAT. We limit the number of positive literals within each of the shorter clauses to $L'$, called the clause-cutting length. Each of the shorter clauses is then converted to polynomials as outlined above.

This CNF-to-ANF conversion is trivial, unlike that in [24]; sophisticated techniques are then applied to simplify the problem on the ANF level. In this use-case, converting a problem from CNF to ANF and back to CNF gives a suboptimal description of the original problem. Hence, BOSPHORUS returns the original CNF in addition to the one converted from its internal ANF representation, which includes the learnt facts.

E. Implementation

BOSPHORUS uses the following existing work:
PolyBoRi [25]: To store and manipulate Boolean polynomials.
M4RI [26], [27]: For efficient Gauss-Jordan elimination on Boolean matrices, necessary for XL and ElimLin.
CryptoMiniSat5 [28]: This is a SAT solver equipped with conflict-driven clause learning. To extract learnt facts from this solver, we modify version 5.6.3 of the solver to expose its APIs that extract linear equations.
ESPRESSO [29]: For Karnaugh map simplification [30]. While ESPRESSO is a heuristic logic minimizer, it is fast and often yields close-to-optimum representations.

IV. EXPERIMENTS AND RESULTS

We run experiments on three classes of problems described in ANFs and a set of problems in CNFs. The ANF problems are round-reduced AES cipher, round-reduced SIMON cipher and weakened Bitcoin nonce finding, while the CNF problems consist of a wide variety from the SAT Competition 2017 [31]. These problems are detailed in the appendix. The experiments are conducted on a single Intel Xeon E5-2670v2 2.50GHz processor core. Each ANF or CNF is passed to BOSPHORUS, which, after learning facts using the XL–ElimLin–SAT-solver loop together with ANF propagation, will give a CNF that includes the learnt facts. A SAT solver is then used to solve the processed CNF. Note that the most efficient off-the-shelf ANF solver, M4GB, has such a high memory footprint that it times out on all the instances.

We also pass the instances to the SAT solvers directly without learning facts, only converting to CNFs using BOSPHORUS if needed. We evaluate with three different SAT solvers for the eventual solving: a minimalistic SAT solver MiniSat [16], a high-performance SAT solver Lingeling [32], and CryptoMiniSat5 [28], which natively performs Gauss-Jordan elimination (the versions used are 2.2, bcj-78ebb86-180517 and 5.6.3 respectively). We report the PAR-2 score [31] and the number of solved instances. The PAR-2 score is the sum of runtimes for solved instances and twice the timeout for unsolved instances, and a lower score is better.

For BOSPHORUS's workflow, we use the following parameters: XL and ElimLin subsampling parameter $M = 30$, XL expansion allowance $\delta M = 4$ and degree $D = 1$, Karnaugh parameter $K = 8$, cutting lengths $L = L' = 5$, and SAT-solver conflict budget starting from $C = 10{,}000$, increasing up to $100{,}000$ in increments of $10{,}000$ when the learnt clauses from the SAT solver produce no new learnt facts. Moreover, we make BOSPHORUS exit the loop and provide the solution if the SAT solver finds a satisfying assignment. We limit the total time used for each instance to 5,000 seconds, with BOSPHORUS given at most 1,000 seconds.

We only present results for selected benchmarks in Table II. The first column represents the class of benchmarks followed by the number of instances in parentheses. For each problem class, we have two rows of results: the first without using BOSPHORUS and the second with. The third, fourth and fifth columns specify the PAR-2 score (in thousands) for MiniSat, Lingeling, and CryptoMiniSat5 respectively. The PAR-2 score



for the case of using BOSPHORUS includes time taken by BOSPHORUS. The Simon, Bitcoin and SAT-2017 benchmark classes are listed in increasing difficulty.

TABLE II: The PAR-2 score is shown in thousands (lower is better) with, in parentheses, the number of solved satisfiable instances plus (if any) the number of solved unsatisfiable instances. For each problem set, there are two rows of results: the first without using BOSPHORUS (labeled w/o in the second column) and the second with (labeled w). The better of the two is in bold, with preference to the number of solved instances.

| Problem (instances) | | MiniSat | Lingeling | CryptoMiniSat5 |
|---|---|---|---|---|
| SR-[1,4,4,8] (500) | w/o | 4372 (89) | 532 (500) | **504 (500)** |
| | w | **1099 (489)** | **518 (500)** | 507 (500) |
| Simon-[8,6] (50) | w/o | **1 (50)** | **0 (50)** | **0 (50)** |
| | w | 3 (50) | 3 (50) | 3 (50) |
| Simon-[9,7] (50) | w/o | 324 (22) | **0 (50)** | **2 (50)** |
| | w | **15 (50)** | 14 (50) | 14 (50) |
| Simon-[10,8] (50) | w/o | 500 (0) | 31 (50) | 45 (50) |
| | w | **231 (34)** | **29 (50)** | **44 (50)** |
| Bitcoin-[10] (50) | w/o | **4 (50)** | **9 (50)** | **8 (50)** |
| | w | 23 (50) | 23 (50) | 24 (50) |
| Bitcoin-[15] (50) | w/o | **146 (43)** | **185 (39)** | 169 (40) |
| | w | 171 (42) | 220 (34) | **176 (41)** |
| Bitcoin-[20] (50) | w/o | 493 (1) | 475 (3) | 486 (2) |
| | w | **482 (2)** | **471 (4)** | **477 (3)** |
| SAT-2017 (310) | w/o | 2105 (75+38) | 2006 (70+56) | 1764 (89+63) |
| | w | **2153 (72+42)** | **2070 (70+57)** | **1674 (98+77)** |
| SAT-2017 (219) | w/o | 2045 (15+7) | 1738 (29+26) | 1689 (30+32) |
| | w | **1981 (18+11)** | **1756 (29+27)** | **1543 (40+46)** |

For the instances from SR-[1,4,4,8], BOSPHORUS allows significantly more instances to be solved by MiniSat, and it provides similar PAR-2 scores for Lingeling and CryptoMiniSat5 even while including its overhead. Similar observations can be made for the harder Simon instances, though the overhead of BOSPHORUS is now clearly visible in Simon-[9,7]. With Bitcoin, BOSPHORUS does not always help, but the effect of its overhead on the PAR-2 scores diminishes with the harder instances. One way to study when BOSPHORUS helps is to run it with different parameters. For the SAT-2017 CNF instances, BOSPHORUS does provide useful information to the solvers, especially for the UNSAT instances.

V. DISCUSSION

While BOSPHORUS can be used as a CNF preprocessor, it is in fact a flexible reasoning framework on Boolean or GF(2) variables in the following sense. First, for satisfiable problems, the SAT solver collapses onto one solution, while BOSPHORUS can continuously constrain the solution space without committing to one particular solution. Second, for unsatisfiable problems, the conclusion can be reached either by any of the ANF techniques giving $1 = 0$ or by the SAT solver giving UNSAT. Third, any of the solving techniques in the workflow can be improved with minimal impact on the other techniques because the retained facts do not increase the complexity of the equations. Fourth, it is relatively easy to include new solving techniques by plugging them as components into the workflow, for example, lookahead SAT solvers [33] and Buchberger's algorithm [3]. In fact, using Buchberger's algorithm as a preprocessor for SAT solving has previously been proposed [24], but, with BOSPHORUS, it may now be applied in an iterative manner together with other solving techniques.

To conclude, we have proposed and implemented a tool named BOSPHORUS that iteratively applies eXtended Linearization, ElimLin and conflict-bounded SAT solving together with ANF propagation in order to learn additional facts to augment the original problem. The experiments on selected ANF and SAT problems have shown that this approach can help solve more problems in a shorter time, particularly for the harder instances.

Acknowledgment: We thank Joshua Wong for running the M4GB experiments, Sze Ling Yeo for her Simon encoding, and Volodymyr Skladanivskyy for his SHA256 encoding. This work was supported in part by NUS ODPRT Grant R-252-000-685-133 and AI Singapore Grant R-252-000-A16-490.

REFERENCES

[1] NIST, "Advanced Encryption Standard (AES)," FIPS PUB 197, 2001.
[2] NIST, "Secure Hash Standard (SHS)," FIPS PUB 180-4, 2015.
[3] B. Buchberger, "Bruno Buchberger's PhD thesis 1965: An algorithm for finding the basis elements of the residue class ring of a zero dimensional polynomial ideal," Journal of Symbolic Computation, 2006.
[4] J.-C. Faugère, "A new efficient algorithm for computing Gröbner bases (F4)," Journal of Pure and Applied Algebra, 1999.
[5] J.-C. Faugère, "A new efficient algorithm for computing Gröbner bases without reduction to zero (F5)," in Proceedings of ISSAC, 2002.
[6] R. H. Makarim and M. Stevens, "M4GB: an efficient Gröbner-basis algorithm," in Proceedings of ISSAC, 2017.
[7] J.-C. Faugère, "FGb: a library for computing Gröbner bases," in Proceedings of ICMS, 2010.
[8] W. Bosma, J. Cannon, and C. Playoust, "The Magma algebra system I: The user language," Journal of Symbolic Computation, 1997.
[9] N. Courtois, A. Klimov, J. Patarin, and A. Shamir, "Efficient algorithms for solving overdefined systems of multivariate polynomial equations," in Proceedings of Eurocrypt, 2000.
[10] N. T. Courtois and J. Pieprzyk, "Cryptanalysis of block ciphers with overdefined systems of equations," in Proceedings of Asiacrypt, 2002.
[11] N. T. Courtois and G. V. Bard, "Algebraic cryptanalysis of the Data Encryption Standard," in IMA International Conference on Cryptography and Coding, 2007.
[12] N. T. Courtois, P. Sepehrdad, P. Sušil, and S. Vaudenay, "ElimLin algorithm revisited," in Proceedings of FSE, 2012.
[13] S. Gao, Y. Guan, and F. Volny IV, "A new incremental algorithm for computing Gröbner bases," in Proceedings of ISSAC, 2010.
[14] M. R. Prasad, A. Biere, and A. Gupta, "A survey of recent advances in SAT-based formal verification," Proceedings of STTT, 2005.
[15] J. P. M. Silva and K. A. Sakallah, "GRASP - a new search algorithm for satisfiability," in Proceedings of CAV, ser. ICCAD '96. IEEE Computer Society, 1996, pp. 220–227.
[16] N. Eén and N. Sörensson, "An extensible SAT-solver," in Proceedings of SAT, 2003.
[17] N. Eén and A. Biere, "Effective preprocessing in SAT through variable and clause elimination," in Proceedings of SAT, F. Bacchus and T. Walsh, Eds. Springer Berlin Heidelberg, 2005, pp. 61–75.
[18] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik, "Chaff: Engineering an efficient SAT solver," in Proceedings of DAC, 2001.
[19] K. S. Meel, M. Y. Vardi, S. Chakraborty, D. J. Fremont, S. A. Seshia, D. Fried, A. Ivrii, and S. Malik, "Constrained sampling and counting: Universal hashing meets SAT solving," in AAAI Workshop: Beyond NP, 2016.
[20] K. S. Meel, Constrained Counting and Sampling: Bridging the gap between Theory and Practice. Rice University, 2017, Ph.D. Thesis.



[21] H. Zhang, "SATO: An efficient propositional prover," in Proceedings of CADE, 1997.
[22] G. S. Tseitin, "On the complexity of derivation in propositional calculus," in Automation of Reasoning, 1983.
[23] J. Hsiang, "Refutational theorem proving using term-rewriting systems," Artificial Intelligence, 1985.
[24] C. Condrat and P. Kalla, "A Gröbner basis approach to CNF-formulae preprocessing," in Proc. of TACAS, 2007.
[25] M. Brickenstein and A. Dreyer, "PolyBoRi: A framework for Gröbner-basis computations with Boolean polynomials," Journal of Symbolic Computation, 2009.
[26] M. Albrecht and G. Bard, The M4RI Library – Version 20121224, The M4RI Team, 2012. [Online]. Available: [Link]
[27] M. Albrecht, G. Bard, and C. Pernet, "Efficient dense Gaussian elimination over the finite field with two elements," 2011, arXiv:1111.6549v1[[Link]].
[28] M. Soos, "The CryptoMiniSat 5 set of solvers at SAT competition 2016," in SAT Competition 2016 – Solver and Benchmark Descriptions, 2016.
[29] R. K. Brayton, G. D. Hachtel, C. McMullen, and A. Sangiovanni-Vincentelli, Logic minimization algorithms for VLSI synthesis, 1984.
[30] M. Karnaugh, "The map method for synthesis of combinational logic circuits," Transactions of the American Institute of Electrical Engineers, Part I: Communication and Electronics, 1953.
[31] T. Balyo, M. Heule, and M. Järvisalo, Eds., SAT Competition 2017 – Solver and Benchmark Descriptions. University of Helsinki, 2017, vol. B-2017-1.
[32] A. Biere, "CaDiCaL, Lingeling, Plingeling, Treengeling, YalSAT Entering the SAT Competition 2017," in SAT Competition 2017 – Solver and Benchmark Descriptions, 2017.
[33] A. Biere, M. Heule, H. van Maaren, and T. Walsh, Handbook of Satisfiability: Volume 185 Frontiers in Artificial Intelligence and Applications, 2009.
[34] C. Cid, S. Murphy, and M. J. Robshaw, "Small scale variants of the AES," in Proceedings of FSE, 2005.
[35] The Sage Developers, SageMath, the Sage Mathematics Software System (Version 8.1), 2017, [Link]
[36] R. Beaulieu, S. Treatman-Clark, D. Shors, B. Weeks, J. Smith, and L. Wingers, "The Simon and Speck lightweight block ciphers," in Proceedings of DAC, 2015.
[37] N. Courtois, T. Mourouzis, G. Song, P. Sepehrdad, and P. Susil, "Combined algebraic and truncated differential cryptanalysis on reduced-round Simon," in Proceedings of SECRYPT, 2014.
[38] S. Nakamoto, "Bitcoin: A peer-to-peer electronic cash system," Tech. Rep., 2008.

APPENDIX

A. Round-reduced AES cipher: 500 instances

We obtain a parameterized ANF encoding of AES [34] from SageMath [35]. Using parameters $(n, r, c, e) = (1, 4, 4, 8)$, we generate 500 ANF instances for 1-round AES. First, 500 random pairs of plaintext ($P$) and key ($K$) bits are generated and simulated to yield the corresponding ciphertext ($C$) bits. The resultant ANF has 800 variables and 1120 equations: 864 equations and 256 bit assignments from $(P, C)$.

B. Round-reduced Simon cipher: 50 instances per $(n, r)$

Simon [36] is a family of lightweight Feistel-based block ciphers. The round functions are described in conjunction and exclusive-OR of bits, allowing a straightforward ANF encoding; see Fig. 4. This set of benchmarks is round-reduced Simon32/64 with multiple plaintext-ciphertext pairs encoded under the same randomly generated secret key.

[Figure 4: one round with inputs $x_{i+1}$ and $x_i$, rotations $S^1$, $S^8$ and $S^2$, one AND gate (&), XOR gates ($\oplus$), round key $k_i$, and outputs $x_{i+2}$ and $x_{i+1}$.]
Fig. 4: One Feistel round of Simon cipher. Diagram from [36].

Simon32/64 takes a 32-bit plaintext ($P$) and a 64-bit key to return a 32-bit ciphertext. For each instance, we generate $n \le 17$ plaintexts with low Hamming distance as per the Similar Plaintexts/Random Ciphertexts (SP/RC) setting in [37]. Concretely, the first plaintext $P_1$ is uniformly sampled while we toggle the $i$th bit in the right half of $P_1$, for $i \in \{2, \ldots, n\}$. This set of problems is parameterized by $(n, r)$, where $n$ is the number of plaintexts, and $r$ is the number of rounds.

C. Cryptographic hash functions: 50 instances per $k$

Recently, cryptographically secure hash functions have been used to serve as proof-of-work in blockchains and cryptocurrencies, of which Bitcoin is an example. Bitcoin [38] uses SHA256, a hash function in the SHA-2 hash family [2].

We consider a weakened version of the Bitcoin block hashing algorithm. Let $M$ be a 512-bit input message, and $H$ be a 256-bit hash output. We randomly set the first 415 bits of $M$, allow the next 32-bit nonce to be free (but to be determined), and pad according to SHA padding (add '1', then encode $|M| = 448$ in the next 64 bits). Given $k$, the challenge is then to solve for a suitable 32-bit nonce of $M$ that results in a hash $H$ with the first $k$ bits being 0. We construct challenges in this manner because Bitcoin uses 32-bit nonces to solve for hashes starting with varying $k$ zeroes. See Fig. 5 for an illustration. We generate instances for $k \in \{10, 15, 20\}$ using the generic ANF encoding available at [Link]

[Figure 5: a 512-bit block split into 448 bits (message $M$: 415 randomly fixed bits, a 32-bit nonce, and a '1' bit) followed by 64 bits ($|M|$, the size of $M$ in binary).]
Fig. 5: Our nonce-finding setup.

D. Instances from SAT 2017 Competition

We preprocess g2-hwmcc15deep-beemfwt4b1-k48 and g2-hwmcc15deep-beemlifts3b1-k29 using CryptoMiniSat5 to reduce the number of variables to less than 1,048,574, which is the maximum number of variables that the PolyBoRi data structure can handle on our platforms. We omit the 40 CNFs with names of the pattern g2-T∗ because they each have too many variables even after the preprocessing. We also omit mp1-bsat222-777 because it is not a well-formed DIMACS file. Hence, we experiment on 310 instances altogether. From these, we select difficult instances: using the runtime of MiniSat (without BOSPHORUS) as a proxy difficulty measure, we select the 219 that require more than 2,500 seconds.
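When BOSPHORUS is used as a CNF preprocessor on instances such as these, each clause is turned into a polynomial by multiplying its negated literals, as described in Section III-D. The following sketch illustrates only that single step under simplifying assumptions: clauses are DIMACS-style lists of signed integers, a polynomial is a set of monomials, and the function name `clause_to_poly` is hypothetical. It does not implement the clause cutting with auxiliary variables that keeps the number of positive literals per clause bounded.

```python
ONE = frozenset()  # the constant monomial 1


def clause_to_poly(clause):
    """Polynomial over GF(2) that equals 0 exactly when the clause holds."""
    p = {ONE}  # start from the constant polynomial 1
    for lit in clause:
        x = frozenset({abs(lit)})
        # Negation of the literal: NOT(x) is x + 1, NOT(NOT x) is x.
        factor = {x, ONE} if lit > 0 else {x}
        out = set()
        for a in p:
            for b in factor:
                out ^= {a | b}  # multiply monomials, using x*x = x
        p = out
    return frozenset(p)


# The clause (NOT x1 OR x2) gives x1 * (x2 + 1) = x1x2 + x1.
example = clause_to_poly([-1, 2])

# A clause with n positive literals expands to 2**n monomials.
wide = clause_to_poly([1, 2, 3])
```

For the three-literal clause the result already has $2^3 = 8$ monomials, which illustrates why the number of positive literals per clause is limited by the clause-cutting length before conversion.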
