Heuristic Optimisation in Design and Analysis

Two-Stage Optimisation in the Design of Boolean Functions

John A Clark and Jeremy L Jacob

Dept. of Computer Science

University of York, UK

jac@cs.york.ac.uk

jeremy@cs.york.ac.uk

Overview

Optimisation

general introduction

hill-climbing

simulated annealing.

Boolean function design (reprise)

Experimental approach and results.

Conclusions and future work.

Optimisation

Subject of huge practical importance. An optimisation problem may be stated as follows:

Find the value x that maximises the function z(y) over D.

Given a domain D and a function z: D → ℝ

find x in D such that

z(x)=sup{z(y): y in D}

Optimisation

Traditional optimisation techniques include:

calculus (e.g. solve differential equations for extrema)

f(x) = -3x^2 + 6x: solve f'(x) = -6x + 6 = 0 to obtain x = 1 with maximum f(x) = 3

hill-climbing: inspired by notion of calculus

gradient ascent etc.

(quasi-) enumerative:

brute force (a crypto-favourite)

linear programming

branch and bound

dynamic programming

Optimisation Problems

Traditional techniques not without their problems

assumptions may simply not hold

e.g. non-differentiable discontinuous functions

non-linear functions

problem may suffer from the ‘curse (joy?) of dimensionality’ - the problem is simply too big to handle exactly (e.g. by brute force or dynamic programming). NP-hard problems.

Some techniques may tend to get stuck in local optima for non-linear problems (see later)

These difficulties have led researchers to investigate heuristic techniques, typically inspired by natural processes, that usually give good solutions to optimisation problems (but forgo guarantees).

Heuristic Optimisation

A variety of techniques have been developed to deal with non-linear and discontinuous problems

highest profile one is probably genetic algorithms

works with a population of solutions and breeds new solutions by aping the processes of natural reproduction

Darwinian survival of the fittest

proven very robust across a huge range of problems

can be very efficient

Simulated annealing - a local search technique based on cooling processes of molten metals (used in this paper)

Will illustrate problems with non-linearity and then describe simulated annealing.

Local Optimisation - Hill Climbing

Let the current solution be x.

Define the neighbourhood N(x) to be the set of solutions that are ‘close’ to x

If possible, move to a neighbouring solution that improves the value of z(x), otherwise stop.

Choose any y as next solution provided z(y) >= z(x)

loose hill-climbing

Choose y as next solution such that z(y)=sup{z(v): v in N(x)}

steepest gradient ascent

Local Optimisation - Hill Climbing

[Figure: plot of z(x) against x, with a local optimum at x2 and the global optimum at xopt.]

Neighbourhood of a point x might be N(x) = {x+1, x-1}. Hill-climb goes x0 → x1 → x2 since z(x0) < z(x1) < z(x2) > z(x3) and gets stuck at x2 (local optimum).

Really want to obtain xopt.
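The stuck-at-a-local-optimum behaviour can be sketched in a few lines of Python. This is only an illustration: the `heights` landscape is made up, and the sketch accepts strictly improving moves only (the loose rule above also allows z(y) = z(x), which can cycle on plateaus).

```python
def hill_climb(z, x, neighbours, steepest=False):
    """Climb from x until no neighbour strictly improves z; return the path taken."""
    path = [x]
    while True:
        better = [y for y in neighbours(x) if z(y) > z(x)]
        if not better:
            return path  # local optimum: nothing nearby is strictly better
        # steepest ascent takes the best neighbour, loose climbing takes the first
        x = max(better, key=z) if steepest else better[0]
        path.append(x)


# Hypothetical landscape: local peak at x=2 (height 3), global peak at x=7 (height 9).
heights = [0, 1, 3, 2, 1, 4, 7, 9, 5]


def z(x):
    return heights[x]


def neighbours(x):
    # N(x) = {x+1, x-1}, restricted to the valid domain
    return [y for y in (x + 1, x - 1) if 0 <= y < len(heights)]


print(hill_climb(z, 0, neighbours))        # [0, 1, 2]: stuck at the local optimum
print(hill_climb(z, 4, neighbours, True))  # [4, 5, 6, 7]: this start reaches the global one
```

The outcome depends entirely on the starting point, which is the weakness simulated annealing tries to address.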

Simulated Annealing

[Figure: plot of z(x) against x, with points x10, x11, x12 and x13 marked along the search path.]

Allows non-improving moves so that it is possible to go down in order to rise again to reach the global optimum.

Simulated Annealing

Allows non-improving moves to be taken in the hope of escaping from a local optimum.

Previous slide gives idea. In practice the size of the neighbourhood may be very large and a candidate neighbour is typically selected at random.

Quite possible to accept a worsening move when an improving move exists.

Simulated Annealing

Improving moves always accepted

Non-improving moves may be accepted probabilistically and in a manner depending on the temperature parameter Temp. Loosely

the worse the move the less likely it is to be accepted

a worsening move is less likely to be accepted the cooler the temperature

The temperature Temp starts high and is gradually cooled as the search progresses.

Initially virtually anything is accepted, at the end only improving moves are allowed (and the search effectively reduces to hill-climbing)
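One common way to realise these two rules is the Metropolis criterion, which accepts a worsening move with probability exp(delta / Temp), where delta < 0 is the change in z. The slides do not say which acceptance rule was used, so the sketch below is an assumption, and the function name is illustrative.

```python
import math


def acceptance_probability(delta, temp):
    """Probability of accepting a move that changes z by delta (we are maximising z)."""
    if delta >= 0:
        return 1.0  # improving (or equal) moves are always accepted
    return math.exp(delta / temp)  # delta < 0 and temp > 0, so this lies in (0, 1)


# Worse moves (more negative delta) and cooler temperatures both lower the probability.
for temp in (10.0, 1.0, 0.1):
    probs = [round(acceptance_probability(d, temp), 3) for d in (-1, -5)]
    print(temp, probs)
```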

Simulated Annealing

[Flowchart: the temperature cycle.]

Current candidate x.

At each temperature consider 400 moves.

Always accept improving moves.

Accept worsening moves probabilistically.

Gets harder to do this the worse the move.

Gets harder as Temp decreases.
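The temperature cycle might be sketched as follows. Only the 400 moves per temperature comes from the slide; the Metropolis acceptance rule, the geometric cooling factor `alpha`, the start and stop temperatures, and the toy landscape are all assumptions made for illustration.

```python
import math
import random


def anneal(z, x, random_neighbour, temp=10.0, alpha=0.9,
           moves_per_temp=400, min_temp=0.01, rng=random):
    """Maximise z by simulated annealing; return the best solution seen."""
    best = x
    while temp > min_temp:
        for _ in range(moves_per_temp):
            y = random_neighbour(x, rng)
            delta = z(y) - z(x)
            # improving moves always pass; worsening ones pass with prob. exp(delta/temp)
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                x = y
                if z(x) > z(best):
                    best = x
        temp *= alpha  # geometric cooling (an assumed schedule)
    return best


# Hypothetical landscape: local peak at x=2, global peak at x=7.
heights = [0, 1, 3, 2, 1, 4, 7, 9, 5]


def z(x):
    return heights[x]


def random_neighbour(x, rng):
    y = x + rng.choice((-1, 1))
    return min(max(y, 0), len(heights) - 1)  # clamp to the domain


result = anneal(z, 0, random_neighbour, rng=random.Random(1))
print(result, z(result))
```

Started from x = 0, plain hill-climbing on this landscape stops at the local peak, whereas the early high-temperature phase lets the annealer wander across the dip.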

Crypto and Heuristic Optimisation

Most work on cryptanalysis attacking variety of simple ciphers - simple substitution and transposition through poly-alphabetic ciphers etc.

more recent work in attacking NP hard problems

But perhaps most successful work has been in design of cryptographic elements.

Most work is rather direct in its application.

Boolean Function Design

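A Boolean function $f: \{0,1\}^n \to \{0,1\}$

For present purposes we shall use the polar representation $\hat{f}(x) = (-1)^{f(x)}$

[Table: example values of f(x) in polar form, each entry 1 or -1.]

Will talk only about balanced functions where there are equal numbers of 1s and -1s.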

Preliminary Definitions

Definitions relating to a Boolean function f of n variables

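Walsh Hadamard transform: $\hat{F}(\omega) = \sum_{x} \hat{f}(x)\,\hat{L}_{\omega}(x)$, where $\hat{f}(x) = (-1)^{f(x)}$

Linear function: $L_{\omega}(x) = \omega_1 x_1 \oplus \dots \oplus \omega_n x_n$

$\hat{L}_{\omega}(x) = (-1)^{L_{\omega}(x)}$ (polar form)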
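A direct (quadratic-time) evaluation of the Walsh Hadamard transform can be sketched as below. It assumes the truth table is stored as a Python list of +1/-1 values indexed by x read as an integer, so that L_ω(x) is the parity of the bitwise AND of ω and x; a fast transform would be used in practice, but this form follows the definition term by term.

```python
def walsh_hadamard(polar):
    """Return [F(w) for w in range(2**n)], where F(w) = sum over x of f^(x) * L^_w(x)."""
    size = len(polar)  # 2**n truth-table entries, each +1 or -1
    spectrum = []
    for w in range(size):
        total = 0
        for x in range(size):
            # L_w(x) = w1x1 xor ... xor wnxn is the parity of the bits of (w AND x)
            parity = bin(w & x).count("1") % 2
            total += polar[x] * (-1) ** parity
        spectrum.append(total)
    return spectrum


# Example: f(x1, x2) = x1 AND x2 has truth table 0,0,0,1, i.e. polar form 1,1,1,-1.
print(walsh_hadamard([1, 1, 1, -1]))  # [2, 2, 2, -2]
```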

Preliminary Definitions

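Non-linearity: $NL_f = \frac{1}{2}\left(2^n - \max_{\omega} |\hat{F}(\omega)|\right)$

Auto-correlation: $AC_f = \max_{s \neq 0} \left| \sum_{x} \hat{f}(x)\,\hat{f}(x \oplus s) \right|$, with $\hat{f}$ the polar form of f

For present purposes we need simply note that these can be easily evaluated given a function f. They can therefore be used as the functions to be optimised. Traditionally they are.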
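Both measures are indeed cheap to evaluate for small n. The sketch below assumes the standard definitions, non-linearity as (2^n - max |F(ω)|) / 2 and autocorrelation as the largest |Σ f^(x) f^(x ⊕ s)| over non-zero s, with the truth table held as a list of +1/-1 values; the function names are illustrative.

```python
def walsh_hadamard(polar):
    """Walsh Hadamard spectrum of a polar (+1/-1) truth table, straight from the definition."""
    size = len(polar)
    return [sum(polar[x] * (-1) ** (bin(w & x).count("1") % 2) for x in range(size))
            for w in range(size)]


def nonlinearity(polar):
    """(2**n - max |F(w)|) / 2; always an integer because every F(w) is even."""
    return (len(polar) - max(abs(v) for v in walsh_hadamard(polar))) // 2


def autocorrelation(polar):
    """Largest absolute autocorrelation over all non-zero shifts s."""
    size = len(polar)
    return max(abs(sum(polar[x] * polar[x ^ s] for x in range(size)))
               for s in range(1, size))


and_function = [1, 1, 1, -1]   # x1 AND x2 in polar form
xor_function = [1, -1, -1, 1]  # x1 XOR x2 in polar form (a linear function)
print(nonlinearity(and_function), autocorrelation(and_function))  # 1 0
print(nonlinearity(xor_function), autocorrelation(xor_function))  # 0 4
```

High non-linearity and low autocorrelation are the desirable directions, so the linear function scores worst on both.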

Using Parseval’s Theorem

Parseval’s Theorem: $\sum_{\omega} \hat{F}(\omega)^2 = 2^{2n}$

Loosely, push down on $\hat{F}(\omega)^2$ for some particular $\omega$ and it appears elsewhere.

Suggests that arranging for uniform values of $\hat{F}(\omega)^2$ will lead to good non-linearity. This is the initial motivation for our new cost function.
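Parseval's Theorem is easy to check numerically. The sketch below assumes the usual statement, that the squared Walsh Hadamard values of any function of n variables sum to 2^(2n), and uses an arbitrary random function with n = 4.

```python
import random


def walsh_hadamard(polar):
    """Walsh Hadamard spectrum of a polar (+1/-1) truth table, straight from the definition."""
    size = len(polar)
    return [sum(polar[x] * (-1) ** (bin(w & x).count("1") % 2) for x in range(size))
            for w in range(size)]


n = 4
rng = random.Random(0)
polar = [rng.choice((1, -1)) for _ in range(2 ** n)]  # arbitrary, not necessarily balanced
spectrum = walsh_hadamard(polar)

# The total "energy" is fixed, so lowering one |F(w)| must raise another.
print(sum(v * v for v in spectrum), 2 ** (2 * n))  # both are 256
# If the energy were spread perfectly evenly, every |F(w)| would equal 2**(n/2).
print(2 ** (n / 2))
```

Since the total is fixed at 2^(2n) over 2^n values of ω, a perfectly uniform spectrum would have |F(ω)| = 2^(n/2) everywhere, which is where the 2^(n/2) target in a spectrum-flattening cost comes from.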

NEW FUNCTION!

Moves Preserving Balance

Start with balanced (but otherwise random) solution. Move strategy preserves balance.

Neighbourhood of a particular function f to be the set of all functions obtained by exchanging (flipping) any two dissimilar values.

[Table: polar values of f(x) and of a neighbour g(x). Here we have swapped f(2) and f(4).]
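The balance-preserving move can be sketched as follows, assuming the function is held as a list of +1/-1 values. The helper names are illustrative, not taken from the slides.

```python
import random


def random_balanced(n, rng):
    """Random polar truth table on n variables with equal numbers of +1 and -1."""
    size = 2 ** n
    polar = [1] * (size // 2) + [-1] * (size // 2)
    rng.shuffle(polar)
    return polar


def swap_move(polar, rng):
    """Return a neighbour of polar: exchange one +1 entry with one -1 entry."""
    i = rng.choice([k for k, v in enumerate(polar) if v == 1])
    j = rng.choice([k for k, v in enumerate(polar) if v == -1])
    neighbour = list(polar)  # copy, so the current solution is left untouched
    neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
    return neighbour


rng = random.Random(0)
f = random_balanced(3, rng)
g = swap_move(f, rng)
print(f, sum(f))  # entries sum to 0, so f is balanced
print(g, sum(g))  # still balanced; g differs from f in exactly two positions
```

Because every move swaps a 1 with a -1, the search never leaves the set of balanced functions and no balance penalty is needed in the cost function.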

Getting in the Right Area

Previous work (QUT) has shown strongly

Heuristic techniques can be very effective for cryptographic design synthesis

Boolean function, S-box design etc

Hill-climbing works far better than random search

Combining heuristic search and hill-climbing generally gives best results

Aside - notion applies more generally too - has led to development of memetic algorithms in GA work.

GAs known to be robust but not suited for ‘fine tuning’.

We will adopt this strategy too: use simulated annealing to get in the ‘right area’ then hill-climb.

But we will adopt the new cost function for the first stage.
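A toy version of the two-stage strategy is sketched below for n = 4. It is not the authors' implementation: the annealing schedule, the exponent R = 3, the Metropolis acceptance rule, and the use of non-linearity alone as the 'traditional' cost function in stage two are all assumptions chosen to keep the example short.

```python
import math
import random


def walsh_hadamard(polar):
    size = len(polar)
    return [sum(polar[x] * (-1) ** (bin(w & x).count("1") % 2) for x in range(size))
            for w in range(size)]


def nonlinearity(polar):
    return (len(polar) - max(abs(v) for v in walsh_hadamard(polar))) // 2


def new_cost(polar, K=0.0, R=3.0):
    """Spectrum-flattening cost: penalise |F(w)| for straying from 2**(n/2) + K."""
    n = (len(polar) - 1).bit_length()
    target = 2 ** (n / 2) + K
    return sum(abs(abs(v) - target) ** R for v in walsh_hadamard(polar))


def swap(polar, i, j):
    neighbour = list(polar)
    neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
    return neighbour


def anneal(polar, rng, temp=50.0, alpha=0.9, moves=50, min_temp=0.5):
    """Stage 1: simulated annealing that minimises new_cost."""
    current, current_cost = polar, new_cost(polar)
    while temp > min_temp:
        for _ in range(moves):
            i = rng.choice([k for k, v in enumerate(current) if v == 1])
            j = rng.choice([k for k, v in enumerate(current) if v == -1])
            candidate = swap(current, i, j)
            candidate_cost = new_cost(candidate)
            delta = candidate_cost - current_cost  # positive means worse
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, candidate_cost
        temp *= alpha  # geometric cooling (assumed)
    return current


def hill_climb(polar):
    """Stage 2: steepest ascent on non-linearity over all balance-preserving swaps."""
    current = polar
    while True:
        ones = [k for k, v in enumerate(current) if v == 1]
        minus_ones = [k for k, v in enumerate(current) if v == -1]
        best = max((swap(current, i, j) for i in ones for j in minus_ones),
                   key=nonlinearity)
        if nonlinearity(best) <= nonlinearity(current):
            return current  # no swap improves non-linearity
        current = best


rng = random.Random(0)
n = 4
start = [1] * 2 ** (n - 1) + [-1] * 2 ** (n - 1)  # balanced starting point
rng.shuffle(start)
annealed = anneal(start, rng)
result = hill_climb(annealed)
print(nonlinearity(start), nonlinearity(annealed), nonlinearity(result))
```

The point of the structure is that stage one never looks at non-linearity at all; it only shapes the spectrum, and the direct measure is used just for the final polish.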

Hill-climbing With Traditional CF (n=8)

Varying the Technique (n=8)

[Figure: non-linearity and autocorrelation results for three techniques: simulated annealing with traditional CF; simulated annealing with new CF; simulated annealing with new CF followed by hill-climbing with traditional CF.]

Tuning the Technique

Experience has shown that experimentation is par for the course with optimisation.

Initial cost function motivated by theory but the real issue is how the cost function and search technique interact.

Have generalised the initial cost function to give a parametrised family of new cost functions

$\mathrm{Cost}(f) = \sum_{\omega} \left|\, |\hat{F}(\omega)| - (2^{n/2} + K) \,\right|^{R}$
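The parametrised family can be evaluated as in the sketch below, reading the formula as a sum over all ω of | |F(ω)| - (2^(n/2) + K) |^R. The default values of K and R are placeholders, not the values used in the experiments.

```python
def walsh_hadamard(polar):
    """Walsh Hadamard spectrum of a polar (+1/-1) truth table, straight from the definition."""
    size = len(polar)
    return [sum(polar[x] * (-1) ** (bin(w & x).count("1") % 2) for x in range(size))
            for w in range(size)]


def cost(polar, K=0.0, R=3.0):
    """Sum over w of | |F(w)| - (2**(n/2) + K) | ** R; lower is better."""
    n = (len(polar) - 1).bit_length()  # len(polar) == 2**n
    target = 2 ** (n / 2) + K
    return sum(abs(abs(v) - target) ** R for v in walsh_hadamard(polar))


flat = [1, 1, 1, -1]     # x1 AND x2: every |F(w)| equals 2 = 2**(n/2)
linear = [1, -1, -1, 1]  # x1 XOR x2: spectrum concentrated on a single w
print(cost(flat), cost(linear))  # 0.0 32.0
print(cost(flat, K=-1.0), cost(linear, K=-1.0))  # changing K shifts the target value
```

K moves the value that the search tries to pull every |F(ω)| towards, while R controls how heavily large deviations are punished relative to small ones.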

Tuning the Technique (n=8)

[Figure: non-linearity and autocorrelation results, illustrating how results change as K is varied. 400 runs.]

Tuning the Technique (n=8)

[Figure: non-linearity and autocorrelation results, further illustrating how results change as K is varied. 100 runs.]

Comparison of Results

Summary and Conclusions

Have shown that local search can be used effectively for a cryptographic non-linear optimisation problem - Boolean Function Design.

‘Direct’ cost functions not necessarily best.

Cost function is a means to an end.

Whatever works will do.

Cost function efficacy depends on problem, problem parameters, and the search technique used.

You can take short cuts with annealing parameters (and computationally there may be little choice)

Experimentation is highly beneficial

should look to engaging theory more?

Future Work

Opportunities for expansion:

detailed variation of parameters

use of more efficient annealing processes (e.g. thermostatistical annealing).

evolution of artefacts with hidden properties (you do not need to be honest - e.g. develop S-Boxes with hidden trapdoors)

experiment with different cost function families

multiple criteria etc.

evolve sets of Boolean functions

other local techniques (e.g. tabu search, TS)

more generally, when do GAs, SA, TS work best?

investigate non-balanced functions.