Resilient Data Structures

Francesco Silvestri

Department of Information Engineering

University of Padova
silvest1@[Link]

Workshop on Recent Advances in Data Structures

December 17-20th, 2011

The origins

Computing with unreliable information / faulty components dates back to

the 50s

Von Neumann, Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components, 1956

F. Silvestri (UniPD) Resilient Data Structures DS Meet 2 / 78

Which components?

Processors

S.M. Ulam, Adventures of a Mathematician 1977


Which components?

Network nodes/links

Andrew C. Yao and F. Frances Yao, On

Fault-Tolerant Networks for Sorting 1985


Which components?

Memories

Memory fault
One or more bits read differently from how they were last written

Due to:
transient electronic noise: electrical or magnetic interference, e.g., cosmic rays
hardware problems: e.g., permanently damaged bit
corruption in data path between memories and processing units


This talk

Introduction to memory faults

The faulty RAM model

Some resilient data structures:

Resilient dictionary

Resilient priority queue

Resiliency & cache-obliviousness

Open problems


Impact of memory errors: machine crashes

Machine crashes


Impact of memory errors: security

Security vulnerabilities
Breaking cryptographic protocols
[Blömer and Seifert, 2003]
Taking control over Java Virtual
Machine
[Govindavajhala and Appel, 2003]
Breaking smart cards
[Skorobogatov and Anderson, 2003]


Impact of memory errors: unpredictable output

Unpredictable output: an example...

MERGE(⟨1, 2, 3⟩, ⟨4, 5, 6⟩)

⇓ memory fault: 1 → 17

MERGE(⟨17, 2, 3⟩, ⟨4, 5, 6⟩)

⇓

⟨4, 5, 6, 17, 2, 3⟩
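
The effect can be reproduced with a textbook two-way merge. The sketch below is only an illustration: the function name `merge` and the way the fault is injected (by passing an already corrupted list) are assumptions, not part of the talk.

```python
def merge(a, b):
    """Standard two-way merge; it is only correct if both inputs are sorted."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])  # leftover of the first sequence
    out.extend(b[j:])  # leftover of the second sequence
    return out

clean = merge([1, 2, 3], [4, 5, 6])
# Simulated memory fault: the first key of the first sequence reads as 17.
faulty = merge([17, 2, 3], [4, 5, 6])
print(clean)
print(faulty)
```

A single corrupted key blocks the whole first sequence behind it, so the uncorrupted keys 2 and 3 also end up out of order.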


How common are memory errors?


A field study

In a field study by Google researchers [Schroeder et al., 2011]

Observed mean fault rates much higher than in laboratory conditions
25,000-70,000 faults per billion device hours per Mb
> 8% of DIMMs affected by faults per year

Small cluster of computers with few GB per node

one bit fault every few minutes

As memory size becomes larger, mean time between failures decreases


How to fight corruption?


Hardware vs software solutions

Hardware solution: error correcting codes (ECC)

$$$$$: large manufacturing and power costs
not always available
do not guarantee complete fault coverage: number of bit faults may
exceed ECC limit
Software solution: robustification
Redesign algorithms
Rewrite software
When faults occur: possibly longer execution, but space/time penalties
not too large


Some models of faulty memories

Liar model [Rényi, 1994, Ulam, 1977, Pelc, 2002]

two person game: how many comparison questions to find a number in
[1, 100] if the adversary can lie once or twice?
faults on operations, not on data
Sorting networks [Yao and Yao, 1985, Leighton and Ma, 1999]
Some comparison nodes may be faulty
Fault-tolerant pointer-based data structures
[Aumann and Bender, 1996]
Losing a single pointer can make an entire data structure unreachable
Error-correcting data structures [de Wolf, 2009]
Exploit ECCs to obtain space-time trade-offs
Checking model [Blum et al., 1991]
Can we design (on/off-line) checkers to report buggy behavior of data
structures using only a small (logarithmic) amount of reliable memory?


The Liar Model

Liar model: comparison questions answered by a

possibly lying adversary [Ulam, 1977, Rényi, 1994]

Different variants:
Types of question: comparison, subset inclusion,. . .
Types of lie: fixed number, probabilistic,. . .
Degree of interactivity between players
Sorting and searching are well studied. E.g., sorting with k lies:
Ω (n log n + kn) [Lakshmanan et al., 1991]
O (n log n) for k = O (log n/ log log n) [Ravikumar, 2002]
Lies ⇒ Transient failures ⇒ Algorithms can exploit query replication
strategies

Models faults on operations, not on data


Parallel Computing With Memory Faults

Processor/memory faults in parallel

settings
[Chlebus et al., 1994, Indyk, 1996]

Used models: PRAM / distributed memory machine

Static/dynamic deterministic/random faults
With fault-detection registers or limited adversary power

Simulation of fully operational models on faulty

models. Limited adversary.


Fault-Tolerant Sorting Networks

Fault-Tolerant Sorting Networks:

Comparators can be faulty and destroy one
of the input values [Yao and Yao, 1985]

With probabilistic faults [Assaf and Upfal, 1991]

O(n log² n) nodes, depth O(log n)

Tight bound [Leighton and Ma, 1999]

Θ (log n) copies of each item
Uses fault-free replicators

Redundancy should be reduced.

Model faults on operations.


Pointer-Based Data Structures

Pointer-based data structures highly

non-resilient

Resilient pointer-based data structures [Aumann and Bender, 1996]

Faults are detected by the system
Resilient stacks, linked list, binary search tree
Based on connectivity property of the butterfly (i.e., FFT DAG)

A limited amount of uncorrupted data may be

lost upon the occurrence of a fault


Error-correcting data structures

Use ECC for restoring faults

[de Wolf, 2009]

Provide data structures for equality, membership, substring, inner

product
Trade-off number of probes (time) and space

Assume no safe memory. Only ECC.


Checkers

Design a checker that is able to detect errors in the behavior of a data structure
[Blum et al., 1991]

Detects faults; in case of faults, the computation cannot be restored

On-line checker: immediately after an operation
Off-line checker: at the end of a sequence
Checkers for stacks, queues and RAMs

Computation cannot be restored after a fault


We would like. . .

We would like to design algorithms and data structures

Resilient to δ faults
Resilient to faults inserted by a powerful adversary
Resilient to faults not recognizable by the system
May exploit O (1) safe memory

Provide (partial) correct solution even with faults


Which kind of solution?

Do we require the solution to be correct even

with faults?

Too much! We relax this requirement, since otherwise δ-replication is required

We require correctness (at least) on uncorrupted data

Examples:
Sorting. Sort correctly uncorrupted data
Search: Is x in a set S? yes if there is an uncorrupted copy of x in S,
no if there are no uncorrupted values equal to x.


The faulty RAM

The Faulty RAM Model

[Finocchi and Italiano, 2004,
Finocchi and Italiano, 2008]

Memory fault: the correct value stored in a memory location is altered (destructive faults)
Adversary with unbounded computational power: can corrupt up to δ words
Faults can appear at any time, at any memory location, and simultaneously


Corrupted values indistinguishable from correct ones


The faulty RAM (2)

O (1) words of safe memory

cannot be corrupted by the adversary
can be read by the adversary

O (1) words of private memory

cannot be corrupted by the adversary
cannot be read by the adversary
useful for storing random bits

α: actual number of faults (α ≤ δ)


Why not data replication?

Data replication
Data replication can be quite inefficient in certain highly dynamic
scenarios, especially if objects to be replicated are large and complex

What can we do without (or limited) data replication?

E.g., with respect to sorting:

Q1 Can we sort the correct values in the presence of, e.g.,
polynomially many memory faults?
Q2 How many faults can we tolerate in the worst case if we wish
to maintain optimal time and space?
δ-resilient variable x
Write 2δ + 1 copies
Read by majority in O (1) safe memory: cannot be corrupted!
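
A δ-resilient variable can be sketched as follows. This is a minimal illustration, not the talk's implementation: the class name `ResilientVariable` is hypothetical, and the Boyer-Moore majority vote is one standard way to take a majority using only a candidate and a counter, which are the values assumed to live in the O(1) safe memory.

```python
class ResilientVariable:
    """Stores 2*delta + 1 copies; tolerates up to delta corrupted copies."""

    def __init__(self, value, delta):
        self.delta = delta
        self.copies = [value] * (2 * delta + 1)  # copies live in faulty memory

    def write(self, value):
        for k in range(len(self.copies)):
            self.copies[k] = value

    def read(self):
        # Boyer-Moore majority vote: only `candidate` and `count` are kept,
        # which is the part assumed to sit in safe memory.
        candidate, count = None, 0
        for c in self.copies:
            if count == 0:
                candidate, count = c, 1
            elif c == candidate:
                count += 1
            else:
                count -= 1
        return candidate

x = ResilientVariable(42, delta=2)
x.copies[0] = 7    # simulated fault
x.copies[3] = 99   # simulated fault
value_read = x.read()
print(value_read)
```

With at most δ faults, at least δ + 1 of the 2δ + 1 copies still hold the written value, so it is a strict majority and the vote returns it. Both reads and writes cost Θ(δ) time.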

Some results in literature

Sorting (mergesort & quicksort)
Searching (binary search & dictionaries)
Priority queues
Dynamic programming
Counting
K-d trees
Interval trees
Suffix trees
...


The rest of this talk

1 Resilient dictionary [Finocchi et al., 2007]

2 Resilient priority queue [Jørgensen et al., 2007]


Some results

Sorting [Finocchi and Italiano, 2008, Finocchi et al., 2009]

Θ(n log n + δ²) optimal
if δ = O(√(n log n)), no time blow-up
Searching [Finocchi and Italiano, 2008, Jørgensen et al., 2007]
Θ (log n + δ) optimal
if δ = O (log n) no time blow-up
Counting [Brodal et al., 2009b]
Many counters, o(δ) space, one safe word
Small additive error
K -d trees [Gieseke et al., 2010]
Similar to the resilient search tree in the dictionary we will see
Used for clustering
Suffix/Interval trees [Christiano et al., 2011]
Exploit trade-off between ECC and replication


RESILIENT DICTIONARY


Resilient Dictionary

Operations
search(x):
return yes if there is an uncorrupted key x
return no if there isn’t an uncorrupted key x
If there is a corrupted key x, the behavior is not defined
insert(x), delete(x): defined as usual

[Finocchi et al., 2007]:

O (log n + αδ) amortized time/operation
O (n + δ) space
We will see a simpler implementation: O(log n + αδ²) amortized time/operation


Main difficulties

Take wrong search direction upon reading a corrupted value

        10
       /  \
      5    20        search(8) = false
     / \
    2   8

Unsafe pointers:
point to wrong addresses (even outside the tree)
point to wrong nodes (tree structure?)
losing a pointer ⇒ losing part of the data
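
The wrong-direction problem can be shown on the small tree above. The sketch is illustrative only: the `Node` class, the `search` function, and the specific fault (the key 5 being read as 9) are assumptions, since the slide does not say which value is corrupted.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def search(root, x):
    """Plain binary-search-tree lookup that trusts every key it reads."""
    v = root
    while v is not None:
        if x == v.key:
            return True
        v = v.left if x < v.key else v.right
    return False

five = Node(5, Node(2), Node(8))
root = Node(10, five, Node(20))

before = search(root, 8)  # fault-free lookup
five.key = 9              # hypothetical memory fault: 5 is read as 9
after = search(root, 8)   # 8 < 9 sends the search into the wrong subtree
print(before, after)
```

The key 8 is still stored and uncorrupted, yet the lookup misses it because one comparison on the path used a corrupted value.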


Unsafe pointers: a naïve approach

Replicate the tree 2δ + 1 times

[Figure: 2δ + 1 identical copies of the search tree with root 10, children 5 and 20, and leaves 2 and 8 under 5]

At each step follow the majority value of the 2δ + 1 copies ⇒ correct, since we can have at most δ memory faults

O(δ log n) time and O(δn) space: too expensive
It would be ok if the tree contained n/δ nodes
Push Θ(δ) keys per node


The data structure: ingredients

Group keys into disjoint intervals spanning the key space

(−∞, −10], (−10, 2], (2, 9], (9, 30], (30, 50], (50, +∞)

Each interval contains Θ (δ) keys (except possibly for the boundaries)
Say, at least δ/2 and at most 2δ

Intervals maintained in a (balanced) binary search tree (e.g., an AVL tree)

Tree stored reliably: each pointer and relevant info replicated 2δ + 1 times

Keys are not stored resiliently

Nodes stored in an array using doubling (so we can check reliably in O(1) time whether a link points to a tree node)


Example: search(15)


Space usage

δ/2 ≤ number of keys per node ≤ 2δ (except for boundary nodes)

⇓
O (n/δ) nodes
⇓
Θ (δ) space per node
⇓
Linear space: O (n + δ)
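
Written out as one line, the chain above is the product of the two bounds. The "+ 1" is an added detail to account for the fact that at least one node exists even when n < δ, which is where the additive δ term comes from:

$$\underbrace{O\!\left(\frac{n}{\delta} + 1\right)}_{\text{nodes}} \cdot \underbrace{\Theta(\delta)}_{\text{space per node}} = O(n + \delta)$$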


Searching a key γ: the algorithm

search(γ)

1 Interval search
search for the interval that should contain γ (target node)
2 Key search
search γ in the list of keys of the target node
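
The two steps can be sketched in a fault-free setting as follows. This is only an illustration under stated assumptions: a sorted list of interval upper bounds with `bisect_left` stands in for the balanced search tree, the names `upper_bounds` and `buckets` and the sample keys are hypothetical, and no replication or consistency checks are modeled. The interval boundaries are the ones from the example intervals given earlier.

```python
from bisect import bisect_left

# Upper bounds of (-inf, -10], (-10, 2], (2, 9], (9, 30], (30, 50]; the last
# interval (50, +inf) has no finite upper bound.
upper_bounds = [-10, 2, 9, 30, 50]

# One unsorted key list per interval (keys are not stored resiliently).
buckets = [[-40, -12], [-3, 0, 2], [4, 7], [28, 15, 12], [31, 50], [64]]

def search(gamma):
    # 1. Interval search: index of the interval that should contain gamma.
    target = bisect_left(upper_bounds, gamma)
    # 2. Key search: linear scan over the Theta(delta) keys of the target node.
    return gamma in buckets[target]

print(search(15), search(16))
```

Because each node holds Θ(δ) keys, the final scan costs O(δ) no matter how the interval was located.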


Example: search(15)


Useful tests

Given a key γ and a node v , we can check:

v = target node ⇒ γ ∈ I (v )

target node ∈ tree(v ) ⇒ γ ∈ U(v )

v ancestor of w ⇒ U(w ) ⊆ U(v )

Tests can be done:
Unreliably: O(1) time
Reliably: Θ(δ) time


Using only reliable tests ⇒ too expensive: O (δ log(n/δ) + δ)


How to pay only an additive overhead?

A lazy approach

typically asleep:
trust unreliable variables. . .

from time to time wake up:

do some check


An O(log n + αδ²) algorithm

Rounds of at most δ search steps

starting checkpoint node x

target node ∈ tree(x)

ending checkpoint node y

target node ∈ tree(y )


Round structure

Unreliable phase: unreliable search steps + unreliable consistency checks
No inconsistency ⇒ go to the checkpoint
Failing check ⇒ go to the reliable phase

Checkpoint
All checks succeed ⇒ start a new round
Failing check ⇒ go to the reliable phase

Reliable phase: reliable search steps + reliable consistency checks


The unreliable phase

Perform (at most) δ unreliable search steps starting from x

(use only the first copy of each variable)

1 Let v = current node

2 If v = target node, go to the checkpoint
3 Otherwise, follow left/right pointer (let w = new node)
4 Check whether:
the address of node w is valid
w descendant of v
target node ∈ tree(w )
5 If any consistency check fails, start the reliable phase from x


The checkpoint

Perform the following reliable checks

(use all the 2δ + 1 copies of each variable)

1 Let x = starting checkpoint node

2 Let y = node on which the unreliable search terminated
3 If y = target node, then stop
4 If y descendant of x and target node ∈ tree(y ) ⇒ start new round
from y (search direction is correct)
5 Otherwise: start the reliable phase from x


The reliable phase

Perform δ reliable search steps starting from the checkpoint node x

(use all the 2δ + 1 copies of each variable)

1 Let v = current node

2 If v = target node, then stop
3 Otherwise, follow the left/right pointer


Search analysis

Rounds terminate after:

Unreliable phase + checkpoint: cost O(δ)
Reliable phase: cost O(δ²)

Unsuccessful rounds:
go down the tree < δ levels
IDEA: Charge the time spent in reliable phases and unsuccessful rounds to
faulty values


Search analysis 2

Successful rounds:

O (log n/δ) such rounds ⇒ O (log n + δ) total time

Reliable phases:

Take place only if a check fails ⇒ a node at distance ≤ δ from the starting checkpoint x contains some faulty value
At the end of the phase, such faulty value is out of the subtree in which the search continues

At most α ≤ δ faulty values ⇒ O(αδ²) total time

Unsuccessful rounds:

similar reasoning


Inserting a key

insert(γ)
Find the target node v and add the key to its list of keys: O(log n + αδ²)
If the number of keys becomes 2δ: O(δ)
Delete node v: O(δ log n)
Split the interval I(v) into two subintervals L and R such that L ∪ R = I(v), where L takes the δ smallest keys of v and R takes the δ largest keys of v: O(δ²)
Add two new nodes with intervals L and R to the search tree: O(δ log n)
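
The splitting rule can be sketched on a single node's key list. This is a fault-free illustration: the function name `insert_key` is hypothetical, the built-in sort stands in for whatever resilient selection the real structure uses, and the tree updates (deleting v, adding the two new nodes) are left out.

```python
def insert_key(keys, gamma, delta):
    """Add gamma to a node's key list; split the node once it holds 2*delta keys.

    Returns a list with one key list (no split) or two key lists (L and R).
    """
    keys = keys + [gamma]
    if len(keys) < 2 * delta:
        return [keys]
    ordered = sorted(keys)
    left = ordered[:delta]    # L takes the delta smallest keys
    right = ordered[delta:]   # R takes the delta largest keys
    return [left, right]

no_split = insert_key([5], 1, delta=2)
split = insert_key([5, 1, 9], 4, delta=2)
print(no_split, split)
```

After a split each new node holds exactly δ keys, so at least δ further insertions are needed before either of them reaches the 2δ threshold again.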


Insert analysis

The cost O (δ log n) can be amortized over Ω (δ) operations:

The new nodes contain δ keys each ⇒ the threshold 2δ can be
reached again after at least δ insertions
Similarly when we have deletions: Ω (δ) operations are necessary to
reach the threshold δ/2

Total amortized time

$$O\Big(\underbrace{\log n + \alpha\delta^2}_{\text{node search}} + \underbrace{\frac{\delta \log n}{\delta}}_{\text{tree update}}\Big) = O\big(\log n + \alpha\delta^2\big)$$


Further improvements

[Finocchi et al., 2009]

O(log n + δ^(1+ε)) amortized, for any constant ε > 0

O (log n + δ) expected amortized time


RESILIENT PRIORITY QUEUE


Priority queue

Operations
insert(x): insert a new entry x
deletemin(): return the smallest uncorrupted value or a corrupted
value

Priority queue from previous tree: O(log n + αδ²) amortized

From [Jørgensen et al., 2007]

O (log n + δ) amortized per operation
Based on the cache-oblivious implementation in [Arge et al., 2002]


Structure (1)

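[Figure: the insertion buffer I of size b, followed by the layers …, Li, Li+1, …; each layer Li holds a buffer Di and a buffer Ui and has size parameter si]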

Insertion buffer I
k = O (log n) layers Li
Layer Li contains two buffers Di and Ui
Buffers are implemented as circular arrays


Structure (2)


Buffers are doubly linked

The links between components and their sizes are stored resiliently
Buffers Di contain small entries that are moving down
Buffers Ui contain large entries that are moving up


Structure (3)


Invariants
Each buffer (except I) is faithfully ordered, i.e., its correct values are sorted
Di Di+1 and Di Ui+1 are faithfully ordered
|I| ≤ b, where b = δ + log n + 1
si/2 ≤ |Di| ≤ si for 0 ≤ i < k, where si = 2si−1 = 2^i (δ² + log² n)
|Ui| ≤ si/2 for 0 ≤ i ≤ k


Insert

insert(x)

1 Append x to I

2 if |I | > b
1 Move all entries of I into U0
2 Resiliently sort U0
3 If |U0 | > s0 /2, invoke push(U0 )


Delete min

First, find the min

1 Find the min in I

2 Find the min of the first δ + 1 elements of U0
3 Find the min of the first δ + 1 elements of D0
4 The min is the smallest among the three values

Then, delete the min

1 Remove the min from the appropriate buffer

2 Right shift all the elements in the affected buffer from the beginning
up to the position of the minimum
3 If the min was in D0 and now |D0| < s0/2, invoke pull(D0)
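
The find-min step can be sketched as below. The function name `find_min` and the sample buffers are hypothetical, and the removal, shifting, and pull steps are omitted. Looking at δ + 1 entries suffices because a faithfully ordered buffer with at most δ corrupted entries has an uncorrupted entry among its first δ + 1 positions, and the first uncorrupted entry is the smallest uncorrupted one in that buffer.

```python
def find_min(I, U0, D0, delta):
    """Minimum over I and the first delta + 1 entries of U0 and of D0."""
    candidates = list(I) + U0[:delta + 1] + D0[:delta + 1]
    return min(candidates) if candidates else None

delta = 1
I = [7, 4]            # unordered insertion buffer
U0 = [10, 11]
D0 = [900, 3, 5, 8]   # 900 is a corrupted entry; the correct values 3, 5, 8 are sorted
smallest = find_min(I, U0, D0, delta)
print(smallest)
```

The value returned is never larger than the smallest uncorrupted entry, so it is either that entry or a corrupted one, which matches the deletemin specification.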


Push

push(Ui)
(Invoked when |Ui| > si/2)

If Li is not the last layer

1 Merge Ui, Di and Ui+1
2 Assign the first |Di| − δ entries to a new buffer D′i
3 Assign the remaining entries to a new buffer U′i+1
4 Set Ui = ∅, Di = D′i, Ui+1 = U′i+1
5 If |Ui+1| > si/2 invoke push(Ui+1) recursively

For each Di where |Di| < si/2 invoke pull(Di)

In the cache-oblivious implementation D′i receives |Di| entries


Faulty push

Without faults:
Before: Di = {1, 2, 3}, Di+1 = {4, 5, 6}, Ui = {100, 101}, Ui+1 = {}
Merge {1, 2, 3} and {100, 101}
After: Di = {1, 2, 3}, Di+1 = {4, 5, 6}, Ui = {}, Ui+1 = {100, 101}

With the memory fault 3 → 200 in Di:
Merge {1, 2, 200} and {100, 101}
After: Di = {1, 2, 100}, Di+1 = {4, 5, 6}, Ui = {}, Ui+1 = {101, 200}
Di Di+1 is not faithfully ordered, since 100 has not been corrupted!
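
The example above can be replayed in code, comparing a split that gives the new Di all |Di| entries with one that gives it |Di| − δ entries. This is a sketch under assumptions: the helper names `faithfully_ordered` and `push_split` are hypothetical, `heapq.merge` stands in for the resilient merge, and the set of corrupted values is passed in explicitly, which a real algorithm could not know.

```python
import heapq

def faithfully_ordered(seq, corrupted):
    """True if the uncorrupted values of seq appear in sorted order."""
    clean = [v for v in seq if v not in corrupted]
    return all(a <= b for a, b in zip(clean, clean[1:]))

def push_split(D_i, U_i, U_next, keep):
    """Merge the three buffers; the first `keep` entries form the new D_i."""
    merged = list(heapq.merge(U_i, D_i, U_next))
    return merged[:keep], merged[keep:]

delta = 1
D_i, D_next = [1, 2, 200], [4, 5, 6]   # 3 was corrupted to 200
U_i, U_next = [100, 101], []
corrupted = {200}

# Keeping |D_i| entries pulls the uncorrupted 100 in front of D_{i+1}.
naive_D, naive_U = push_split(D_i, U_i, U_next, keep=len(D_i))
naive_ok = faithfully_ordered(naive_D + D_next, corrupted)

# Keeping |D_i| - delta entries leaves room for up to delta corrupted values.
safe_D, safe_U = push_split(D_i, U_i, U_next, keep=len(D_i) - delta)
safe_ok = (faithfully_ordered(safe_D + D_next, corrupted)
           and faithfully_ordered(safe_D + safe_U, corrupted))
print(naive_D, naive_ok, safe_D, safe_ok)
```

Each corrupted value that leaves Di can let at most one large uncorrupted value slip in, so holding back δ entries is enough to keep Di Di+1 faithfully ordered.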


Push (again)

If Li is the last layer

1 Merge Ui and Di
2 Assign the first |Di| entries to a new buffer D′i
3 Assign the remaining entries to a new buffer D′i+1
4 Set Di = D′i, Di+1 = D′i+1, Ui = Ui+1 = ∅


Pull

pull(Di)
(Invoked when |Di| < si/2)

If Li is not the last layer

1 Merge Di, Di+1 and Ui+1
2 Assign the first si entries to a new buffer D′i
3 Assign the next |Di+1| − (si − |Di|) − δ entries to a new buffer D′i+1
4 Assign the remaining values to a new buffer U′i+1
5 Set Di = D′i, Di+1 = D′i+1, Ui+1 = U′i+1
6 If |Di+1| < si/2 invoke pull(Di+1) recursively

If Li is the last layer: nop

If |Ui+1| > si/2 invoke push(Ui+1) recursively

In the cache-oblivious implementation D′i+1 receives |Di+1| − (si − |Di|) entries

Complexities

Space: O (n)
Keys not replicated
Ω(δ + log n) keys per level (except I and Lk)
O (δ) pointers per level
O (n + δ) space
δ term can be removed by exploiting safe memory
Time/operation: O (log n + δ) amortized
Each layer is updated after O (si ) operations


Resiliency
&
Cache-obliviousness


I/O-efficiency

Faulty RAM has one memory level

Modern platforms feature memory
hierarchies
Reducing I/O improves performance ⇒
exploit locality
Caches (SRAM) even more sensitive to
memory faults
Low supply voltage, low critical charge per
cell
ECC prohibitive: tight constraints on die
size and speed


Fault tolerance vs I/O-efficiency

Hierarchical faulty memory model

[Brodal et al., 2009a]

Two memory levels (memory and cache)

Cache size M, block length B
Both levels can be faulty
I/O resilient algorithms for: sorting, dictionary,
priority queue

Algorithms are cache-aware: crucially depend on memory parameters

⇓
reduced portability


Fault tolerance vs cache-oblivious
Cache-oblivious algorithms overcome the issue [Frigo et al., 1999]
no explicit dependency on memory parameters
adapt automatically to all memory levels
optimality on a two-level hierarchy implies optimality on an arbitrary
hierarchy

Question
Can we design algorithms that are fault-tolerant and cache-oblivious?

Cache-oblivious algorithms are designed in a flat model (faulty-RAM),

but executed on the hierarchical faulty memory model
P: size of the private memory
if P = Θ (1): private memory may be implemented in the CPU registers
if P = ω (1): private memory hierarchy whose largest level has size P
Misses due to private memory are negligible in our algorithms.
Cache-oblivious algorithms don’t use M and B, but may use δ and P
Resilient cache-oblivious algorithms

[Caminiti et al., 2011] shows how to derive resilient cache-oblivious algorithms for many problems:
Local-dependency dynamic programming: edit distance, longest common subsequence
Gaussian Elimination Paradigm: all-pairs shortest path, matrix multiplication, Gaussian elimination without pivoting
Fast Fourier Transform


Edit distance. . .

We focus on a case of local dependency dynamic programming

Case study
Computing the edit distance (ED) of two strings:
We show how to derive a resilient cache-oblivious algorithm for ED
using P private memory

Similar techniques apply to GEP and FFT


First, some notation. . .

r -resilient variable x
Write 2r + 1 copies
Read by majority (in O (1) safe memory)
At least r + 1 faults are required to corrupt x
An adversary can corrupt at most ⌊δ/(r + 1)⌋ r-resilient variables

Rabin fingerprint ψ_A of a vector A = ⟨a_0, a_1, …, a_{n−1}⟩

$$\psi_A = \left(\sum_{i=0}^{n-1} a_i \, 2^{w(n-i-1)}\right) \bmod p$$

p prime number, w memory word size

Can be computed with a scan of A and O (1) space
If entries are not accessed in order, fingerprints may require O(n log n) time
due to exponentiation
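
The single-scan computation follows from Horner's rule applied to the definition. The sketch below is illustrative: the function names are hypothetical, and the choice p = 2^31 − 1 (a Mersenne prime) and w = 8 are assumptions made for the demo only. A direct evaluation of the sum is included as a cross-check.

```python
def fingerprint(A, w, p):
    """Rabin fingerprint by one left-to-right scan (Horner's rule), O(1) state."""
    base = pow(2, w, p)  # 2^w mod p, computed once
    psi = 0
    for a in A:
        psi = (psi * base + a) % p
    return psi

def fingerprint_direct(A, w, p):
    """Direct evaluation of the sum; each term needs its own exponentiation."""
    n = len(A)
    return sum(a * pow(2, w * (n - i - 1), p) for i, a in enumerate(A)) % p

p = 2**31 - 1  # a prime modulus (assumed for the demo)
A = [3, 1, 4, 1, 5]
scan_value = fingerprint(A, 8, p)
direct_value = fingerprint_direct(A, 8, p)
print(scan_value == direct_value)
```

The scan multiplies the running value by 2^w once per entry, so it needs no exponentiation. The direct form shows why out-of-order access is more expensive: each entry then needs its own power of 2^w.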


Running example: ED

Edit distance
Input: strings X = x1, …, xn and Y = y1, …, yn
Output: their edit distance

Edit-Distance(X, Y) = number of edit ops {ins, del, sub} required to transform X into Y
DP table for ED: (n + 1) × (n + 1) table, given by the following recurrence:

$$\ell[i, j] = \begin{cases} i + j & \text{if } i = 0 \text{ or } j = 0 \\ \ell[i-1, j-1] & \text{if } i, j > 0 \text{ and } x_i = y_j \\ 1 + \min\{\ell[i, j-1],\ \ell[i-1, j]\} & \text{if } i, j > 0 \text{ and } x_i \neq y_j \end{cases}$$

The ED is ℓ[n, n]
O(n²) running time
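
A plain, non-resilient evaluation of the table can be sketched as follows. It implements the recurrence exactly as stated above, which takes the minimum over ℓ[i, j − 1] and ℓ[i − 1, j] only. The function name `edit_distance` is hypothetical, and allowing strings of different lengths is a small generalization beyond the equal-length input on the slide.

```python
def edit_distance(X, Y):
    """Fill the DP table row by row following the stated recurrence."""
    n, m = len(X), len(Y)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 or j == 0:
                table[i][j] = i + j
            elif X[i - 1] == Y[j - 1]:          # x_i = y_j (1-based in the text)
                table[i][j] = table[i - 1][j - 1]
            else:
                table[i][j] = 1 + min(table[i][j - 1], table[i - 1][j])
    return table[n][m]

same = edit_distance("abc", "abc")
one_mismatch = edit_distance("ab", "ac")
print(same, one_mismatch)
```

Each cell depends only on its left, upper, and upper-left neighbours, which is the local dependency that the resilient cache-oblivious construction relies on. With this recurrence a mismatch is resolved by an insertion or a deletion, so replacing one character costs 2.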