Description: ``A
bank teller has a vault that must be opened every day. The bank employs
three senior tellers, but they do not want to trust any individual with
the key combination. Hence, they would like a system whereby any two of
the three senior tellers can gain access to the vault.'' (example by Gopalakrishnan
and Stinson, "Applications of designs to cryptography", The CRC Handbook
of Combinatorial Designs, 1996).
In general, partial information (shadows) should be shared among w
participants,
so that any t participants together can compute
a key to a secret information, but no group with t-1
participants can. A (t,w,v)-threshold
scheme is a way of distributing v shadows
in this way, and the threshold scheme is perfect if it has the additional
property that no group of t-1 participants
can conspire to determine any partial information regarding the key.
The bank teller example requires a (2,3,v)-threshold
scheme for suitable v (the larger the number
of possible shadows, the larger the number of possible keys). We will look
at the construction of a perfect (3,3,v)-threshold
scheme for arbitrarily large v. This threshold scheme has v
possible shadows and v-2 keys; as described
above, the
3 people can put their shadows
together to discover the key, but if 2 people put their shadows together,
any of the
v-2 keys are equally likely to
be the secret key.
In this project, you are going to study threshold schemes in general
and implement a particular one, namely a (3,3,v)-threshold
scheme based on a combinatorial construction by Wilson (see: Stinson and
Vanstone, "A combinatorial approach to threshold schemes", SIAM J. Disc.
Math. 1 (1988)). You are going to write two main tools. The first one will
produce a complete (3,3,v)-threshold
scheme, given a suitable v. The second tool
will efficiently calculate the key, given 3
shadows and suitable v, in average time O(1).
The construction uses arithmetic module p (for p a prime number), so you
will have to implement basic arithmetic in a finite field of prime order.
Number of students required: 1 or 2
Additional information: You will need
some basic algebra knowledge in order to implement arithmetic over a finite
field of order p (F_p) and understand Wilson's construction. You will write
a C++ class for handling basic operations over F_p. Operations should be
carefully implemented in order to guarantee the O(1) average time for key
calculation. Extense bibliography on the topic can be found online
. We will provide background reading. Regular meetings with supervisor
will be scheduled.
Description: The
disk array architecture organizes many independent small disks into one
large logical disk. Suppose each part (some bits) of a piece of information
is stored in each of the physical disks. In order to tolerate disk failures,
redundancy must be incorporated. Thus, some extra disks are added to store
redundant information so that lost information can be quickly retrieved
and the contents of the erased disks can be reconstructed. This forms the
so-called RAIDs (redundant arrays of inexpensive disks). Error-correcting
codes could be used but erasure-resilient codes provide better solutions
for this problem (there is a tradeoff between reliability and check-disk
overhead/update penalty). It has been shown that certain combinatorial
designs can be used to construct the matrix defining optimal erasure-resilient
codes.
In this project, you will design and experiment with algorithms for
generating certain triple systems (you will learn what these animals are!)
for the construction of erasure-resilient codes for RAIDs. The general
construction method to be used is called hill-climbing, which is a general
randomized search algorithm. Care should be taken in the use of efficient
data structures for this particular problem. Different heuristics should
be tried and compared through experiments.
You will spend some time understanding the disk array application and
the use of codes and combinatorial designs in this application. Then you
will learn the general framework of the hill-climbing algorithm, and work
on a careful implementation of this algorithm. A lot of time will be devoted
to the design of this algorithm, its implementation and experimenting with
variations.
Number of students required: 1 or 2
Additional information: Basic linear algebra
knowledge is required for understanding linear codes. Good grasp of data
structures and practice in programming is required for the design and implementation
of the algorithm. We will provide background reading. Regular meetings
with supervisor will be scheduled.
Description: You
must design a bunch of tests for a program, so as to catch most errors
while keeping the number of tests not too high. Suppose the program
has 4 input parameters and each parameter can be assigned one of 2 possible
values. For example, a program computes life insurance premiums based
on the following parameters: gender (M/F), smoker (Y/N), age (young/old)
and serious disease among ancestors (Y/N). There is experimental evidence
that in many applications most errors occur from the interaction between
two parameters (for example, there is an error in the program when calculating
premiums for non-smoker males regardless of the other parameter values).
Therefore, our objective is to generate the minimum number of tests,
while guaranteeing pairwise coverage of parameter values.
Example:
In the example above, testing all possible parameter combinations would
require 2^4 = 16 tests. But we can achieve pairwise coverage with only
5 tests:
test 1 | test 2 | test 3 | test 4 | test 5 | |
gender | male | female | female | female | male |
smoker | smoker | non-smoker | non-smoker | smoker | non-smoker |
age | old | young | old | young | young |
anc. disease | yes | yes | no | no | no |
In this project, you are going to write a tool
for generating a set of tests guaranteeing some desired coverage while
minimizing the number of tests. The input parameters for your program will
be the number of parameters and the number of values for each parameter;
the user will also specify the desired coverage (complete pairwise coverage
or coverage of specified pairs). We are going to experiment with 2 types
of algorithms: one is based on an integer programming (IP) formulation
for this problem (your program will create an integer programming model
which will be fed into a package for its solution) and the second will
be a greedy heuristic (GH) implemented by you. The IP algorithm will give
the minimum number of tests but it is expected to take a long time, while
the GH will be quick but may generate too many tests. The idea is to compare
both methods in terms of running time spent in the generation and quality
of solution (how big is the test set generated). If two students are involved
in this project, the experimental comparison can be done more professionally.
Number of students required:
1 or 2
Additional information: Experience
with programming and algorithms is required. You will need C or C++ in
order to communicate with the Integer Programming solver (Cplex).
Note that the problem of designing an efficient test generator is a hard
research problem!!! The objective of this project is to play with two possible
solution methods and see how far each method can go. In order to do a reasonable
comparison between the methods, the test problems should be carefully chosen.
So, we could have a much nicer project with 2 participants sharing the
programming tasks. I strongly encourage that the pair of students know
each other well and have previous experience working or studying together.
We
will provide background reading. Regular meetings with supervisor will
be scheduled.