Sawja Card
Inria
1 Introduction
Security plays a prominent role in the smart card industry, due to the extensive
use of smart cards in banking and telecommunication. Hence, certification of smart cards
has become accepted industrial practice. Traditional certifications (e.g., against
the Common Criteria [1]) focus primarily on the protection mechanisms of the
card’s hardware and operating system. More recently, attention has been drawn
to the increasing number of applications that execute on the cards and the smart
card industry has elaborated a set of secure coding guidelines [20,12] that apply
to basic applications. Basic applications are granted limited privileges and the
goal of the guidelines is to ensure that they do not interfere with more sensitive
(e.g., banking) applications. The verification of these guidelines is done by an
independent authority that analyses the code and issues a certificate of confor-
mance (or pinpoints failures of compliance). In collaboration with a company
from the smart card industry we have developed the static analysis tool Saw-
jaCard that can significantly simplify and automate the validation of smart card
basic applications.
We consider applications written in the Java Card language – a dialect of
Java dedicated to smart cards. To be validated, an application must respect a se-
ries of secure coding rules. SawjaCard is designed for the certification procedure
proposed by AFSCM, the French industry association for Near Field Communi-
cation (NFC) technology providers, which consists of around 65 rules in total.
These rules impose requirements on how an application is programmed, how it
uses the resources of the card, and what kind of exceptions are acceptable.
Our main contribution is the implementation of the first static analysis tool
able to automate the validation of basic applications according to AFSCM rules.
Our experiments show that SawjaCard automatically proves 87% of the properties.
This work also demonstrates that precise but informal security guidelines
can be mapped to formal properties that can be checked by harvesting a static
analysis result. The design of the static analysis engine is a contribution of its
own: we exploit the characteristics of Java Card but also constraints imposed
by guidelines to get a precise yet efficient analyser. In terms of static analysis,
the main contribution is a novel abstract domain for identifying a variant of the
object-oriented singleton object pattern, where the nullity of a field is used to
control the execution of an allocation instruction (Section 4.2).
We first present Java Card, highlighting the features relevant for security val-
idation such as the Java Card firewall. The security requirements are described,
and we explain how they can be verified on the model obtained through static
analysis. We then present the main features of the analysis engine that is at the
core of the tool. This includes the above-mentioned domain for detecting single-
ton objects and the use of trace partitioning for identifying file accesses. The tool
has been evaluated against a series of industrial applications. We explain how
the tool has significantly improved the certification procedure by automating a
large part of the code verification.
2 Java Card
This precise modelling of the Firewall is necessary to rule out security exceptions.
Needless to say, the validation of applets strictly forbids security exceptions.
Figure 2: Classification of the rules (from AFSCM [20] and Global Platform [12]).
restrictions on how library methods can be called, and with what arguments.
Most of these rules cannot be verified as is, exactly due to the undecidability of
the underlying semantic property, and approximations are called for. As men-
tioned above, an important feature of these rules is that certain rules simplify the
verification of others. E.g., knowing that the rules “Recursive code is forbidden”
and “Arrays must be allocated with a determined size” are satisfied means that
the analyser can strive for more precision without running into certain complex-
ity and computability issues. In the following, we explain how the validation of
the rules can be done by mining the static analysis result.
Numeric values: In Java Card, resources are often accessed using integer
identifiers and managed by calling methods with a fixed set of flags. Many rules
specify that those integers must be constant or range over a set of legitimate
values. Our analyser computes an abstraction of numeric values and therefore
of method arguments. The abstraction of the method arguments is checked for
compliance with the rules.
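The check on numeric arguments can be pictured with a minimal Java sketch. It assumes the numeric abstraction is an interval; the class Interval, its method names and the sets of legitimate values are hypothetical and are not taken from SawjaCard.

```java
import java.util.Set;

// Hypothetical interval abstraction [lo, hi] of an integer method argument.
final class Interval {
    final int lo;
    final int hi;

    Interval(int lo, int hi) {
        if (lo > hi) throw new IllegalArgumentException("empty interval");
        this.lo = lo;
        this.hi = hi;
    }

    // True when the abstraction denotes exactly one concrete value.
    boolean isConstant() {
        return lo == hi;
    }

    // True when every concrete value in [lo, hi] belongs to the legitimate set.
    // A long counter avoids overflow when hi is Integer.MAX_VALUE.
    boolean within(Set<Integer> legitimate) {
        for (long v = lo; v <= hi; v++) {
            if (!legitimate.contains((int) v)) return false;
        }
        return true;
    }
}
```

A rule of the form "the argument must be constant" then amounts to isConstant(), and a rule of the form "the argument must be a legitimate flag" amounts to within(...) on the abstraction computed at the call site.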
Array values: Some resources can be coded by arrays of integers. For in-
stance, files are identified by an array [i1 ;...;in ] which represents the file
name i1 /.../in . Menu entries (i.e., strings) are coded by arrays of bytes.
As with numeric values, validation rules often require those arrays to be con-
stant. Files are an exception. File names are constructed by a sequence of calls
[Link](d1 ) . . . [Link](dn ) where fh is a FileView object and the di
are typically constants, identifying directories. Our analyser does not track se-
quences of events but our model of the class FileView is an array representing
the current working directory that is updated by calls to the select method.
Our analyser models arrays and provides for each index a numeric abstraction of
the content. This abstraction is queried in order to validate rules about resources
encoded as arrays.
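To make the idea of modelling the working directory as an array concrete, here is a small Java sketch of a stub. It is only an illustration: the class FileViewStub, the fixed maximum depth and the assumption that each select call descends one level are hypothetical and do not reproduce the actual library model.

```java
import java.util.Arrays;

// Hypothetical stub for a FileView-like object: the current working directory
// is kept as an array of directory identifiers, one per nesting level.
final class FileViewStub {
    private final short[] path;
    private int depth = 0;

    FileViewStub(int maxDepth) {
        this.path = new short[maxDepth];
    }

    // Assumed behaviour: select(d) descends into directory d.
    void select(short d) {
        if (depth == path.length) throw new IllegalStateException("path too deep");
        path[depth++] = d;
    }

    // Snapshot of the identifiers selected so far, e.g. [1, 20] for 1/20.
    short[] currentPath() {
        return Arrays.copyOf(path, depth);
    }
}
```

With such a stub, an analysis that abstracts array contents index by index can answer the question "which file is accessed" without tracking the sequence of calls itself.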
Control-flow: The validation rules impose constraints on the control-flow
graph of the application—especially during the installation phase. For instance,
most memory allocations are required only to take place during the install phase,
identified by a call to the install method. The analysis constructs an abstract
control-flow graph corresponding to the inlined control-flow graph of the
application. Constraints over the control-flow graph can therefore be checked by
exploring the abstract control-flow graph. For the particular case of memory
allocation, we traverse the graph and make sure that memory allocation primitives
(e.g., the new statement) are only reachable from the install method.
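The graph traversal for the allocation rule can be sketched as a plain reachability check. The Java code below is a minimal illustration under stated assumptions: the graph is given as adjacency lists over string-labelled nodes, and the rule is read as "no allocation node is reachable from an entry point other than install". The class AllocationCheck and the node names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AllocationCheck {
    // Returns the nodes reachable from `start` in a graph given as adjacency lists.
    static Set<String> reachable(Map<String, List<String>> cfg, String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String n = work.pop();
            if (seen.add(n)) {
                for (String succ : cfg.getOrDefault(n, List.of())) work.push(succ);
            }
        }
        return seen;
    }

    // The rule holds when no allocation node is reachable from the other entry points.
    static boolean allocationsOnlyFromInstall(Map<String, List<String>> cfg,
                                              Set<String> allocNodes,
                                              Collection<String> otherEntryPoints) {
        for (String entry : otherEntryPoints) {
            for (String n : reachable(cfg, entry)) {
                if (allocNodes.contains(n)) return false;
            }
        }
        return true;
    }
}
```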
Exceptional behaviour: Validation rules are strict about exception han-
dling. Run-time exceptions such as ArrayOutOfBounds, NullPointerException
and ClassCastException are strictly forbidden. In our bytecode intermediate
representation, run-time exceptions correspond to explicit instructions and we
generate verification conditions for all those instructions. For obvious reasons,
security exceptions (SecurityException) are also forbidden. The abstraction of
the heap is designed to model object ownership and can be used to ensure that
the security checks performed by the Java Card Firewall do not raise
SecurityException. There are other rules about exceptional behaviours which can be
interpreted as coding guidelines. The analysis precisely models the flow of
exceptions. In particular, it collects for each handler the caught exception and
for each method call the escaping exceptions. This information is sufficient for
checking all the rules regarding exceptions.
Our static analysis engine is designed specifically for Java Card and its particular
programming style. Existing general-purpose analysis frameworks for Java
(e.g., [32,19,17]) cannot be applied directly to Java Card. Firstly, existing frameworks
do not provide a CAP front-end – this is a non-negligible engineering issue.
Although CAP files are compiled from class files, the inverse transformation is far
from obvious. For instance, the instruction set is different and dynamic method
lookup is compiled using explicit virtual tables. Secondly, our static analysis
engine exploits fully the fact that Java Card programs are relatively small, for-
bid recursion and allocate few objects. Standard Java analyses designed to scale
for object-oriented programs cannot exploit this. Finally, the Java Card firewall,
which has no Java counterpart, is also modelled directly at the analysis level.
Our analyser operates on a 3-address code intermediate bytecode represen-
tation A3Bir [7] that is obtained by pre-processing the binary CAP file. This
representation is adapted from the Sawja framework [17] and has the advan-
tage of making explicit the different runtime checks performed by the Java Card
Virtual Machine. An example of such intermediate code is given in Fig. 4.
The static analysis engine implements an inter-procedural scheme which
consists in a dynamic inlining of method calls. The benefit is a precise inter-
procedural scheme that mimics the behaviour of a concrete interpreter. In terms
of abstract domains, the domain of the inter-procedural analysis is D∗ given
that D is the domain for the intra-procedural analysis. This approach is effec-
tive for two reasons that are specific to Java Card: recursion is forbidden and
the programs are relatively small.
would still be unable to infer that at Line 6 the value of i can only be 5. To get
this result, it is necessary to propagate through the test j==0 the knowledge that
j equals i. This is a known weakness of non-relational domains which compute
an abstraction for each variable independently. There are well-known numeric
relational domains, e.g., convex polyhedra [6] and octagons [24]. These domains are
very expressive but are also computationally costly. Our analyser uses a more
cost-effective weakly relational domain [25], computing for each program point
and each local variable x an equality x = e where e is a side-effect free expression
of our intermediate representation, i.e., an expression built upon variables, arithmetic
operators and field accesses. At Line 5, we have the equality j==i. Hence,
when j==0, i is also 0 and when j != 0, j has value 5 and so does i. Combined,
the three domains are able to compute the precise invariant of Line 6.
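The propagation of a test through a recorded equality can be illustrated by a small Java sketch. It is a simplification under stated assumptions: variables map to intervals stored as two-element arrays, equalities are restricted to the form x = y between variables, and propagation is one step only. The class EqualityRefinement and its representation are hypothetical and much coarser than the symbolic expressions described above.

```java
import java.util.Map;

final class EqualityRefinement {
    // Refines `var` to the constant `c` (the effect of assuming var == c) and
    // propagates the refinement to every variable recorded as equal to `var`.
    // `env` maps a variable to its interval {lo, hi}; `equalTo` records x = y.
    static void assumeEquals(Map<String, int[]> env, Map<String, String> equalTo,
                             String var, int c) {
        env.put(var, new int[] {c, c});
        for (Map.Entry<String, String> e : equalTo.entrySet()) {
            if (e.getValue().equals(var)) env.put(e.getKey(), new int[] {c, c});
            if (e.getKey().equals(var)) env.put(e.getValue(), new int[] {c, c});
        }
    }
}
```

With i and j both abstracted by [0, 5] and the equality j = i recorded, assuming j == 0 also narrows i to [0, 0], which a purely non-relational domain cannot do.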
Symbolic expressions improve the precision of numeric abstractions but also
significantly contribute to ruling out spurious null pointers. This is illustrated
by Fig. 4. Our goal is to verify that bar is called with a non-null argument. At
source level, this property is obvious. However, the cumulative effect of Java compilation,
CAP conversion and the reconstruction of our analyser's intermediate
representation introduces temporary variables. Without symbolic expressions,
at Line 3, we only know that the temporary variable t0 is not null but this vari-
able is not used anymore. Symbolic expressions keep track of equalities between
variables t0, t1, t2, the constant null and the value of field [Link]. Using
the theory of equality, we can deduce at Line 2 that [Link] is not null. This
information is propagated at Line 4 where we deduce that t2 is also not null.
Therefore, at the call in Line 5, t2 is still not null.
Fig. 5 illustrates how the reduced product [4] of the points-to, Sgton and Cnt
analyses can ensure the singleton property. Before the loop (Line 1), the object
o is not allocated, the field fd is definitely null and the conditional property
(fd = null ⇒ (o ↦ 0)) holds. At the head of the loop (Line 4), the object o is
a singleton, the field fd is either null or points to the object o. The singleton
domain holds the key invariant: the object o is not allocated when the field fd
is null. After the test fd == null (Line 6), we refine our points-to and conclude
that fd is definitely null. Therefore, the conditional property of the singleton
domain holds: we can exploit the right hand side of the condition and refine the
1 /* fd = null & (fd = null ⇒ (o ↦ 0)) & o ↦ 0 */
2 [...]
3 while(true){
4 /* fd ∈ {null, o} & (fd = null ⇒ (o ↦ 0)) & o ↦ 1 */
5 if (fd == null)
6 /* fd = null & (fd = null ⇒ (o ↦ 0)) & o ↦ 0 */
7 fd = new o();
8 /* fd = o & (fd = null ⇒ ⊥) & o ↦ 1 */ }
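For readers who prefer a concrete view of the pattern in Fig. 5, the following Java sketch shows a field whose nullity guards an allocation site. The class LazyHolder and the allocation counter are hypothetical; the counter is instrumentation for this sketch only and plays the role that the counting abstraction plays in the analysis.

```java
// Hypothetical class illustrating the singleton variant: the allocation is
// executed only while the field is still null.
final class LazyHolder {
    private Object fd = null;     // null exactly when nothing has been allocated yet
    private int allocations = 0;  // instrumentation for this sketch only

    // One iteration of the loop body: allocate only if fd is still null.
    void step() {
        if (fd == null) {
            fd = new Object();
            allocations++;
        }
    }

    int allocations() {
        return allocations;
    }
}
```

However many times step() runs, at most one object is allocated, which is the property the singleton domain is meant to establish statically.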
Trace partitioning [27] is a generic technique for locally improving the precision
of a static analysis. It consists in partitioning an abstract state depending on
a history of events. Suppose that the original abstract state is D♯; after trace
partitioning, the abstract state is (Event∗ × D♯)∗ where Event is an arbitrary set
of syntactic events (e.g., a call to a specific method) or semantic events (e.g., the
variable x has value v). We have successfully used trace partitioning for precisely
determining the files accessed by an application.
As explained in Section 3, Java Card comes with a hierarchical file system.
In our model, the current directory i1 /.../in is coded by an array of short
[i1 ;...;in ] that is stored in the path field of a file handler object fh im-
plementing the FileView interface. Moving to the in+1 directory is done by the
method call [Link](in+1 ). Therefore, determining the accessed files requires
a precise analysis of the array content.
Consider the code of Fig. 6 that is representative of how files are accessed in
Java Card. Suppose that before calling the cd method, the field fh is either null
or points to an object ofh such that ∀i, [Link][i] = 0. At the method return,
with our base abstraction, the effect of the three paths is merged. We lose precision
and get res = fh ∈ {null; ofh} ∧ [Link][0] ∈ [0; 1] ∧ [Link][1] ∈ [0; 20].
However, the precise post-condition of the cd method is P1 ∨ P2 ∨ P3 where each
1 public static FileView cd(){
2 if (fh!=null){
3 [Link]((short)1);
4 if(RANDOM_BOOL()){return null;}
5 [Link]((short)20); }
6 return fh; }
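The difference between merging the three paths and keeping them apart can be sketched in Java. This is an illustration under assumptions: only the path array is modelled (the returned reference is ignored), each index is abstracted by an interval, and the three path contents (0,0), (1,0) and (1,20) are a reading of Fig. 6 in which each select call writes the next free index. The class PathState is hypothetical.

```java
import java.util.List;

// Hypothetical per-index interval abstraction of the path array.
final class PathState {
    final int[] lo;
    final int[] hi;

    PathState(int[] lo, int[] hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // Abstraction of a single concrete array.
    static PathState constant(int... v) {
        return new PathState(v.clone(), v.clone());
    }

    // Pointwise interval join, as performed when paths are merged.
    PathState join(PathState o) {
        int[] l = new int[lo.length];
        int[] h = new int[hi.length];
        for (int i = 0; i < lo.length; i++) {
            l[i] = Math.min(lo[i], o.lo[i]);
            h[i] = Math.max(hi[i], o.hi[i]);
        }
        return new PathState(l, h);
    }

    // True when the concrete path is described by this abstract state.
    boolean admits(int... path) {
        for (int i = 0; i < lo.length; i++) {
            if (path[i] < lo[i] || path[i] > hi[i]) return false;
        }
        return true;
    }

    // With partitioning, a path is possible only if some partition admits it.
    static boolean anyAdmits(List<PathState> partitions, int... path) {
        for (PathState p : partitions) {
            if (p.admits(path)) return true;
        }
        return false;
    }
}
```

Joining the three states yields index 0 in [0; 1] and index 1 in [0; 20], which also admits the spurious path (0, 20); keeping the three states as separate partitions rules it out.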
6 Experimental evaluation
We have evaluated SawjaCard on 8 industrial Java Card applets. For confiden-
tiality reasons, we are required to keep them anonymous. The applications are
representative of basic applications. There are loyalty applets but also phone
applications. They require few privileges but nonetheless access certain non-sensitive
parts of the file system. The characteristics of the applets can be found
in Fig. 7. For each applet, we provide the number of instructions of the appli-
cation and the number of instructions taking into account the libraries used by
Applet           A1     A2     A3    A4    A5    A6    A7    A8
Instrs (app)     2769   2835   1823  1399  636   752   1245  230
Instrs (+ libs)  5824   5236   4301  5643  2834  3044  3402  2040
CFG              3435   6096   1491  1247  825   999   842   487
Time             29min  19min  6min  2min  32s   18s   4s    2s
simple rewrite of the code would also resolve the problem. Other array accesses
rely on invariants that are not available to the analyser. For instance, certain
array indexes are read from files. More precisely, when reading a file, certain
applications first read a special segment, which is the file status. The full size of
the file is a field of this file status. As the content of the file cannot be known,
it is impossible to track this length.
CatchNonISOException4 All the applets trigger alarms for this property. The
alarms correspond to violations of the property. The exceptions that are ignored
correspond to exceptions that are not thrown by the application itself but escape
from library code. It might very well be that the proprietary implementations
never raise these exceptions. Nonetheless, their possibility is reflected by our
model of the API which is based on the Java Card API specification.
3 From Global Platform rules: The Application should catch each exception defined
by the used APIs individually in the application code and should explicitly rethrow
the exception to the card runtime environment if needed.
4 From AFSCM rules: All exceptions thrown during the execution from any defined
entry point must be caught by the application, except ISOException that are thrown
in response to a command.
Other properties. The AFSCM rules forbid the classic Java exceptions: Class-
CastException, NegativeArraySize and ArrayStoreException. For all the applets,
SawjaCard proves their absence. Thanks to our modelling of Java Card Firewall,
SawjaCard is also able to rule out SecurityExceptions. The rule AppletInStat-
icFields specifies that applet objects should not be stored in static fields. This
property is validated for all the applets. The next two rules concern values that
should be constant: array sizes and menu entries. Those rules are also vali-
dated for all the applets. The rule SDOrGlobalRegPriv is about privileges that
should be granted to access certain APIs. Applet 1 requires certain privileges
and therefore raises an alarm. The rules SWValid and ReplyBusy specify the
range of the status word returned by applets. The rules are verified for all the applets
except applet 1. This is probably a false alarm given that the applet is using a
non-standard way of computing the status word. The last rule concerns certain
method calls returning handlers that should be protected by try-catch blocks.
SawjaCard raises an alarm for all the applets. This rule is indeed violated.
DeadCode5 For all the applets, SawjaCard detects some dead code which is due
to the Java compilation. Consider the following method which unconditionally
throws an exception. The return instruction is not reachable but is required by
the Java compiler.
1 void dead_code (short val){ [Link](1); return; }
The Java compiler also enforces that methods list the exceptions they
might raise using a throws clause. However, the algorithm for checking this clause
is purely syntactic. To make Java Card code compile, a defensive approach consists
in adding handlers for all the potential exceptions. For certain calls, SawjaC-
ard proves that certain exceptions are never thrown and that the handlers are
therefore dead code. For compliance with the rule, a workaround would be to
remove the useless handlers and add to the throws clause of the method all the
exceptions that are proved impossible.
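The two coding styles discussed above can be contrasted with a short Java sketch. It is an illustration only: the class DeadHandlerDemo, the exception CardException and the method names are hypothetical and do not come from the analysed applets.

```java
final class DeadHandlerDemo {
    // Hypothetical checked exception standing in for a library exception.
    static class CardException extends Exception {}

    // Declares CardException but only throws it for a negative argument.
    static short checked(short v) throws CardException {
        if (v < 0) throw new CardException();
        return v;
    }

    // Defensive style: the compiler requires the handler because of the throws
    // clause of checked, yet the handler is dead code since the argument is 1.
    static short defensive() {
        try {
            return checked((short) 1);
        } catch (CardException e) {
            return -1; // never executed
        }
    }

    // Workaround: drop the handler and list the exception in the throws clause.
    static short propagating() throws CardException {
        return checked((short) 1);
    }
}
```

An analysis that tracks the argument value can show that the handler in defensive() is unreachable, whereas propagating() contains no handler and therefore no dead code of this kind.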
Allocation7 The alarms are real violations. Most applets allocate objects af-
ter the install phase. Yet, more relaxed rules allow the allocation of singleton
objects. This rule is still violated by applet 6 which repeatedly tries to get a han-
dler. In our model of the library, each unsuccessful try allocates an intermediate
object and is therefore responsible for a memory leak. For the other applets, our
singleton domain is precise and ensures that memory allocation is finite.
7 Related work
For analysing Java programs, there are mature static analysis frameworks such
as Soot [32] and Wala [19]. Based on Wala, the Joana tool [13] is able to prove
security properties based on information-flow. Information-flow analyses would
probably benefit from the Java Card restrictions. Currently, AFSCM guidelines
do not consider such properties and are limited to safety properties.
Algorithms tuned for Java are usually not well-fitted for the constraints of
Java Card. In particular, state-of-the-art algorithms for constructing control-flow
graphs of Java programs are based on context-sensitive flow-insensitive points-
to analyses [22,29]. For Java Card, our analyser demonstrates that a context-
sensitive flow-sensitive points-to analysis is viable. It dynamically inlines method
calls and therefore literally computes an ∞-CFA. The Astrée analyser
uses a similar strategy for handling function calls [5]. In their context, the
programs are large and function calls are rare. Java Card programs are compar-
atively tiny but method calls are ubiquitous.
For Java, Hubert et al. [16] show how to infer the best @NonNull annotations
for Fähndrich and Leino's type system [11]. The static analyser Julia [30,31]
implements a more costly but also more precise null pointer analysis that can
be efficiently implemented using BDDs. Because our objects are singletons, our
flow-sensitive points-to analysis performs strong updates and is therefore precise
enough to track null pointers and rule out NullPointerExceptions.
Might and Shivers [23] show how to improve abstract counting of objects
using abstract garbage collection. Their analysis can prove that an abstract
object corresponds to a single live concrete object. Our singleton domain is
based on a different program logic and can ensure that an abstract object is
only allocated once. As Java Card usually does not provide garbage collection,
we really need to prove that there are only a finite number of allocated objects.
Semantics [28,9] and analyses [14,8] have been proposed for Java Card. Huisman
et al. [18] propose a compositional approach to ensure the absence of
illicit applet interactions through Shareable interfaces. For basic applications
such interactions are simply forbidden. Our tool verifies that applets do not expose
Shareable interfaces and therefore enforces a simpler but stronger isolation
property. A version of the KeY deductive verification framework [2] has been
successfully applied to Java Card [26]. JACK [3] is another deductive verification
tool dedicated to Java Card that is based on the specification language
JML [21]. However, deductive verification is applied at the source level and requires
annotations of the code with pre-(post-)conditions. This methodology is
not applicable in our validation context which needs to be fully automatic for
binary CAP files.
7 From Global Platform rules: A basic application should not perform instantiations
in places other than in install() or in the applet’s constructor.
8 Conclusions
The validation process for smart card applications written in Java Card involves
around 55 rules that restrict the behaviour of the applications. This process can
benefit substantially from static analysis techniques, which can automate most
of the required verifications, and provide machine-assistance to the certifier for
the rest. The SawjaCard validation tool contains a static analysis which com-
bines analysis techniques for numeric and heap-based computations, and which
is further enhanced by specific domain constructions dedicated to the handling
of the file system and Java Card firewall. A substantial part of building such a
validation tool involves the modelling of libraries for which we propose to build
a series of stubs whose behaviour approximates the corresponding APIs sufficiently
well for the analysis to be accurate. Benchmarks on a series of industrial
applications show that the tool can analyse such applications in a reasonable
time and eliminate more than 80% of the required checks automatically.
The development of the tool suggests several avenues for further improvements.
The properties for which the tool could be improved are ArrayOutOfBoundException
and file properties. The numeric analysis is only weakly relational,
and it would be possible to increase its precision by using full-blown
relational domains such as polyhedra or octagons. An effective alternative to
significantly reduce the number of alarms would be to impose stricter coding
rules (for example defensive checks for narrowing down the range of non-constant
indexes). Our model of the file system could also be improved. To get a
precise and scalable analysis, our assessment is that file system specific abstract
domains should be designed. Certain properties are also simply not provable
because they depend on invariants that are established by the personalisation
phase of the application. This phase happens after the install phase and corre-
sponds to commands issued, in a secure environment, by the card manufacturer.
Currently, the end of this phase has no standard specification and cannot be
inferred from the applet code. For the other properties we have satisfactory
results: when the tool emits an alarm, it corresponds to a real error in the appli-
cation. The tool has been recently transferred to industry where it will be used
as part of the validation process.