The HAVEGE unpredictable random number generator

HArdware Volatile Entropy Gathering and Expansion

Hundreds of megabits per second 

Authors: André Seznec, Nicolas Sendrier

Contact: seznec@irisa.fr


Last Modification: July 3, 2002

HAVEGE ported to MacOS10 platforms

First version was dated from Jan. 14, 2002 click here to rapidly check the modifications/enhancements

o
HAVEGE motivations and principles
o
Currently supported architectures and operating systems
o
General overview of HAVEGE package
o
Using HAVEGE
o
Integrating HAVEGE in an application
o
Combining HAVEGE with classical PRNGs
o
How do I test unpredictability
o
Copyright Notice
o
DOWNLOAD: The HAVEGE generator package for Linux and Unix (except ITANIUM) (as a tar file)
o
DOWNLOAD: The HAVEGE generator package for Windows (generated under the Cygwin environment) (as a zip file)
o
The HAVEGE generator package for Windows (to be tested in Visual C++ applications)
o
DOWNLOAD: The HAVEGE generator package for ITANIUM/Linux platform (as a tar file)
o
DOWNLOAD: The HAVEGE generator documentation (this web page plus the associated research report)

HAVEGE motivations and principles

Most previous software algorithms for generating unpredictable, i.e, irreproducible uniformly distributed, random number sequences rely on entropy gathering from measuring unpredictable external events. The throughput of these generators are in the  range of  10-100 bits per second. This limits them to being used as seeds for pseudo-random generators.  Such unpredictable random number generators are needed  for cryptography.
 
Modern superscalar processors feature a large number of hardware mechanisms which aim at improving performance: caches, branch predictors, TLBs, long pipelines, instruction level parallelism, ... The state of these components is not architectural (i.e. the result of an ordinary application does not depend on it), it is also volatile and cannot be directly monitored by the user. On the other hand, every invocation of the operating system modifies thousands of these binary volatile states .

HAVEGE (HArdware Volatile Entropy Gathering and Expansion) is a user-level software unpredictable random number generator for general-purpose computers that exploits these modifications of the internal volatile hardware states as a source of uncertainty.

During an initialization phase, the hardware clock cycle counter of the processor is used to gather part of this entropy:  tens of thousands of unpredictable bits can be gathered per operating system call in average.

HAVEGE  combines on-the-fly hardware volatile entropy gathering with pseudo-random number generation.

The internal state of HAVEGE includes thousands of internal volatile hardware states and is merely unmonitorable. Therefore HAVEGE features a very high security level. HAVEGE also reaches an unprecedented throughput for a software unpredictable random number generator: several hundreds of megabits per second on current workstations and PCs.

 
More on the HAVEGE approach:
A. Seznec, N. Sendrier, "HAVEGE: a user level software unpredictable random number generator", postscript, pdf
A Powerpoint  (or pdf ) presentation is also available.

Currently supported architectures and operating systems

o Sun UltraSparc I and II under Solaris, referred to as UII in Unix/Linux package
o Sun UltraSparc III under Solaris, referred to as UIII in Unix/Linux package
o Intel Pentium III (Pentium II/Celeron also) under Linux, referred to as PIII in Unix/Linux package
o Amd Athlon (and Duron) under Linux, referred to as ATHLON in Unix/Linux package
o PowerPC G4 under MacOS10, referred to as G4 in Unix/Linux package
oIntel Pentium 4 under Linux, referred to as  P4 in Unix/Linux package
o Intel Pentium III (Pentium II/Celeron also) under Windows  referred to as PIII in Windows package
(should work on most  recent Windows versions, back to Win98, Windows NT,  ..)
o Amd Athlon (and Duron) under Windows referred to as ATHLON in Windows package
(should work on most  recent Windows versions back to  Win98, Windows NT,  ..)
o Intel Pentium 4 under Windows referred to as  P4 in Windows package
(should work on most  recent Windows versions back to  Win98, Windows NT,  ..)
o Intel Itanium under Linux referred to as  ITANIUM in Itanium package

General overview of the HAVEGE package

For the HAVEGE packages for Visual C++ first read this.
 
The HAVEGE package includes two executable files (per processor target) and the associated source and makefile files.
It also allows you to build your own unpredictable random number generator.

Executable files

o HAVEGE(ARCH) : the HAVEGE generator leverages the entropy of the branch predictor, the introduction cache and the L1 data cache.

 
o ProbInitState(ARCH) is provided to the user for allowing it to check the unpredictability of the internal variables of the random number generator after the initialization phase of HAVEGE(ARCH)

Assembly source files

For a given platform (ARCH), two (short) specific assembly files are provided: HardClock(ARCH).s and HardTick(ARCH).s.
The functions HardTick() and HardClock() respectively coded in those files both return the value of the hardware clock counter. In addition HardTick() monitors the differences between two successive reads of the counter in order to track the occurrence of an OS interrupt.

Large files HAVEGE(ARCH).s is also part of the package, since compilation time for getting it from HAVEGE.c may be quite long (a few minutes for UltraSparc III and Athlon).

Caution: recompilation for getting HAVEGE(ARCH).s should be done using the same compiler and compiler options as for getting the original HAVEGE(ARCH).s

C source files

oHAVEGE.c is the main file for HAVEGE generator.
It  mainly consists of two  functions WALKINIT and COLLECTRAND, used for respectively initializing the data structures of the generator and for collecting random numbers.
- walkinit (int UCPU, int Frequency) initializes the data structures  of the HAVEGE generator. UCPU is the approximate number of milliseconds each of the two phases in the initialization is run. Frequency should be the clock frequency of the processor. walkinit calls WALKINIT (int niter1, int niter2).
-COLLECTRAND(int *RESULT, int SIZE) collects SIZE random numbers in array RESULT.
The collected numbers are exclusive-ORed with the initial content of the array.
Caution: the size of the RESULT array should be slightly higher than SIZE (500 extra words is OK for currently supported platforms).
Two interface functions are also provided:
-probeWalk(int *Res) allows to dump the content of the main array used by the random generator.
-defaultwalkinit(int Frequency) initializes the data structure of the generator using walkinit  with a recommended value.

Structure of the HAVEGE.c file:

The HAVEGE.c file includes declarations and #define files (PersoArch.h, DefaultParameters.h  or MyParameters.h).
  • PersoArch.h consists of parameters specific to each processor architecture
  • DefaultParameters.h  (or MyParameters.h) contains (a lot of ) #defines and the declarations (and initializations for MyParameters.h) of the internal variables of the HAVEGE generator.
  • Three files are included  as bodies of the three main while loops in respectively functions WALKINIT (initHAVEGE1, initHAVEGE2) and COLLECTRAND (WalkHAVEGE).
    Those three files are organized the same way:
  • The skeleton of a loop iteration is unrolled 1024 times.
  • Each of these iterations is personalized through the #defines in DefaultParameters.h (or MyParameters.h): constants but also operations are personalized in each iteration
  • Foreach individual architecture, a limited number of the unrolled iterations are part of the code, the remainder are ignored through the use of  the C preprocessor:  the number of unrolled iterations was tuned to the specificities of each of the architectures (mainly size of I-cache).
  • The structure of the basic iterations in the three while loops are described here. Their action as entropy gathering are also described.
  • ogetunpredictablerand.c features interface functions to use HAVEGE in an application. See   here.
    oPRNGIN+HardClock.c and PRNGOUT.c are provided to allow the user to combine its own favorite PRNGs with HAVEGE. See  here.
    oPackTest.c  + matrix.c + dfft.c + special-functions.c feature functions for on-line analysis of HAVEGE outputs. The overall NIST statistical suite has been adapted (see NIST  statistical suite).  Our implementation of this suite is described here

    Your  own generator of unpredictable numbers

    Instead of using the HAVEGE(ARCH) generator from the package,  you may want to use a  generator of  unpredictable random numbers which
    parameters are unknown to any other user. The HAVEGE package allows this functionality:  you can  randomly personalize your generator
    when installing the HAVEGE package:
     
    on Linux/Unix systems type, go in the directory you have downloaded the package and type:
    setenv  Arch (ARCH);  setenv PLUSPRNG NOPRNG ; make MyGenerator
    on Windows systems, you must get the CygWin environment. Type:
    make -f Makefile.(ARCH) MyGenerator
    Be patient, this can take a few minutes.
    First a file named MyParameters.h is generated.  This file contains unpredictable random instantiations of  constant values, but also operators (+, -,  exclusive-or, directions of shifts, directions of tests) used  in HAVEGE.c. It also contains unpredictable random initializations of the internal variables used by HAVEGE.
    Second a HAVEGE generator named MyHAVEGE(ARCH) is generated.

    Why should I use a personalized unpredictable number generator ?

    The advantage of using  MyHAVEGE(ARCH) instead of HAVEGE(ARCH) is an extra level of security.
    If a possible  observer has no  access to your MyHAVEGE(ARCH) executable then he/she gets extra unknown parameters to guess.
    The size of this set  of parameters  is very large:
    more than 256 Kbits  for the initial states of the internal variables
    +  several  thousands of operator choices
    +  several thousands of 16 bits constants
    +  several hundreds of 5 bits constants.

    An HAVEGE generator with identical loop iterations

    In HAVEGE(ARCH), the constants used in the different iterations of the three while loops are different. This artefact should make predicting the sequence even harder. Anyway it is our thesis that the unpredictability of the generated numbers comes from the impossibility for an external observer to rebuild the internal volatile hardware states in the processor. Therefore, we provide the candidate cryptanalyst for HAVEGE with an HAVEGE version where all the unrolled iterations are exactly those described here.
    on Linux/Unix systems type, go in the directory you have downloaded the package and type:
    setenv  Arch (ARCH); setenv PLUSPRNG NOPRNG; make  UnifHAVEGE
    on Windows systems, you must get the CygWin environment. Type:
    make -f Makefile.(ARCH)  UnifHAVEGE
    Be patient, this can take a few minutes.
    First a file named UnifParameters.h is generated.
    Second a HAVEGE generator named UnifHAVEGE(ARCH) is generated.

    Using HAVEGE

    oCaution:   HAVEGE is processor dependent, then you have to  correctly select the processor and platform !!

    be cool, in practice running HAVEGEPIII on an Athlon or a Pentium 4 or HAVEGEATHLON on a PIII  or a P4 provides unpredictable random numbers,
    same applies for UltraSparc II and UltraSparc III,  but  we are not able to evaluate the  average hardware entropy introduced by each  OS  interrupt :-)
    oUsage : HAVEGE(ARCH) (or MyHAVEGE(ARCH))  or HAVEGE(ARCH) (or MyHAVEGE(ARCH))  Frequency UCPU step  SIZE FILENAME
    (if you omit all parameters you will be prompted for them)
    - (ARCH) is the name of the architecture
    Frequency: you must provide the frequency in Mhz of the target processor (don't worry, if you don't know it simply  overestimate it)
    UCPU:  you must provide the  (approximate) number of  user CPU milliseconds that  each of the  two phases in the initialization of the generator will run.
    The goal of the initialization phase is to set the internal variables of the generator in an unpredictable state. In practice UCPU must be tailored  in such a way that you are guaranteed that a mininum number nminint  of OS interrupts  will be encountered  in each of the two phases in the initialization phase of the generator.
     
    Minimum  nminint values  for achieving this unpredictable state on an unloaded machine are provided  in the tables below.
    The average volume of uncertainty that we were able to  empirically collect on unloaded machines from the L1 cache and the L1 I-cache (+ branch predictor) are also given in the tables.
     
     
    Processor/OS
    UII Solaris
    UIII Solaris
    PentiumIII Linux
    Athlon Linux
    PentiumIII Windows
    Athlon Windows
    minimum  nminint
    4
    4
    4
    16
    8
    recommended UCPU
    1000
    1000
    1000
    1000
    1000
    1000
    Average entropy per OS interrupt
       
     
       
     
    L1 data cache
    8K bits 
    8K bits 
    4K bits
    8K bits
    4 K bits
    4K bits
    L1 I-cache + branch predictor
    64K bits
    64K bits
    32K bits
    32K bits
    16K bits 
    32K bits
     
     
    Processor/OS
    Pentium 4 Linux
    Pentium 4  Windows
    Itanium
    minimum  nminint
    16
    8
    16
    recommended UCPU
    1000
    1000
    1000
    Average entropy per OS interrupt
         
    L1 data cache
    2Kbits
    2Kbits
    2Kbits
    L1 I-cache + branch predictor
    32Kbits
    32Kbits
    8Kbits
    On loaded machines,  it is possible that, in some circumstances,  less entropy would  be collected per OS interrupt.   We recommend to use a value of  UCPU  that  guarantees that  a high number of OS interrupts  will be encountered during the initialization phase. This should secure  the  randomness of the intitial content of the internal variables of the generator. Notice that the initialization phase still only consumes  2 CPU seconds on an unloaded machine.
     
    -step is the number of calls of the HAVEGE generator: on each call, the initial content of  a 32 Mbyte is updated by XORing with the output of the generator.  From our experiments: one call is sufficient to get unpredictable content .

    - if choice != 0 then choice Mbytes are generated and analyzed  online. tests)
    if choice = 0 then 32 Mbytes are dumped in a file.

    -FILENAME is the name of the result file: 32 Mbytes of random numbers are dumped in this file.  This is a binary file.
    In the absence of FILENAME as parameters, resbin is the result file.

    o ProbInitState(ARCH)  Frequency  UCPU  choice  (FILENAME)
    ProbInitState is provided to the user for allowing him/her to check the unpredictability of the internal variables of the random number generator after the initialization phase of HAVEGE(ARCH).
    if choice != 0 then choice Mbytes are generated and analyzed  online
    if  choice =0 then 16 Mbytes, i.e. 512  (1024 for Pentium 4) consecutive initializations of the internal  variables of HAVEGE ,  are dumped to FILENAME (or resbin).
    If you get experiments where ProbInitState(ARCH) fails to provide unpredictable numbers when using a value of UCPU higher than 100  then please  report

    Integrating HAVEGE in an application

    You can test HAVEGE in an application as follows.
    Just link your application with HAVEGE(ARCH).s (or HAVEGE.o if you have personalized your HAVEGE copy), HardTick(ARCH).s and HardClock(ARCH).s (and may be getunpredictablerand.c)
     
  • You can manage the random number generation by yourself  through the use of  walkinit (int  UCPU, int Frequency) or defaultwalkinit(int Frequency) and COLLECTRAND (int *RESULT, int SIZE). This management is at your own risk:  remember that internal variables of the generator must be initialized and that the size of the RESULT array must exceed SIZE by some margin. For current implementation, this margin is smaller than 500.  Also remember that HAVEGE uses  the volatile hardware states as its source of entropy, then it is preferrable to call COLLECTRAND with a large SIZE, in order to allow a few OS interrupts to occur during a single call.
  • Or you can use the two interface functions we provide in getunpredictablerand.c.  int getunpredictablerand(int Frequency ) returns a single 32 bits integer. getunpredictablerandarray (int Frequency, int *RESULT, int SIZE) returns SIZE unpredictable integers in array RESULT. These functions take care of all the details you do not want to deal with (initialization of the generator, collecting random numbers, ..). The dark side: we use an extra 32 Mbytes array!

    Combining HAVEGE with classical PRNGs

    We claim that the output sequences of HAVEGE are unpredictable.

    However some users may want to combine HAVEGE with their favorite pseudo-random number generators (PRNG) to get an extra level of confidence on the unpredictability of the sequences. The HAVEGE packages allows a simple implementation of such combinations with pseudo-random number generators (PRNG) in the two following ways.
     

  • The call to the function reading the hardware clock counter can be replaced by a call to a PRNG PRNGIN(). On each call, the data read on the hardware clock counter is inserted in the internal state of PRNGIN().
  • The output of HAVEGE can be used to update the internal state of a PRNG PNRGout(), i.e., for 32 bits produced by PRNGout(), 32 bits from HAVEGE are inserted in the internal state of PRNGout().

  • Examples of PRNGIN() and PRNGout() are provided respectively in PRNGIN+HardClock.c and PRNGOUT.c. The current functions are toy functions that can be easily  inverted (just need 16 Gigabytes :-). The user is supposed to edit these functions and replace the toy PRNGs by its own PRNGs.

    To build such combined HAVEGE + PRNG generators,

    on Linux/Unix systems type, go in the directory you have downloaded the package and type:
    setenv Arch (ARCH); setenv PLUSPRNG PLUSPRNGINOUT; make all
    on Windows systems, you must get the CygWin environment, edit  the Makefile.(ARCH) file and type:
    make -f Makefile.(ARCH) all

    How do I test unpredictability ?

    The unpredictability (i.e. uniform distribution and irreproducibility) of the random number sequences generated by HAVEGE were tested using the three following packages:
     Diehard
     Ent

    NIST statistical suite

    Online analysis implementing the NIST statiscal suite is proposed as an option . An example of the output of such an analysis is available here.
    restriction: to use online analysis on Windows systems, you must use the CygWin environment.

    A lot of interesting pointers to web sites on random numbers can be found  here

    We are interested in other uniform distribution tests and feedback on the generator: seznec@irisa.fr

    Copyright Notice

    The software uses the process " Improved random number generator " for which a french patent was registered under the number 01 13117 on October 11, 2001


     
     
     
    The software HAVEGE in its version 1.0 of the December 6, 2001 , hereinafter referred to as " the software " has been designed and produced by André Seznec, the researchers of the project CAPS, a research project of the National Computer and Automatics Institute - INRIA - Rennes Research Unit. INRIA Domaine de Voluceau Rocquencourt 78153 Le Chesnay Cedex.

    The software has been registered at the Agency for the Protection of Programmes (APP) under the number IDDN.FR.001.500017.00.S.P.2001.000.10000. This Software is ã copyright INRIA - 2001. INRIA holds all the ownership rights on the Software.

    The scientific community is asked to use the SOFTWARE in order to test and evaluate it. INRIA freely grants the right to use the Software. Any use or reproduction of this Software to obtain profit or for commercial ends being subject to obtaining the prior express authorization of INRIA.

    INRIA authorizes any reproduction of this Software
    - in limits defined in clauses 9 and 10 of the Berne agreement for the protection of literary and artistic works respectively specify in their paragraphs 2 and 3 authorizing only the reproduction and quoting of works on the condition that :

    - "this reproduction does not adversely affect the normal exploitation of the work or cause any unjustified prejudice to the legitimate interests of the author".
    - "that the quotations given by way of illustration and/or tuition conform to the proper uses and that it mentions the source and name of the author if this name features in the source",
    - under the condition that this file is included with any reproduction.
    Any commercial use made without obtaining the prior express agreement of INRIA would therefore constitute a fraudulent imitation.

    The Software being currently developed, INRIA is assuming no liability, and should not be responsible, in any manner or any case, for any direct or indirect damages sustained by the user.