Working Group Matrix Computations and Statistics

First workshop

Geneva, 4-5 May, 2001

Programme

Friday, 4th of May, 2001
------------------------

8h50-9h00   Openings

9h00-9h45   Jocelyne Erhel
            Iterative solvers for large sparse linear systems 
            (.ps file)

9h45-10h30  Stratis Gallopoulos
            Towards effective methods for computing matrix
            pseudospectra (.ps file)  

10h30-11h00 Coffee Break

11h00-11h45 Bernard Philippe 
            Parallel computation of the smallest singular values of a
            matrix (.ps file) 

11h45-12h30 George Moustakides
            Eigen-decomposition of a class of infinite dimensional
            block tridiagonal matrices (.ppt file) 

12h30-14h00 Lunch

14h00-14h45 Lars Eldén
            Solving quadratically constrained least squares problems using 
            a differential-geometric approach (.ps file) 
 
14h45-15h30 Bart De Moor
            Least-squares support vector machines  (.ps file)

15h30-16h00 Coffee Break

16h00-16h45 Hans Bruun Nielsen
            Algorithms and applications of Huber estimation 
            (.ps file) 

16h45-17h30 Roger Payne
            Analysis of variance, general balance and large data sets  
            (.ppt file) 

17h30-18h15 Erricos Kontoghiorghes, Paolo Foschi and Cristian Gatu
            Solving linear model estimation problems (.ps file)  
            
19h00-...   Dinner


Saturday, 5th of May 2001
-------------------------

9h00-9h45   Giorgio Pauletto and Manfred Gilli
            Solving economic and financial problems with parallel
            methods (.ps file)

9h45-10h30  Zahari Zlatev
            Computational challenges in the treatment of large-scale
            environmental models (.ppt file) 

10h30-11h00 Coffee Break

11h00-11h45 Rasmus Bro
            New multi-way models and algorithms for solving blind
            source separation problems  (.pdf file)

11h45-12h30 Discussion

12h30-14h00 Lunch

Abstracts

Iterative solvers for large sparse linear systems Jocelyne ErhelProjet Aladin, IRISA/INRIA-Rennes Campus de Beaulieu, 35042 Rennes cedex, France Large sparse linear systems arise in many scientific applications. Krylov iterative methods require less memory storage than direct methods, they can even be matrix-free, with no matrix storage, since the only operation required is matrix-vector product. For symmetric positive definite matrices, the method of choice is Conjugate Gradient. For indefinite matrices or unsymmetric matrices, the situation is far more complicated. Several methods exist, with no best one. In all cases, a difficult point is to find an efficient preconditioning method, which speeds-up the convergence at a low computational cost. Examples are given to illustrate the properties of Krylov methods. The final example comes from the discretisation of a 3D biharmonic problem. It shows that a promising preconditioning approach is by multigrid or multilevel methods. ------------------------------------------------------------------- Towards effective methods for computing matrix pseudospectra Stratis Gallopoulos Department of Computer Engineering & Informatics Patras 26500, Greece Given a matrix A, the computation of its pseudospectrum, that is the locus of eigenvalues of matrices of the form A+E, for E bounded in norm by some small epsilon, is a far more expensive task than the computation of characteristics such as the condition number and the matrix spectrum. As research of the last 15 years has shown, however, the matrix pseudospectrum provides valuable information that is not included in the other indicators. So the question is how to compute it efficiently and how do we build a tool that would facilitate engineers and scientists to make such analyses? We will consider this problem from the point of view of 1) the basic computational kernels, 2) domain based information, 3) parallelism and 4) the programming environment and will provide a review of our recent efforts on this subject. ------------------------------------------------------------------- Parallel computation of the smallest singular values of a matrix Bernard Philippe Projet Aladin, IRISA/INRIA-Rennes Campus de Beaulieu, 35042 Rennes cedex, France The smallest singular value of a matrix can be seen as its distance to singularity. For linear systems, it provides the condition number of the system to solve. When several singular values are zeros, it is often necessary to know the numerical rank of the matrix and to build a basis of the null space. The computation of the smallest singular value also occurs when determining the pseudospectrum of matrix for the purpose of analysing the sensitivity of eigenvalues with respect to matrix perturbations. There are different ways to compute such singular values depending on the considered matrices. For dense matrices, the method of choice is implemented in LAPACK. For large and sparse matrices, there exist different methods which are based on the construction of subspaces of small or medium dimensions (Lanczos, Davidson, Trace Minimization). We shall discuss their respective advantages as well as their efficiency on parallel computers. ------------------------------------------------------------------- Eigen-decomposition of a class of infinite dimensional tri-diagonal matrices George V. Moustakides Department of Computer Engineering and Informatics, University of Patras, 26 500 Patras, Greece We consider the eigen-decomposition problem of a special class of infinite dimensional block tri-diagonal matrices. Such problems occur in linearly polarized laser fields when one attempts to analyze the scattering of free electrons. We show that by using Discrete Fourier Transform we can reduce the original infinite dimensional eigen-decomposition problem into a combination of a linear system of differential equations and a new eigenvalue-eigenvector problem both being of the size of a single block. We consider numerical integration methods for the solution of the differential equation that respect the special structure of its solution. Competing methods are compared with respect to their accuracy and computational complexity. Finally we propose an FFT based technique that can efficiently compute the eigenvectors of the original infinite dimensional problem, from the corresponding eigenvectors of the finite dimensional one. ------------------------------------------------------------------- Solving quadratically constrained least squares problems using a differential-geometric approach Lars Eldén Department of Mathematics, Linköping University, SE-581 83, Linköping, Sweden email: laeld@math.liu.se A quadratically constrained linear least squares problem is usually solved using a Lagrange multiplier for the constraint and then solving numerically a nonlinear secular equation for the optimal Lagrange multiplier. It is well-known that, due to the closeness to a pole for the secular equation, standard methods for solving the secular equation can be very slow. The problem can be reformulated as that of minimizing the residual of the least squares problem on the unit sphere. Using a differential-geometric approach we formulate Newton's method on the sphere, and thereby avoid the difficulties associated with the Lagrange multiplier formulation. This Newton method on the sphere can be implemented efficiently, and since its convergence is often quite fast it appears to be superior to the Lagrange multiplier method. A few numerical examples are given. We also discuss briefly the extension to orthogonal Procrustes problems. ------------------------------------------------------------------- Least-squares support vector machines Bart De Moor Department Electrical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10 B-3001 Leuven-Heverlee Belgium LS-SVM is a recently developed parametrization for non-linear multi-dimensional regression problems. Data vectors are non-linearly mapped, via so-called feature vectors, into a high-dimensional space (possibly infinite), in which the regression problem can be formulated as a regularized least squares problem. Via a so-called Mercer condition and by using vectors of Lagrange multipliers, the problem reduces to solving a square set of linear equations, of the size of the number of data points (which might be large). In this talk, we will develop the main ideas behind LS-SVM and comment on the computational challenges for large regression problems. We will also show how in the same framework, the Mercer condition allows for the formulation of non-linear kernel PCA (principal components analysis) and non-linear kernel CCA (canonical correlations analysis). ------------------------------------------------------------------- Algorithms and applications of Huber estimation Hans Bruun Nielsen Department of Mathematical Modelling, Technical University of Denmark Huber's M-estimator was proposed for robust parameter estimation. It can be interpreted as a combination of L1- and L2-estimators: A threshold T distinguishes between small and large residuals, contributing to the objective function with their square and their absolute value, respectively. There results a piecewise quadratic, convex function. We present an efficient method for finding a minimizer of this 'Huber function'. Further, the algorithm has a number of applications: If T goes to zero, then the 'Huber solution' converges piecewise linearly to an L1 solution of the problem, and the active set is identified for a strictly positive value of T, i.e. while it still has a smoothing (beneficial) effect on the iteration. This approach can be generalized to an algorithm for linear programming (LP) problems. Finally, we show how the Huber algorithm can be used to solve box constrained quadratic programming (QP) problems. ------------------------------------------------------------------- Analysis of variance, general balance and large data sets Roger Payne IACR-Rothamsted, Harpenden, Herts, AL5 2JQ, U.K. General balance is relevant to experimental designs with several block (or error) terms. The total sum of squares can then be partitioned up into components known as strata, one for each block term, containing the sum of squares for the treatment terms estimated between the units of that stratum and a residual representing the random variability of those units. The properties of a generally balanced design are that (1) the block (or error) terms are mutually orthogonal, (2) the treatment terms are also mutually orthogonal, and (3) the contrasts of each treatment term all have equal efficiency factors in each of the strata where they are estimated. The treatment estimates in the various strata can be calculated very efficiently using the algorithm of Payne & Wilkinson (1976). This does not require the formation and inversion of matrices of sums of squares and products between treatment effects. Instead it involves a sequence of "sweep" operations (which can be formulated as projection operators). When treatment effects are estimated in several stratum, estimates that combine this information can be calculated using the algorithm of Payne & Tobias (1992). These two algorithms can handle models with large numbers of treatments much more efficiently than conventional methods. However, data sets of this size arising from recent application areas such as data mining will usually be unbalanced. Related algorithms exist that can handle unbalanced data (see e.g. Hemmerle 1974 and Worthington 1975), and it might be interesting to see how these compare with conventional methods on large data sets. ------------------------------------------------------------------- Solving linear model estimation problems Erricos Kontoghiorghes, Paolo Foschi and Cristian Gatu Institut d'informatique, Universite de Neuchatel, Rue Emile-Argand 11, Case Postale 2, CH-2007 Neuchatel, Switzerland Algorithms for solving large-scale linear models are considered. Numerically stable methods for computing the least-squares estimators and regression diagnostics of the standard and general linear models, seemingly unrelated regression and simultaneous equations models are discussed. The least-squares computations are based on orthogonal transformations, in particular the QR decomposition. ------------------------------------------------------------------- Solving economic and financial problems with parallel methods Giorgio Pauletto and Manfred Gilli Department of Econometrics, University of Geneva, Uni Mail 40 Bd du Pont d'Arve CH-1211 Genève 4 Switzerland The use of computational methods in economics and finance has greatly increased in the recent years. We will present some of the recent studies carried out in macroeconometric model simulation and financial options valuation here at the Department of Econometrics of University of Geneva. We will also give an overview of future applications of computational intensive methods in economics and finance. The simulation of large macroeconometric models containing forward-looking variables can become impractical when using exact Newton methods. In such cases, Krylov methods provide an interesting alternative. We also discuss a block preconditioner suitable for the particular class of models solved and study the parallel solution of the sparse linear system arising in the Newton algorithm. Finance is another area of application of intensive computational methods. We examine the valuation of options on several underlying assets using finite difference methods. Implicit methods, which have good convergence and stability properties, can be implemented efficiently thanks to Krylov solvers. A parallel implementation on a cluster of PCs is carried out showing that large problems can be efficiently tackled particularly when a fine grid space is needed. ------------------------------------------------------------------- Computational challenges in the treatment of large-scale environmental models Zahari Zlatev National Environmental Research Institute Department for Atmospheric Environment Frederiksborgvej 399, P. O. Box 358 DK-4000 Roskilde, Denmark Environmental models are typically described by systems of partial differential equations (PDEs) that are defined on large space domains. The number of equations is equal to the number of species that are to be studied by the model. The PDE system can be transformed, by using of splitting and discretization techniques to systems of linear algebraic equations that are to be treated numerically during many (typically several thousand) time-steps. The size of these systems depends on the resolution. High resolution 3-D models, which are defined on a (480x480x10) grid covering the whole of Europe, contain more than 80 million equations when a chemical scheme with 35 species is adopted in the model. Although it is desirable to solve this problem, it is still not possible to achieve this even when the fastest available computers are used. This is why some simplifications are always introduced in order to make the problems tractable numerically. In the above example, the problem can be solved by using 2-D versions of the model, by which the computational tasks are, roughly speaking, reduced by a factor of ten. Many theoretical and numerical problems have to be solved in an efficient way in order to improve the existing environmental problems. Some of these problems will be sketched in this talk. ------------------------------------------------------------------- New multi-way models and algorithms for solving blind source separation problems Rasmus Bro Chemometrics Group, Dept. of Dairy and Food Science The Royal Veterinary and Agricultural University Rolighedsvej 30, DK-1958 Frederiksberg C, Denmark In the sixties and seventies, new models were developed in psychometrics, for handling data that consists of several sets of matrices. Such data can be arranged in a box rather than a table, and are called multi-way data. Analogously to standard two-way techniques such as principal component analysis/SVD, methods exist for decomposing multi-way data. One particular of these methods, namely the PARAFAC model (PARallel FACtor analysis) is very interesting in that it provides parameters that are unique up to simple scaling and permutation. Having access to two or more matrices which follow the same bilinear model, only in different proportions, complete identification of the underlying components in the mixtures can be obtained. This is in contrast to the problems of rotational indeterminacy in traditional bilinear modeling and is of practical importance in a number of disciplines such as psychology, chemistry and DSP. Examples will be given of different models as well as a discussion of current algorithms and their problems.