Iterative rational Krylov algorithm
The iterative rational Krylov algorithm (IRKA) is an iterative algorithm useful for model order reduction (MOR) of single-input single-output (SISO) linear time-invariant dynamical systems.[1] At each iteration, IRKA performs a Hermite-type interpolation of the original system's transfer function. Each interpolation requires solving $r$ shifted pairs of linear systems, each of size $n \times n$, where $n$ is the original system order and $r$ is the desired reduced model order (usually $r \ll n$).
The algorithm was first introduced by Gugercin, Antoulas and Beattie in 2008.[2] It is based on a first order necessary optimality condition, initially investigated by Meier and Luenberger in 1967.[3] The first convergence proof of IRKA was given by Flagg, Beattie and Gugercin in 2012,[4] for a particular class of systems.
MOR as an optimization problem
Consider a SISO linear time-invariant dynamical system, with input $v(t)$ and output $y(t)$:

$$\begin{cases} \dot{x}(t) = A x(t) + b v(t) \\ y(t) = c^T x(t) \end{cases} \qquad A \in \mathbb{R}^{n \times n},\ b, c \in \mathbb{R}^{n},\ v(t), y(t) \in \mathbb{R},\ x(t) \in \mathbb{R}^{n}.$$
Applying the Laplace transform with zero initial conditions, we obtain the transfer function $G$, which is a ratio of polynomials:

$$G(s) = c^T (sI - A)^{-1} b, \quad A \in \mathbb{R}^{n \times n},\ b, c \in \mathbb{R}^{n}.$$
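For illustration, $G(s)$ can be evaluated numerically by solving a single $n \times n$ linear system per evaluation point. A minimal Python sketch (the example matrices here are arbitrary illustrative data, not taken from the references):

```python
import numpy as np

def transfer_function(A, b, c, s):
    """Evaluate G(s) = c^T (sI - A)^{-1} b via one n-by-n linear solve."""
    n = A.shape[0]
    return c @ np.linalg.solve(s * np.eye(n) - A, b)

# Small stable example system (arbitrary illustrative data).
A = np.array([[-1.0,  0.5],
              [ 0.0, -2.0]])
b = np.array([1.0, 1.0])
c = np.array([1.0, 0.0])
print(transfer_function(A, b, c, 1.0j))  # G evaluated on the imaginary axis
```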
Assume $G$ is stable. Given $r < n$, MOR tries to approximate the transfer function $G$ by a stable rational transfer function $G_r$ of order $r$:

$$G_r(s) = c_r^T (s I_r - A_r)^{-1} b_r, \quad A_r \in \mathbb{R}^{r \times r},\ b_r, c_r \in \mathbb{R}^{r}.$$
A possible approximation criterion is to minimize the absolute error in the $H_2$ norm:

$$G_r \in \underset{\dim(\hat{G}) = r,\ \hat{G}\text{ stable}}{\operatorname{arg\,min}} \|G - \hat{G}\|_{H_2}, \quad \|G\|_{H_2}^2 := \frac{1}{2\pi} \int_{-\infty}^{\infty} |G(ja)|^2 \, da.$$
This is known as the $H_2$ optimization problem. This problem has been studied extensively and is known to be non-convex,[4] which implies that it is usually difficult to find a global minimizer.
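As a side note, the $H_2$ norm above need not be computed by quadrature: for a stable system it satisfies $\|G\|_{H_2}^2 = c^T P c$, where $P$ is the controllability Gramian solving the Lyapunov equation $AP + PA^T + bb^T = 0$. This Gramian characterization is a standard fact, not specific to the cited references; a minimal sketch:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm(A, b, c):
    """H2 norm of G(s) = c^T (sI - A)^{-1} b for stable A, via the
    controllability Gramian P solving A P + P A^T + b b^T = 0."""
    P = solve_continuous_lyapunov(A, -np.outer(b, b))
    return float(np.sqrt(c @ P @ c))
```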
Meier–Luenberger conditions
The following first order necessary optimality condition for the $H_2$ problem is of great importance for the IRKA algorithm.
Theorem ([2, Theorem 3.4], [4, Theorem 1.2]). Assume that the $H_2$ optimization problem admits a solution $G_r$ with simple poles, denoted $\lambda_1(A_r), \ldots, \lambda_r(A_r)$. Then $G_r$ must be a Hermite interpolant of $G$ through the reflected poles of $G_r$:

$$G_r(\sigma_i) = G(\sigma_i), \quad G_r'(\sigma_i) = G'(\sigma_i), \quad \sigma_i = -\lambda_i(A_r), \quad \forall\, i = 1, \ldots, r.$$
Note that the poles $\lambda_i(A_r)$ are the eigenvalues of the reduced $r \times r$ matrix $A_r$.
Hermite interpolation
A Hermite interpolant $G_r$ of the rational function $G$ through $r$ distinct points $\sigma_1, \ldots, \sigma_r \in \mathbb{C}$ has components

$$A_r = W_r^* A V_r, \quad b_r = W_r^* b, \quad c_r = V_r^* c, \quad A_r \in \mathbb{R}^{r \times r},\ b_r \in \mathbb{R}^{r},\ c_r \in \mathbb{R}^{r},$$

where the matrices $V_r = (v_1 \mid \cdots \mid v_r) \in \mathbb{C}^{n \times r}$ and $W_r = (w_1 \mid \cdots \mid w_r) \in \mathbb{C}^{n \times r}$ may be found by solving $r$ dual pairs of linear systems, one for each shift [4, Theorem 1.1]:

$$(\sigma_i I - A) v_i = b, \quad (\sigma_i I - A)^* w_i = c, \quad \forall\, i = 1, \ldots, r.$$
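A minimal Python sketch of this construction (dense solves for clarity; a large-scale implementation would typically use sparse or iterative solvers, and the formulas above implicitly assume the bases are normalized so that $W_r^* V_r = I$):

```python
import numpy as np

def hermite_interpolant(A, b, c, shifts):
    """Build V_r, W_r from the r primal/dual shifted solves and form the
    reduced model (A_r, b_r, c_r) interpolating G at the given shifts."""
    n = A.shape[0]
    I = np.eye(n)
    # One primal and one dual n-by-n solve per shift.
    V = np.column_stack([np.linalg.solve(s * I - A, b) for s in shifts])
    W = np.column_stack([np.linalg.solve((s * I - A).conj().T, c) for s in shifts])
    Ar = W.conj().T @ A @ V   # A_r = W_r^* A V_r
    br = W.conj().T @ b       # b_r = W_r^* b
    cr = V.conj().T @ c       # c_r = V_r^* c
    return Ar, br, cr, V, W
```

The interpolation conditions $G_r(\sigma_i) = G(\sigma_i)$ can then be checked numerically against the `transfer_function` sketch above.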
As can be seen from the previous section, finding a Hermite interpolant $G_r$ of $G$ through $r$ given points is relatively easy. The difficult part is finding the correct interpolation points. IRKA tries to approximate these "optimal" interpolation points iteratively.
To this end, IRKA starts with $r$ arbitrary interpolation points (closed under conjugation), and then, at each iteration $m$, it imposes the first order necessary optimality condition of the $H_2$ problem:

1. Find the Hermite interpolant $G_r$ of $G$ through the current $r$ shift points $\sigma_1^m, \ldots, \sigma_r^m$.
2. Update the shifts by using the poles of the new $G_r$: $\sigma_i^{m+1} = -\lambda_i(A_r),\ \forall\, i = 1, \ldots, r$.
The iteration is stopped when the relative change in the set of shifts between two successive iterations is less than a given tolerance. This condition may be stated as:

$$\frac{|\sigma_i^{m+1} - \sigma_i^m|}{|\sigma_i^m|} < \text{tol}, \quad \forall\, i = 1, \ldots, r.$$
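In code, the shift update and this stopping test might look as follows (a sketch; pairing old and new shifts by sorting is one simple heuristic, not prescribed by the references):

```python
import numpy as np

def update_shifts(sigma_old, Ar, tol=1e-6):
    """Reflect the poles of G_r to get new shifts; test relative change."""
    sigma_new = -np.linalg.eigvals(Ar)
    # Pair old and new shifts by sorting (real part, then imaginary part).
    s_old = np.sort_complex(np.asarray(sigma_old))
    s_new = np.sort_complex(sigma_new)
    converged = bool(np.all(np.abs(s_new - s_old) / np.abs(s_old) < tol))
    return sigma_new, converged
```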
As already mentioned, each Hermite interpolation requires solving $r$ shifted pairs of linear systems, each of size $n \times n$:

$$(\sigma_i^m I - A) v_i = b, \quad (\sigma_i^m I - A)^* w_i = c, \quad \forall\, i = 1, \ldots, r.$$
Also, updating the shifts requires finding the $r$ poles of the new interpolant $G_r$, that is, the $r$ eigenvalues of the reduced $r \times r$ matrix $A_r$.
The following is pseudocode for the IRKA algorithm [2, Algorithm 4.1]:
algorithm IRKA
    input: $A, b, c$, $\text{tol} > 0$, $\sigma_1, \ldots, \sigma_r$ closed under conjugation
    $(\sigma_i I - A) v_i = b, \ \forall\, i = 1, \ldots, r$  % Solve primal systems
    $(\sigma_i I - A)^* w_i = c, \ \forall\, i = 1, \ldots, r$  % Solve dual systems
    **while** relative change in $\{\sigma_i\}$ > tol
        $A_r = W_r^* A V_r$  % Reduced order matrix
        $\sigma_i = -\lambda_i(A_r), \ \forall\, i = 1, \ldots, r$  % Update shifts, using poles of $G_r$
        $(\sigma_i I - A) v_i = b, \ \forall\, i = 1, \ldots, r$  % Solve primal systems
        $(\sigma_i I - A)^* w_i = c, \ \forall\, i = 1, \ldots, r$  % Solve dual systems
    end while
    **return** $A_r = W_r^* A V_r, \ b_r = W_r^* b, \ c_r^T = c^T V_r$  % Reduced order model
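Combining the pieces above, a compact runnable Python sketch of the whole loop (dense linear algebra throughout; the function name, iteration cap, and sorting-based convergence test are illustrative choices, not part of the cited algorithm):

```python
import numpy as np

def irka(A, b, c, sigma, tol=1e-6, max_iter=100):
    """IRKA sketch for a SISO system (A, b, c); `sigma` holds the r
    initial shifts, assumed closed under conjugation."""
    n = A.shape[0]
    I = np.eye(n)
    sigma = np.asarray(sigma, dtype=complex)
    for _ in range(max_iter):
        # Solve the r primal and r dual shifted n-by-n systems.
        V = np.column_stack([np.linalg.solve(s * I - A, b) for s in sigma])
        W = np.column_stack([np.linalg.solve((s * I - A).conj().T, c) for s in sigma])
        Ar = W.conj().T @ A @ V             # reduced order matrix A_r
        sigma_new = -np.linalg.eigvals(Ar)  # update shifts from poles of G_r
        rel_change = np.max(np.abs(np.sort_complex(sigma_new) - np.sort_complex(sigma))
                            / np.abs(np.sort_complex(sigma)))
        sigma = sigma_new
        if rel_change < tol:                # shifts have settled
            break
    br = W.conj().T @ b
    cr = V.conj().T @ c
    return Ar, br, cr                       # reduced order model
```

When the shifts are complex, the returned matrices are complex; a real reduced model can be obtained by working with real bases for the same subspaces, a standard refinement omitted here for brevity.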
Convergence

A SISO linear system is said to have a symmetric state space (SSS) whenever $A = A^T$ and $b = c$. Such systems appear in many important applications, such as the analysis of RC circuits and inverse problems involving the 3D Maxwell equations.[4] For SSS systems with distinct poles, the following convergence result has been proven:[4] "IRKA is a locally convergent fixed point iteration to a local minimizer of the $H_2$ optimization problem."
Although there is no convergence proof for the general case, numerous experiments have shown that IRKA often converges rapidly for different kinds of linear dynamical systems.[1][4]
Extension

The IRKA algorithm has been extended by the original authors to multiple-input multiple-output (MIMO) systems, as well as to discrete-time and differential-algebraic systems [1][2, Remark 4.1].
References

1. "Iterative Rational Krylov Algorithm". MOR Wiki. Retrieved 3 June 2021.
2. Gugercin, S.; Antoulas, A.C.; Beattie, C. (2008), "$H_2$ Model Reduction for Large-Scale Linear Dynamical Systems", SIAM Journal on Matrix Analysis and Applications, vol. 30, pp. 609–638.
3. Meier, L.; Luenberger, D.G. (1967), "Approximation of linear constant systems", IEEE Transactions on Automatic Control, vol. 12, pp. 585–588.
4. Flagg, G.; Beattie, C.; Gugercin, S. (2012), "Convergence of the Iterative Rational Krylov Algorithm", Systems & Control Letters, vol. 61, pp. 688–691.