Multidimensional Scaling (MDS) is a linear embedding method used when we only know the pairwise distances between data points. For linear systems, MDS works well with as little as 10 points and the system is 2 dimensional. For this program, you choose the window size or embedding for the system and MDS identifies what the true embedding should be. The following program uses Logistic map data when r = 3.2 but you can change r to see how well MDS works as the map bifurcates and eventually becomes chaotic and nonlinear at 4. MDS breaks down at this point and the embedding defaults to just 1 over the embedding you choose.

# Requires Numpy and Scipy libraries
from numpy import *
import scipy.linalg
import random
r=3.2 # parameter on Logistic map
n = 10 # the number of timeseries points to use
d = 40 # your embedding choice
# Create a time series
def timeseries(number,throw, d):
xList = []
dataX0 = []
t = 0
while t < d:
xList.append(.1)
t += 1
while t <= number+throw+d:
x = r * xList[t-1] * (1 - xList[t-1]) # Logistic Map
xList.append(x)
if t > throw:
y = xList[t-1]
dataX0.append(x)
t = t + 1
return dataX0
# Construct an n x n centering matrix
# The form is P = I - (1/n) U where U is a matrix of all ones
def centering_matrix(n):
P = eye(n) - 1/float(n) * ones((n,n))
return P
def create_data(number, d):
dataX0 = timeseries(number, 500, d)
# Create a sliding window of size d
data = []
i = d
while i < len(dataX0):
data.append([])
j = i - d
while j < i:
data[len(data)-1].append(dataX0[j])
j = j + 1
i = i + 1
return data
def main():
print "n =",n, ": number of timeseries points used"
print "d =",d, ": embedding"
P = centering_matrix(n)
# Create the data with the number of dimensions
data = create_data(n, d)
X = pairwise_distances(data)
A = -1/2.0 * P * X * P
# Calculate the eigenvalues/vectors
[vals, vectors] = scipy.linalg.eig(A)
# Sort the values
vals = sort(vals)
embedding = 0
for x in vals:
if x > 10**(-10):
embedding += 1
print "mds embedding =",embedding
# Compute the pairwise distance between vector x and y
def metric(x1, y1):
d = 2
summ = []
i = 0
while i < len(x1):
# in this case use euclidean distance
summ.append((x1[i] - y1[i])**d)
i = i + 1
return sum(summ) ** (1 / float(d))
# Return a matrix of pairwise distances
def pairwise_distances(data):
distances = []
i = 0
while i < len(data):
distances.append([])
x1 = data[i]
j = 0
while j < len(data):
y1 = data[j]
distances[i].append(metric(x1, y1)**2)
j = j + 1
i = i + 1
return distances
main()

Wolf et al. (1985) outlined an algorithm that estimates the Lyapunov spectra of systems whose equations are known using local Jacobian matrices and Gram-Schmidt orthonormalization. Python code is available for Wolf’s algorithm and discrete maps and their inverted counterparts. I have adapted this code to estimate Lyapunov spectra for continuous-time systems like the Lorenz attractor and Rossler attractor. Additionally, Python code is available to generate time series for ordinary differential equations. Lyapunov spectrum code is also available on Clint Sprott’s website.

Source Code:

import math, operator, random
h = 0.01
cmax = 1000 # number of iterations to perform
choice = input("Which system would you like?\n (1) Rossler \n (2) Lorenz\n")
while str(choice) != "1" and str(choice) != "2":
print("Please select 1 or 2")
choice = input("Which system would you like?\n (1) Rossler \n (2) Lorenz\n")

print "\n",
a = 0.2
b = 0.2
c = 5.7
def derivs(x, xnew, n):
if choice == 1:
return Rossler(x, xnew, n)
else:
return Lorenz(x, xnew, n)
def Rossler(x, xnew, n):
# Nonlinear Rossler Equations
xnew[1] = -x[2]-x[3]
xnew[2] = x[1] + a * x[2]
xnew[3] = b + x[3] * (x[1] - c)
# Linearized Rossler Equations
xnew[4] = -1*x[7]-x[10]
xnew[5] = -1*x[8]-x[11]
xnew[6] = -1*x[9]-x[12]
xnew[7] = x[4] + a*x[7]
xnew[8] = x[5] + a*x[8]
xnew[9] = x[6] + a*x[9]
xnew[10] = x[3]*x[4] + x[1]*x[10] - c*x[10]
xnew[11] = x[3]*x[5] + x[1]*x[11] - c*x[11]
xnew[12] = x[3]*x[6] + x[1]*x[12] - c*x[12]
return [x, xnew]
def Lorenz(x, xnew, n):
# Nonlinear Lorenz Equations
xnew[1] = 10 * (x[2] - x[1])
xnew[2] = -1*x[1] * x[3] + 28 * x[1] - x[2]
xnew[3] = x[1] * x[2] - 8/3.0 * x[3]
# Linearized Lorenz Equations
xnew[4] = -10 * x[4] + 10 * x[7]
xnew[5] = -10 * x[5] + 10 * x[8]
xnew[6] = -10 * x[6] + 10 * x[9]
xnew[7] = 28*x[4]-x[3]*x[4] - x[7] - x[1]*x[10]
xnew[8] = 28*x[5]-x[3]*x[5] - x[8] - x[1]*x[11]
xnew[9] = 28*x[6]-x[3]*x[6] - x[9] - x[1]*x[12]
xnew[10] = x[2]*x[4] + x[1]*x[7] - 8/3.0 * x[10]
xnew[11] = x[2]*x[5] + x[1]*x[8] - 8/3.0 * x[11]
xnew[12] = x[2]*x[6] + x[1]*x[9] - 8/3.0 * x[12]
return [x, xnew]
def timeseries(cmax):
X0 = []
Y0 = []
Z0 = []
xList = []
yList = []
zList = []
changeInTime = h
# Initial conditions
if choice == 1:
# Rossler
X0.append(0.01)
Y0.append(0.01)
Z0.append(0.01)
else:
# Lorenz
X0.append(0)
Y0.append(1)
Z0.append(0)
t = 0
while len(xList) <= cmax:
[x,y,z] = Rk4o(X0, Y0, Z0, h, len(X0))
X0.append(x)
Y0.append(y)
Z0.append(z)
if 200 < t:
xList.append(x)
yList.append(y)
zList.append(z)
changeInTime += h
t = t + 1
return [xList, yList, zList]
def f(x,y,z):
if choice == 1:
dxdt = -y-z
else:
dxdt = 10 * (y - x)
return dxdt
def g(x,y,z):
if choice == 1:
dydt = x + a * y
else:
dydt = 28 * x - y - x*z
return dydt
def e(x,y,z):
if choice == 1:
dzdt = b + z * (x - c)
else:
dzdt = x * y - 8/3.0 * z
return dzdt
def Rk4o(xList, yList, zList, h, t):
k1x = h*f(xList[t-1],yList[t-1], zList[t-1])
k1y = h*g(xList[t-1],yList[t-1], zList[t-1])
k1z = h*e(xList[t-1],yList[t-1], zList[t-1])
k2x = h*f(xList[t-1] + k1x/2,yList[t-1] + k1y/2, zList[t-1] + k1y/2)
k2y = h*g(xList[t-1] + k1x/2,yList[t-1] + k1y/2, zList[t-1] + k1y/2)
k2z = h*e(xList[t-1] + k1x/2,yList[t-1] + k1y/2, zList[t-1] + k1y/2)
k3x = h*f(xList[t-1] + k2x/2,yList[t-1] + k2y/2, zList[t-1] + k2y/2)
k3y = h*g(xList[t-1] + k2x/2,yList[t-1] + k2y/2, zList[t-1] + k2y/2)
k3z = h*e(xList[t-1] + k2x/2,yList[t-1] + k2y/2, zList[t-1] + k2y/2)
k4x = h*f(xList[t-1] + k3x/2,yList[t-1] + k3y/2, zList[t-1] + k3y/2)
k4y = h*g(xList[t-1] + k3x/2,yList[t-1] + k3y/2, zList[t-1] + k3y/2)
k4z = h*e(xList[t-1] + k3x/2,yList[t-1] + k3y/2, zList[t-1] + k3y/2)
x = xList[t-1] + k1x/6 + k2x/3 + k3x/3 + k4x/6
y = yList[t-1] + k1y/6 + k2y/3 + k3y/3 + k4y/6
z = zList[t-1] + k1z/6 + k2z/3 + k3z/3 + k4z/6
return [x,y,z]
n = 3 # number of variables in nonlinear system
nn=n*(n+1) # total number of variables (nonlinear + linear)
m = 0
x = []
xnew = []
v = []
ltot = []
znorm = []
gsc = []
A = []
B = []
C = []
D = []
i = 0
while i <= nn:
x.append(0)
xnew.append(0)
v.append(0)
A.append(0)
B.append(0)
C.append(0)
D.append(0)
i = i + 1
i = 0
while i <= n:
ltot.append(0)
znorm.append(0)
gsc.append(0)
i = i + 1
irate=10 # integration steps per reorthonormalization
io= 100 # number of iterations between normalization
# initial conditions for nonlinear maps
# must be within the basin of attraction
# Generate a random transient before starting the initial conditions
i = 1
while i <= n:
v[i] = 0.001
i = i + 1
transient = random.randint(n,100000)
# Generate the initial conditions for the system
[tempx,tempy,tempz] = timeseries(transient)
v[1] = tempx[len(tempx)-1]
v[2] = tempy[len(tempy)-1]
v[3] = tempz[len(tempz)-1]
i = n+1
while i <= nn: # initial conditions for linearized maps
v[i]=0 # Don't mess with these; they are problem independent!
i = i + 1
i = 1
while i <= n:
v[(n+1)*i]=1
ltot[i]=0
i = i + 1
#print "v = ",v
t=0
w = 0
while (w < cmax):
j = 1
while j <= irate:
i = 1
while i <= nn:
x[i]=v[i]
i = i + 1
[x, xnew] = derivs(x, xnew, n)
i = 1
while i <= nn:
A[i] = xnew[i]
x[i] = v[i] + (h*A[i]) / 2.0
i = i + 1
[x, xnew] = derivs(x, xnew, n)
i = 1
while i <= nn:
B[i] = xnew[i]
x[i] = v[i] + (h*B[i]) / 2.0
i = i + 1
[x, xnew] = derivs(x, xnew, n)
i = 1
while i <= nn:
C[i] = xnew[i]
x[i] = v[i] + h*C[i]
i = i + 1
[x, xnew] = derivs(x, xnew, n)
i = 1
while i <= nn:
D[i] = xnew[i]
v[i] = v[i] + h*(A[i] + D[i] + 2*(B[i] + C[i]))/6.0
i = i + 1
t = t + h
j = j + 1
#construct new orthonormal basis by gram-schmidt:
znorm[1]=0 #normalize first vector
j = 1
while j <= n:
znorm[1]=znorm[1]+v[n*j+1]**2
j = j + 1
znorm[1] = math.sqrt(znorm[1])
j = 1
while j <= n:
v[n*j+1]=v[n*j+1]/znorm[1]
j = j + 1
#generate new orthonormal set:
j = 2
while j <= n:
k = 1
while k <= j-1:
gsc[k]=0
l = 1
while l <= n:
gsc[k]=gsc[k]+v[n*l+j]*v[n*l+k]
l = l + 1
k = k + 1
k = 1
while k <= n: #construct a new vector
l = 1
while l <= j-1:
v[n*k+j]=v[n*k+j]-gsc[l]*v[n*k+l]
l = l + 1
k = k + 1
znorm[j]=0 #calculate the vector's norm
k = 1
while k <= n: #construct a new vector
znorm[j]=znorm[j]+v[n*k+j]**2
k = k + 1
znorm[j]=math.sqrt(znorm[j])
k = 1
while k <= n: #normalize the new vector
v[n*k+j] = v[n*k+j] / znorm[j]
k = k + 1
j = j + 1
k = 1
while k <= n: #update running vector magnitudes
if znorm[k] > 0:
ltot[k] = ltot[k] + math.log(znorm[k])
k = k + 1
m = m + 1
if m % io == 0 or w == cmax-1: #normalize exponent and print every io iterations
lsum=0
kmax=0
k = 1
while k <= n:
le = ltot[k] / t
lsum = lsum + le
if lsum > 0:
lsum0 = lsum
kmax = k
k = k + 1
w = w + 1
if choice == 1:
print "Rossler:"
else:
print "Lorenz:"
print n, "LEs = "
lsum=0
kmax=0
k = 1
while k <= n:
le = ltot[k] / t
lsum = lsum + le
if lsum > 0:
lsum0 = lsum
kmax = k
print le
k = k + 1

Similar to Neural Networks, Support Vector Machines (SVM) are powerful modelling techniques in machine learning that can be applied to a variety of tasks. SVMs are primarily used to classify data while one of its variants, Support Vector Regression (SVR), can be used for time series analysis. In this post, we will perform SVR on chaotic time series using common Python libraries and a simple wrapper program.

The following libraries will be required to use the script I will demonstrate using Python 2.7:

Numpy

Scipy

Scikits.Learn (a toolbox that hosts a variety of machine learning algorithms and requires Numpy and Scipy)

MatPlotLib (used for graphing and visualization)

Other libraries such as SetupTools may be required depending on your system

With the packages installed, we can begin to build a program that fits an SVR model to chaotic data. First, we define a time series. In this case, I will use the Delayed Henon Map with a delay of 4, for more information please see my posts on that system.

The model will be embedded in a d dimension space.

def DelayedHenon(cmax, transient, d):
temp = []
rF = [] # Features
rT = [] # Targets
dX = []
i = 0
while i < d:
dX.append(0.1)
i = i + 1
while i <= cmax + transient:
x = 1 - 1.6 * dX[i-1] ** 2 + 0.1 * dX[i-4]
if i > transient-d:
temp.append(x)
if i > transient:
rT.append(x)
dX.append(x)
i = i + 1
rF = SlidingWindow(temp, d)
return {"F":rF, "T":rT}

I added a function that formats the data properly so it can be used by the SVR Fit function:

def SlidingWindow(x, d):
i = d
y = []
while i < len(x):
temp = []
j = 1
while j <= d:
temp.append(x[i - d + j])
j = j + 1
y.append(temp)
i = i + 1
return y

We then define the model, note that SVR has a few user defined parameters which can be chosen to fit the data better.

In this case, you can choose the following parameters to produce a model with an error near 0.0035 for 1024 points taken from the time series. In practice, you can create a training algorithm to find these values.

gam = 0.117835847192
c = 1.02352954164
eps = 0.00163811344978
tolerance = 0.00539604526663
d = 5
cmax = 1024

Finally, we create functions to fit the time series and plot the results:

def Fit(cmax, d, tolerance, gam, c, eps):
data = TimeSeries(cmax, random.randint(0,10000), d)
test = TimeSeries(cmax, random.randint(0,10000), d)
x = data["F"]
y = data["T"]
sout = test["T"]
clf = svm.SVR(kernel='rbf', degree=d, tol=tolerance, gamma=gam, C=c, epsilon=eps)
try:
m = clf.fit(x, y)
except:
return [1000, clf, sout, sout]
mout = clf.predict(test["F"])
# Determine the error for the system
err = 0
i = 0
while i < len(mout):
err += (mout[i] - sout[i])**2
i = i + 1
err = math.sqrt(err / (len(mout)+.0))
return [err, clf, mout, sout]
def Plotter(err, mout, sout, SaveGraph):
plt.title("SVR: Err=" + str(err))
plt.scatter(sout[0:len(sout)-2], sout[1:len(sout)-1], s=1, marker='o', edgecolors='none')
plt.scatter(mout[0:len(mout)-2], mout[1:len(mout)-1], s=1, marker='o', edgecolors='none', facecolor='r')
plt.show()

if SaveGraph:
try:
plt.savefig("svr")
except:
# Do Nothing
SaveGraph = SaveGraph
plt.clf()

Here is the final program after putting all of the functions together (and moving some stuff around) with example output:

#!/usr/bin/env python
from scikits.learn import svm
import random
import math
import matplotlib.pyplot as plt
d = 5 # Degree of the SVR model
cmax = 1024 # Number of points to run
def TimeSeries(cmax, transient, d):
return DelayedHenon(cmax, transient, d)
def main():
global cmax, d, perturbation
gam = 0.117835847192
c = 1.02352954164
eps = 0.00163811344978
tolerance = 0.00539604526663
# Create and Fit a model to data, we just need to create support vectors
[err, svr, mout, sout] = Fit(cmax, d, tolerance, gam, c, eps)
Plotter(err, mout, sout, False)
print "error: ", err

def DelayedHenon(cmax, transient, d):
temp = []
rF = [] # Features
rT = [] # Targets
dX = []
i = 0
while i < d:
dX.append(0.1)
i = i + 1
while i <= cmax + transient:
x = 1 - 1.6 * dX[i-1] ** 2 + 0.1 * dX[i-4]
if i > transient-d:
temp.append(x)
if i > transient:
rT.append(x)
dX.append(x)
i = i + 1
rF = SlidingWindow(temp, d)
return {"F":rF, "T":rT}

def Fit(cmax, d, tolerance, gam, c, eps):
data = TimeSeries(cmax, random.randint(0,10000), d)
test = TimeSeries(cmax, random.randint(0,10000), d)
x = data["F"]
y = data["T"]
sout = test["T"]
clf = svm.SVR(kernel='rbf', degree=d, tol=tolerance, gamma=gam, C=c, epsilon=eps)
try:
m = clf.fit(x, y)
except:
return [1000, clf, sout, sout]
mout = clf.predict(test["F"])
# Determine the error for the system
err = 0
i = 0
while i < len(mout):
err += (mout[i] - sout[i])**2
i = i + 1
err = math.sqrt(err / (len(mout)+.0))
return [err, clf, mout, sout]
def Plotter(err, mout, sout, SaveGraph):
plt.title("Delayed Henon Map and SVR Model")
p1 = plt.scatter(sout[0:len(sout)-2], sout[1:len(sout)-1], s=1, marker='o', edgecolors='none')
p2 = plt.scatter(mout[0:len(mout)-2], mout[1:len(mout)-1], s=1, marker='o', edgecolors='none', facecolor='r')
plt.legend([p2, p1], ["Model (Err="+ str(round(err,4))+")", "Map"], loc=8)
plt.draw()
plt.show()
if SaveGraph:
try:
plt.savefig("svr")
except:
# Do Nothing
SaveGraph = SaveGraph
plt.clf()
def SlidingWindow(x, d):
i = d
y = []
while i < len(x):
temp = []
j = 1
while j <= d:
temp.append(x[i - d + j])
j = j + 1
y.append(temp)
i = i + 1
return y
def WriteFile(err, BestTol, BestGam, BestC, BestEps, s):
# Write the model, sensitivities to a file
f = open("sensitivities", "w")
f.write("Model Parameters:\n")
f.write("d=" + str(d) + "\n")
f.write("gam=" + str(BestGam) + "\n")
f.write("c=" + str(BestC) + "\n")
f.write("eps=" + str(BestEps) + "\n")
f.write("tolerance=" + str(BestTol) + "\n")
f.write("Model Data:\n")
f.write("err=" + str(err) + "\n")
f.write("s=" + str(s) + "\n")
f.close()
main()

Example Output:

err: 0.00381722720161

To fit data taken from other systems, simply change the Timeseries function.

One can estimate the Lyapunov spectrum of dynamical systems and their inverted counterparts using local Jacobian matrices and Wolf’s algorithm. Basically, Jacobian matrices are calculated at each point in a trajectory and multiplied together to form a product matrix whose eigenvalues represent the Lyapunov exponents for the system studied. More specifically, these exponents measure the divergence of a ball of initial conditions as they move around an attractor, in this case, a strange attractor. As the Jacobians are multiplied together, Gram-Schmidt reorthonormalization is used to maintain the system of coordinates and unify the divergence because the ball of initial conditions quickly becomes an ellipisoid.

In 1985, Alan Wolf et al. published the paper that outlined a program that can be used to determine the spectrum of Lyapunov exponents for system’s whose equations are known. Wolf’s algorithm requires that the equations are linearized. After performing the necessary calculations, one can plug them into the program available here (in Python) and estimate the Lyapunov spectrum.

Here is an example of how to linearize the Henon map and more complex Tinkerbell map:

def tinkerbell(x, xnew, n):
a = 0.9
b = -0.6
c = 2.0
d = 0.5
# Nonlinear
xnew[1] = x[1]**2 - x[2]**2 + a*x[1] + b*x[2] # x
xnew[2] = 2*x[1]*x[2] + c * x[1] + d * x[2] # y
# Linearized
xnew[3] = 2 * x[1] * x[3] + a * x[3] - 2 * x[2] * x[5] + b * x[5] # delta x
xnew[4] = 2 * x[1] * x[4] + a * x[4] - 2 * x[2] * x[6] + b * x[6] # delta x
xnew[5] = 2 * x[2] * x[3] + c * x[3] + 2 * x[1] * x[5] + d * x[5] # delta y
xnew[6] = 2 * x[2] * x[4] + c * x[4] + 2 * x[1] * x[6] + d * x[6] # delta y
return [x, xnew]

To estimate the Lyapunov spectrum of the inverted system. One can simply reverse the signs on the exponent values of the forward system or take the more roundabout way and estimate them using Wolf’s algorithm.

To estimate the exponents, it is necessay to obtain to invert the system’s equations. In some cases, this is not possible such as the case of the Logistic map where each point could come from one of two previous points. To estimate the inverted system’s Jacobians (and the inverted system’s equations), one can simply invert the Jacobian matrix of the forward equations.

For the Henon map:

For the Tinkerbell map (and those with a magnifying glass):

The last issue that needs to be solved is generating data for the system. Since the system is inverted, the system has most likely turned from an attractor to a repellor and thus any trajectory will wander off to infinity. Therefore, we use the forward system’s equations and use the linearizations for the inverted system to estimate the Lyapunov spectrum. You can use the following Python program and plug in the code above to see an example.

Using the iterative 4-th order Runge-Kutta method as described here, we can create low dimensional slices of the system’s attractor known as Poincare Sections, Return maps, or Recurrence maps.

with a, b, and c set to 0.2, 0.2, and 5.7, respectively.

Poincare sections are important for visualizing an attractor that is embedded in greater than 3 spatial dimensions. Additionally, certain dynamical and topological properties are invariant to these transformations such as Lyapunov exponents and bifurcation diagrams (See Sprott 2003). However, the Lyapunov exponent with a value of 0 is lost due to the removal of the direction of the flow.

To build a Poincaré section, imagine that you have selected a slice of phase space. Every time the orbit passes through that slice, we collect a point. We repeat this over and over to create an image of what the attractor looks like in that slice of space.

We can also find local maximums to create a return map. In the following graph, we look for points that are local maximums and plot them versus the previous local maximum.

The following Python code uses the 4-th order Runge-Kutta method to create this return map.

import math, operator, random
dataX0 = []
dataY0 = []
dataZ0 = []
yList = []
xList = []
zList = []
count = 1
t = 1
x = -9
y = 0
z = 0
xList.append(x)
yList.append(y)
zList.append(z)
h = .01
a = .2
b = .2
c = 5.7
def f(x,y,z):
global a,b,c
dxdt = -y-z
return dxdt
def g(x,y,z):
global a,b,c
dydt = x + a * y
return dydt
def e(x,y,z):
global a,b,c
dzdt = b + z * (x - c)
return dzdt
def rk4o(x, y, z):
global h
k1x = h*f(x, y, z)
k1y = h*g(x, y, z)
k1z = h*e(x, y, z)
k2x = h*f(x + k1x/2.0, y + k1y/2.0, z + k1z/2.0)
k2y = h*g(x + k1x/2.0, y + k1y/2.0, z + k1z/2.0)
k2z = h*e(x + k1x/2.0, y + k1y/2.0, z + k1z/2.0)
k3x = h*f(x + k2x/2.0, y + k2y/2.0, z + k2z/2.0)
k3y = h*g(x + k2x/2.0, y + k2y/2.0, z + k2z/2.0)
k3z = h*e(x + k2x/2.0, y + k2y/2.0, z + k2z/2.0)
k4x = h*f(x + k3x, y + k3y, z + k3z)
k4y = h*g(x + k3x, y + k3y, z + k3z)
k4z = h*e(x + k3x, y + k3y, z + k3z)
x = x + k1x/6.0 + k2x/3.0 + k3x/3.0 + k4x/6.0
y = y + k1y/6.0 + k2y/3.0 + k3y/3.0 + k4y/6.0
z = z + k1z/6.0 + k2z/3.0 + k3z/3.0 + k4z/6.0
return [x,y,z]
X0 = []
Y0 = []
Z0 = []
previousLocalMax = x
localMax = x
t = 1
changeInTime = h
avgChange = []
LastPoint = changeInTime
while changeInTime < 20000 and len(dataX0) < 1000:
[x,y,z] = rk4o(xList[t-1], yList[t-1], zList[t-1])
xList.append(x)
yList.append(y)
zList.append(z)
if 201 < changeInTime:

# Print y and z points when x is between 3.5 and 3.75 (as shown in the figure)
if x < 3.75 and x > 3.5:
print y , "," , z
if x < xList[t-1] and xList[t-2] < xList[t-1]:
previousLocalMax = localMax
localMax = xList[t-1]
dataX0.append(localMax)
dataY0.append(previousLocalMax)
# Calculate the change between this and last point
avgChange.append(changeInTime - LastPoint)
LastPoint = changeInTime
t = t + 1
changeInTime += h

# Print the average steps between points
print "Average Number of Steps between points:", sum(avgChange[1:]) / float(len(avgChange[1:]))
print "First Return Map Length: " + str(len(dataX0))
f = open("rosseler.csv", "w")
i = 0
while i < len(X0) and i < 10000:
f.write(str(X0[i]) + "," + str(Y0[i])+ "," + str(Z0[i])+"\n")
i = i + 1
f.close()
f = open("fr-rossler-x", "w")
i = 0
while i < len(dataX0):
f.write(str(dataX0[i])+"\r")
i = i + 1
f.close()

Ordinary Differential Equations (ODEs) can be used to define systems in fields as varied as biology to engineering to mathematics. These equations express a relationship between an unknown function and its derivative. One example, the Rossler attractor,

We can produce a time series from these equations by solving the equations using the iterative 4-th order Runge-Kutta method and plugging each of the solutions back into the equations. Along with the following script which allows you to implement the Runge-Kutta method on ODEs, I have included code to numerically estimate the largest Lyapunov exponent. The largest Lyapunov exponent is used to indicate chaos, or sensitive dependence to initial conditions, within a system. For the Rossler attractor, defined parameters (a=b=.2, c=5.7), and initial conditions (x=-9, y=z= 0), the largest Lyapunov exponent is about 0.0714. Numerous resources are available for more information about ordinary differential equations and other systems that you may want to explore with this script.

import math, random

# Step Size
h = .001
#Initial conditions for Rossler system
x = -9
y = 0
z = 0
# Parameters for the Rossler System
a = .2
b = .2
c = 5.7
# The perturbation used to calculate the Largest Lyapunov exponent
perturb = .000000001
# Functions that define the system
def f(x,y,z):
global a,b,c
dxdt = -y-z
return dxdt
def g(x,y,z):
global a,b,c
dydt = x + a * y
return dydt
def e(x,y,z):
global a,b,c
dzdt = b + z * (x - c)
return dzdt
# randomly perturb the initial conditions to create variable time series
x = x + random.random() / 2.0
y = y + random.random() / 2.0
z = z + random.random() / 2.0
dataX0 = []
dataY0 = []
dataZ0 = []
yList = []
xList = []
zList = []
lamdaList = []
lyapunovList = []
t = 1
xList.append(x)
yList.append(y)
zList.append(z)
# Use the 4th order Runge-Kutta method
def rk4o(x, y, z):
global h
k1x = h*f(x, y, z)
k1y = h*g(x, y, z)
k1z = h*e(x, y, z)
k2x = h*f(x + k1x/2.0, y + k1y/2.0, z + k1z/2.0)
k2y = h*g(x + k1x/2.0, y + k1y/2.0, z + k1z/2.0)
k2z = h*e(x + k1x/2.0, y + k1y/2.0, z + k1z/2.0)
k3x = h*f(x + k2x/2.0, y + k2y/2.0, z + k2z/2.0)
k3y = h*g(x + k2x/2.0, y + k2y/2.0, z + k2z/2.0)
k3z = h*e(x + k2x/2.0, y + k2y/2.0, z + k2z/2.0)
k4x = h*f(x + k3x, y + k3y, z + k3z)
k4y = h*g(x + k3x, y + k3y, z + k3z)
k4z = h*e(x + k3x, y + k3y, z + k3z)
x = x + k1x/6.0 + k2x/3.0 + k3x/3.0 + k4x/6.0
y = y + k1y/6.0 + k2y/3.0 + k3y/3.0 + k4y/6.0
z = z + k1z/6.0 + k2z/3.0 + k3z/3.0 + k4z/6.0
return [x,y,z]
t = 1
changeInTime = h
startLE = True
while changeInTime < 20000: # Perform 20000 / h iterations
[x,y,z] = rk4o(xList[t-1], yList[t-1], zList[t-1])
xList.append(x)
yList.append(y)
zList.append(z)
if 200 < changeInTime: # Remove the transient after 200 / h iterations
if startLE:
cx = xList[t-1] + perturb
cy = yList[t-1]
cz = zList[t-1]
startLE = False
# Calculate the Largest Lyapunov Exponent
[cx, cy, cz] = rk4o(cx, cy, cz)
delx = cx - x
dely = cy - y
delz = cz - z
delR1 = ((delx)**2+(dely)**2+(delz)**2)
df = 1.0 / (perturb**2) * delR1
rs = 1.0 / math.sqrt(df)
cx = x + rs*delx
cy = y + rs*dely
cz = z + rs*delz
lamda = math.log(df)
lamdaList.append(lamda)
#if t % 1000 == 0: # Print the Lyapunov Exponent as you go
# print t, " ", .5*sum(lamdaList) / (len(lamdaList)+.0) / h
t = t + 1
changeInTime += h
lyapunov = .5*sum(lamdaList) / (len(lamdaList)+.0) / h
print lyapunov

# Output the x-component to a file
f = open("rossler-x", "w")
i = 0
while i < len(dataX0):
f.write(str(dataX0[i])+"\r")
i = i + 1
f.close()

Artificial neural networks can be applied to the delayed Henon map[1] and shown to replicate the sensitivities[2] of the map surprisingly well. Models such as neural networks have a rich history with numerous resources available that describe there use in tasks that range from automated driving to medical diagnosis.

The network I will describe is much simpler and only estimates the sensitivities of the delayed Henon map. This network is a single-layer feedforward network that is optimized on next-step prediction. The network, shown below, involves a layer of inputs that connect to a single layer of hidden nodes with some weight . The weighted inputs are then transformed by an activation function, in this case a hyperbolic tangent, within each node and the output is the sum of these hidden node values weighted by . The network shown schematically,

The weights, and are an n x d matrix and n x 1vector of real numbers, respectively. The 1 is a bias term that shifts neuron’s value around without being tied to an input. The neural network can be represented by,

where d is the embedding dimension or number of inputs in the network and n is the number of neurons in the hidden layer. The network is trained on input data by altering the weights to better fit the target data . For the delayed Henon map, we feed d sequential points from the time series into the network and associate the target as the next point in the time series. The network is trained to fit that next point.

There are numerous network topologies, training methods, and error functions that one can use. One method we use is similar to simulated annealing and hill climbing. In this case, we search a neighborhood of potential solutions with the chance to randomly search a more distant one. If a good solution is found, we move to its neighborhood and start searching again. We slowly shrink the neighborhood size as training progresses to help home in on a good solution. A good solution is one that minimizes the average one-step mean-square distance between predictions from the neural network and the actual data ,

Since the neural network model equations are known, we can easily analyze the sensitivity of a network trained on experimental data such as the delayed Henon map. Following the same procedure detailed in Delayed Henon Map Sensitivities, we can take the partial derivative of the function with respect to each of the inputs j,

We could also use a numerical partial derivative instead by perturbing each input one by one and averaging the change in output through the time series.

After training one neural network, with 4 neurons and 5 dimensions, on 512 points from the delayed Henon map with a delay set to four, the following training error, e=8.4 x 10^{-6}, and sensitivities were found,

where S(j) represents the sensitivities for time lag j. The delayed Henon map with a delay of four has the following sensitivities,

S(1) = 1.8980
S(4) = 0.1

The neural network estimates the sensitivities fairly well and one could probably train longer to obtain more accurate sensitivities.

One question we asked in this blog series concerned the inverted delayed Henon map. There are two approaches that we could take to find the sensitivities of the inverted map, (1) invert the neural network trained on the righted map or (2) train a network on the inverted map data.

It is not easy to invert a neural network though there are many papers about training a network on data in a way similar to the original training method. Ref #3 highlights how one would do this by training the network on inputs using gradient descent. However, we can find the sensitivities of the inverted delayed Henon map by simply training on data taken from the righted map and reversing the entire time series. After training a network with the same number of neurons and dimensions as described above, we arrive at (e=0.001572),

The inverted delayed Henon map has the following sensitivities calculated here,

S(3)=18.9795
S(4)=10

As we see from the difference in the righted and inverted trained networks, sensitivity accuracy varies. One idea is that the accuracy of the training error is correlated to the error in the sensitivities but I am unaware of any literature exploring this.

With this training method, we do not know when a network is optimized so there is a trade off in accuracy and time spent training the neural network. This particular example trained for several hours. If time is an issue, you could easily trade out the neural network with another model such as Support Vector Regression.

Once the model has been optimized on the data, you could take the partial derivatives of the finalized equation with respect to each of the inputs or perform a numerical partial derivative. In this case, it is also important to calculate the same perturbation in each of the time lags of the original system. For the delayed Henon map with 512 points, this changes the sensitivities to 1.90594 for the first delay and .10000 for the d-th delay.

If you try other models, I would like to hear about it.

A neural network model and simple mathematical systems such as the delayed Henon map help us approach complex systems such as the weather, politics, or economics. We can explore simple systems like the delayed Henon map and look at interesting properties that these systems possess such as the sensitivity of the output to each of the inputs. Aside from this property, we could analyze trained neural networks to see if they replicate the Kaplan-Yorke dimension, fractal dimension, or many others. Creating algorithms that estimate these properties fairly well on simple systems may help us to understand more complex phenomena in the future.

References:

Sprott JC. High-dimensional dynamics in the delayed Hénon map. Electron J Theory Phys 2006; 3:19–35.

Maus A. and Sprott JC. Neural network method for determining embedding dimension of a time series. Comm Nonlinear Science and Numerical Sim 2011; 16:3294-3302.

offers insight into the high-dimensional dynamics through its adjustable d parameter. In the previous post, Delayed Henon Map Sensitivities, we looked at the sensitivity of the output to perturbations in each of the time lags of this map using partial derivatives. Since this function is known, it is simple to determine the lag space as well as the embedding dimension for the system. However, as we will see later, we can use this method to find the lag space and embedding dimension for unknown systems using artificial neural networks. Some interesting questions arise when you analyze neural networks trained on the delayed Henon map and its inverted counterpart so we will first look at the inverted delayed Henon map.

The delayed henon map can be inverted quite easily by separating the time delayed form into two equations such as the following,

After a little algebra, you get

Finally, we replace with to obtain the inverted map,

After inversion, the map becomes a repellor and any initial condition on the original attractor will wander off to infinity. Since we cannot directly estimate the sensitivities from this function we can calculate a time series from the righted delayed Henon map and feed that data into the partial derivative equations of the inverted delayed Henon map. Using this process on 10,000 points from the righted delayed Henon map (the first 1,000 points removed) we obtain the following sensitivities,

Here are a few other maps, their sensitivities, their inverted equations, and those sensitivities:

Original Henon map [1]

Equation

Sensitivities

Inverted Equation

Inverted Sensitivities

Discrete map from preface of Ref. #2

Equation

Sensitivities

Inverted Equation

Inverted Sensitivities

References:

Henon M. A two-dimensional mapping with a strange attractor. Commun Math Phys 1976;50:69–77.

Sprott JC. Chaos and time-series analysis. New York: Oxford; 2003.

What previous days’ weather affected today’s weather? Historically, which states’ votes affected the outcome of different presidential elections? How does a single trade affect the price of a stock?

Modelling the weather, politics, and economics would be a very difficult task but we can explore questions like this in less complex mathematical systems such as the delayed Henon map [1]. The delayed Henon map is a time-delayed system represented by following equation,

Since it has an adjustable d parameter, which represents what dimension the function can be embedded in and provides us with a knob to turn to explore high dimensional dynamics. We can explore many different features about this map, the correlation dimension, fractal dimension, Lyapunov exponents, and much more but our primary focus will be looking at the sensitivities for this map. If you are interested in more information about this map, please consult Sprott 2005, a full reference is below. For the rest of this post, I will be using d=4. As we will see, this is more than adequate for the analysis I will show, but one could easily use d = 1456 if they wanted to. Due to the linearity of , a different choice of d will arrive at almost the same results.

So, how are the three questions posed above tied together? They all attempt to answer a common question. What is the sensitivity of each time lag in a function of the system [2]?

For the delayed Henon map, this would be akin to asking, how does affect . We can infer this by taking the partial derivative of with respect to each time lag. Using a partial derivative is like asking, if I vary just slightly, how will change. For the obviously non-zero time lags that would be,

To accurately determine how much the output of the function varies when each time lag is perturbed, we need to find the mean of the absolute values of the partial derivatives around the attractor. Thus we have,

Since the delayed Henon map has a chaotic attractor and the values of vary, you can estimate the value of the sensitivities for 10,000 iterations of the time series. Initializing the map with a vector such as [.1, .1, .1, .1], we get the following strange attractor (with the first 1,000 iterations removed),

Delayed Henon Map with 9,000 points (d=4)

We arrive at the following sensitivities for d=4,

By estimating a system’s sensitivities, we can determine what is known as the lag space [3]. Dimensions with non-zero sensitivities make up this space and the largest dimension with a non-zero sensitivity also determines the embedding dimension. As we will see in another post, a neural network method can be devised to find the lag space of various time series.

References:

Sprott JC. High-dimensional dynamics in the delayed Hénon map. Electron J Theory Phys 2006; 3:19–35.

Maus A. and Sprott JC. Neural network method for determining embedding dimension of a time series. Comm Nonlinear Science and Numerical Sim 2011; 16:3294-3302.

Goutte C. Lag space estimation in time series modelling. In: Werner B, editor. IEEE International Conference on Acoustics, Speech, and SignalProcessing, Munich, 1997, p. 3313.