In [1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import pandas as pd

%pylab inline
Populating the interactive namespace from numpy and matplotlib

In this problem set, you will explore some simple embedding algorithms, using position-versus-time data gathered from a real driven pendulum. I have posted three data sets on the class webpage; see the PS8 entry on that page for directions (and a clickable link) to these data. In all three runs, the angle was measured every $\Delta t$ seconds using an optical encoder with a resolution of $0.4$ degree. The drive amplitude was fixed; the drive frequency (the bifurcation paramter) was different for each data set.

$\Rightarrow$In data1, the drive was turned off.

$\Rightarrow$In data2, the drive is on, with a medium frequency.

$\Rightarrow$In data3, the drive is on, with the same amplitude but a higher frequency.

Each file captures a single trajectory of the driven pendulum. Each line of each file represents a single time-sample of the pendulum's angular position. Each data point looks like this:

$$ \theta \quad \text{time}$$

...where time is in seconds and $\theta$ is $\mod 2\pi$. Depending on when I hit the reset button, $\theta$ may contain an offset, so "$\theta = 0$" may not be "vertical." Also, note that the sampling rate was different; data1 and data3 were sampled at $\Delta t = 0.001$ seconds and data2 at $\Delta t = 0.002$ seconds.

The time base and thus the sampling interval in the data acquisition channel may not have been quite uniform. Together with the finite precision of the angle sensor, this has two important implications:

$\Rightarrow$ Any $\omega$s that you reconstruct using divided differences from the $\theta$ and time data may be innaccurate. This is problem 1.

$\Rightarrow$ Nonuniform sampling violates the conditions of the Takens theorem, so any attractors constructed via embeddings of these data are not true diffeomorphic copies of any attractor that may exist in the system... but they're pretty close. We can mitigate this by using embedding intervals that are much larger than the experimental sampling interval (or by interpolation, if we knew exactly how far off our sampling interval was).

Initial bookkeeping, let's import the data now. Running on my personal computer (i7-4710HQ (2.50GHz 1600MHz 6MB), 16G RAM), pandas can easily handle the combined 10M.

In [2]:
data1 = pd.read_csv('ps8data/data1', names=['angle', '', '', 'time'],
                    sep=' ')[['angle', 'time']]
data2 = pd.read_csv('ps8data/data2', names=['angle', '', '', 'time'],
                    sep=' ')[['angle', 'time']]
data3 = pd.read_csv('ps8data/data3', names=['angle', '', '', 'time'],
                    sep=' ')[['angle', 'time']]

Problem 1

Write a program that steps through a data file, constructs values for $\omega$ using divided differences - first-order forward is good enough, but you may use something smarter if you want - and plots the results in state-space form, with $\theta \mod 2\pi$

Note that if data are oversampled - that is, if the sampling rate is much faster than the device's dynamics, as is the case in the data that you're working with here - you have to be a little careful about the divided difference formulae. In particular, you're going to need to downsample the data in order to get a sensible plot. The choice of downsampling rate is part of the thinking part of this problem.

Apply this program to data1 and turn in a plot. Since the drive is off, this plot should be a clean spiral, why? Please comment on what it really looks like, as well as on possible causes for this.

We do have to downsample for the reasons mentioned above. This downsampling rate will be determined by a downsampling parameter.

In [125]:
downsample = 100

Now we can determine our divided differences with our downsampled data. This will give the slope between every downsample multiple point, the $\omega$ of the system.

In [126]:
divided_differences = ((data1['angle'][downsample::downsample].values -
                        data1['angle'][:-downsample:downsample].values) /
                        (data1['time'][downsample::downsample].values -
                        data1['time'][:-downsample:downsample].values))

Finally, we plot. Both the scatter as well as the connected points have been plotted in order to clarify exactly what is happening. This should be a clean spiral since in a real-world, damped pendulum, the pendulum arm will slowly lose momentum and oscillate at smaller and smaller angles.

In [566]:
fig, axarr = plt.subplots(1, 2, figsize=(12, 6))
axarr[0].plot(data1['angle'][downsample::downsample].values, divided_differences)
axarr[1].scatter(data1['angle'][downsample::downsample].values, divided_differences, s=5)
axarr[0].set_xlabel(r'$\theta$')
axarr[0].set_ylabel(r'$\omega$')
axarr[1].set_xlabel(r'$\theta$')
axarr[1].set_ylabel(r'$\omega$')
plt.show()

We'll note that this is not a perfect spiral, and in fact it is not only tilted but also a somewhat inconsistent inward spiral. This is probably due to a combination of not having a perfectly constructed pendulum as well as compounded measurement error.

Problem 2

Write a program that steps through the data file and embeds the $\theta$ data, producing the corresponding trajectory in reconstruction space. This program should take a time interval $\tau$, a dimension $m$, and indices $j$, $k$ of a pair of axes on which to plot the results. It should produce a list of $m$-vectors (points in reconstruction space) each of whose $i^{th}$ element is $\theta(t + i \tau)$ for $i = 0, \ldots, m - 1$. Finally for each $m$-vector, it should plot the $j^{th}$ element against the $k^{th}$ element, both $\mod 2\pi$.

In [634]:
def embed_data(data, interval, dimension):
    """
    Returns points to embed
    
    data: pandas dataframe (with evenly spaced points)
    interval: (tau) spacing of "comb" (must go evenly into length)
    dimension: (m) dimension of points
    """
    spacing   = int(interval / (data['time'][1] - data['time'][0]))
    # Establish return array in memory
    embedding = np.zeros((len(data) - spacing * (dimension - 1),
                          dimension))
    for d in range(dimension):
        end             = spacing * (-dimension + d + 1) or None
        start           = d * spacing
        embedding[:, d] = data['angle'][start:end].values
    return embedding

def embedding(data, interval, dimension, j, k):
    """
    Plot embedding of data
    
    returns: figure
    """
    if not ((j < dimension) and (k < dimension)):
        print('j and k must be less than the dimension.')
        print('[{}, {}] >= {}'.format(j, k, dimension))
    points = embed_data(data, interval, dimension)
    fig    = plt.figure(figsize=(50, 50)) # Need to make large in order to see behavior
    ax     = fig.add_axes([0.1, 0.1, 0.8, 0.8])
    ax.scatter(points[:, j], points[:, k], s=1)
    return fig, ax, points

(a) Run your embedding program on the data2 set with $\tau = 0.15s$ and $m = 7$. Plot the zeroth elemet of the reconstructed state vector - $\theta (t)$ - on the vertical axis and the second on the horizontal axis ($j, k = 0, 2$). What kind of attractor is this?

In [632]:
fig, ax, data = embedding(data2, 0.15, 7, 2, 0)
plt.savefig('out.png')
plt.show()