- acknowledge use of this material (if you find a use for it);
- let me know via email any bugs or suggestions for improvement.

The figure below shows a typical screen shot of the univariate
demonstration software. This demo is invoked by typing **kdedemo1**
at the MATLAB prompt. Following the figure are some suggestions
of things to try using the demo.

*Figure 31: One dimensional KDE demonstration*

1. Click on the **Data** menu with the Left Mouse Button (LMB).
Choose another Dataset. The example datasets (not all archaeological!) are as follows:

(a) **Suicide Times:** the length of treatment spells (in days)
of 86 control patients in a suicide study;

(b) **Old Faithful:** 107 measurements of the eruption length
(in minutes) of the Old Faithful geyser in Yellowstone National
Park, USA;

(c) **Buffalo Snowfall:** Taken from Silverman (1986). 63 measurements
of total yearly snowfall (inches) taken at Buffalo, USA;

(d) **Pot Diameters:** the rim radii of 81 Danish neolithic
pots;

(e) **Hairpin Lengths:** the length of 224 Romano-British hairpins
from southern Britain;

(f) **Cup Diameters:** the rim diameters of 60 Bronze Age cups
from Italy;

(g) **Nickel:** The percentage nickel content of 361 French
medieval glass fragments;

(h) **Manganese:** The percentage manganese content of 361
French medieval glass fragments;

(i) **N(0,1):** 500 observations from a standard normal density
(randomised);

(j) **NMD:** 500 observations from the Normal Mixture Density
used by Wand and Jones (1995) (randomised);

(k) **Small:** A small dataset of 7 observations which can
be used in conjunction with the "Add bumps" button to
show how KDEs are formed as the sum of bumps.

2. **Kernel selection**: The default kernel function is the
normal probability density function. To change the kernel, click
with the LMB on the small downward pointing arrow to the right
of the word "Normal". A menu will appear. To select
another kernel function simply click on the name of the kernel
function. To see the shape of the kernel function choose the
"Small" dataset and add the bumps.

*Figure 32: A KDE as a sum of "bumps" (Laplace
kernel)*
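The "sum of bumps" idea can also be reproduced at the MATLAB prompt. The following is a minimal sketch, not the demo's own code: the seven observations, the value of h and the variable names are all made up for illustration, and the normal kernel is used.

```matlab
% Seven made-up observations (NOT the demo's "Small" dataset).
x = [1.2 1.9 2.3 3.1 4.0 4.4 6.5];
n = numel(x);
h = 0.6;                                  % smoothing parameter, chosen by eye

% Normal (Gaussian) kernel function.
K = @(u) exp(-0.5 * u.^2) / sqrt(2*pi);

% Evaluation grid extending 4h beyond the data at each end.
t = linspace(min(x) - 4*h, max(x) + 4*h, 400);

% One row per observation: each row is a "bump" of area 1/n centred on x(i).
bumps = zeros(n, numel(t));
for i = 1:n
    bumps(i, :) = K((t - x(i)) / h) / (n * h);
end

% The KDE is the pointwise sum of the bumps.
f = sum(bumps, 1);

% The estimate should integrate to approximately 1.
area = trapz(t, f);
fprintf('Area under the KDE: %.4f\n', area);
```

Typing `plot(t, bumps, ':', t, f, '-')` afterwards draws the individual bumps as dotted lines with the KDE on top, in the spirit of Figure 32.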

3. **Adaptive KDEs**: You can turn the adaptive method on and
off by clicking in the box to the left of the word "Adaptive".
If the box is empty the adaptive method is NOT used. Again,
choosing "Add bumps" will show how the Adaptive KDE
is formed.

4. **Selecting the value of h**: To vary the value of the smoothing
parameter h, click somewhere on the numeric value of h (in the
box at the top right of the demo). A small cursor will appear.
You can now delete the current value and type in your own value
using the keyboard. Try increasing or decreasing the value from
that which is displayed above the KDE plot. For example, for
the Cup diameters data, the default value (calculated using a
normal scale rule) is 2.46, so try values of h=3, 2, 1.5, and
1 to see what effect the smoothing parameter has. Note that if
the data are multimodal then the normal scale value of h will
tend to oversmooth. In such cases it is particularly important
to reduce the value of h.

5. **Automatic selection of h**: Many methods of automatically
selecting an "optimal" value of h have been developed.
Several of these have been implemented and are available to try
within the demo. The default method of h selection is the normal
scale rule, which will tend to oversmooth non-normal data. Clicking
on the small arrow to the right of "Normal scale" will
reveal a list of other h selection strategies, including Direct-Plug-In
(DPI), Solve-The-Equation (STE) and various Cross-Validation (CV)
methods. At this time, the methods based on Cross-Validation
are rather slow; hopefully this will be improved in the future.
While there is no overall "best" method of automatic
h selection, Wand and Jones (1995) suggest that the STE method
offers good overall performance.

6. **Canonical Kernels and fixing the value of h**: Different
Canonical kernels give similar KDEs for the same value of h.
The same is not true of standard kernels. This can be seen by
fixing a value of h, turning Hold on, and plotting several KDEs
with different kernel functions. With standard kernels very different
KDEs will result. However, using canonical kernels the resulting
KDEs should all appear roughly the same.

7. **Producing multiple plots and changing the print style and colour**:
Say you wanted to compare the KDEs obtained using different values
of h by plotting them on the same axes. To do this, obtain the
first plot (for example use h=1 for the Cup diameters data) then
click in the box to the left of the word "Hold". Subsequent
plots will now be added to the same axes allowing comparisons
to be made directly. To see this, try changing the value of h
to 2. You can also change the plot style of the *next* plot
by selecting an entry from the "Print style" menu.
This is useful when a colour printer is not available. A recent addition (not illustrated on the figures in this document) is a "Colour" menu, which allows you to choose the colour of the *next* plot.

8. **Plotting Histograms**: This can be achieved using the
histogram controls in the demo window.

Clicking the check box turns the histogram plot on and off, while you can alter the number of bins by entering a new value in the appropriate place. Note that the histogram is added to the current plot.

9. **Sampling**: You can take samples from a dataset and study
the difference between the KDE of the sample and the KDE of the
whole dataset using the sampling controls in the demo window.

Ssize controls the size of the sample, while the check box selects sampling with or without replacement. The default is sampling without replacement. Using this feature with "Hold" turned on lets you plot multiple KDEs on the same axes for comparison.

10. **Exiting**: Click on the "End" button to exit.
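The effect of h described in points 4 and 5 can be explored outside the demo as well. The sketch below assumes a common form of the normal scale rule for the normal kernel, h = (4/(3n))^(1/5) × (sample standard deviation), as given by Silverman (1986) and Wand and Jones (1995); whether the demo uses exactly this scale estimate is not stated here. The bimodal sample and all variable names are made up for illustration.

```matlab
% Made-up bimodal sample (NOT one of the demo datasets).
x = [1.0 1.2 1.5 1.7 2.0 2.1 6.0 6.3 6.5 6.8];
n = numel(x);

% Assumed normal scale rule for a normal kernel.
h_ns = (4 / (3*n))^(1/5) * std(x);

% Normal kernel and a fixed-h KDE evaluated at the points in t.
K   = @(u) exp(-0.5 * u.^2) / sqrt(2*pi);
kde = @(t, h) sum(K(bsxfun(@minus, t(:)', x(:)) / h), 1) / (n * h);

% Try a smaller value, the normal scale value and a larger value of h.
hs = [0.3 1 3] * h_ns;
t  = linspace(min(x) - 3*max(hs), max(x) + 3*max(hs), 400);

nmodes = zeros(size(hs));
for k = 1:numel(hs)
    f = kde(t, hs(k));
    d = diff(f);
    % Count grid points where the estimate stops rising and starts falling.
    nmodes(k) = sum(d(1:end-1) > 0 & d(2:end) <= 0);
    fprintf('h = %.3f: %d local maxima on the grid\n', hs(k), nmodes(k));
end
```

Counting local maxima is only a crude summary, but it makes the oversmoothing visible: as h grows, separate modes in the data merge into one.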

The figure below shows a typical screen shot of the bivariate
demonstration software. This demo is invoked by typing **kdedemo2**
at the MATLAB prompt. Following the figure are some suggestions
of things to try using the demo. The default dataset is "Cup
Diameter/Height". The five datasets supplied are:

(a) **Glass composition**: Na2O and MgO composition of 361
samples of French medieval glass.

(b) **Artifact location**: x and y co-ordinates describing
the location of 276 bone splinters (Mask site data).

(c) **Cup diameter/height**: Neck diameter and height of 60
Bronze Age cups from Italy.

(d) **Leicester/Mancetter**: The first two components of a
PCA based upon chemical composition of 105 specimens of Romano-British
waste glass.

(e) **N(0,1)**: 500 observations from the bivariate standard
normal density.

*Figure 33: Two-dimensional KDE demonstration*

1. **Data, Kernel, Adaptive, Hold, End** and **Help** functions
are similar to those in the univariate demo and are not described
further here. Note that "Hold" only works for 2D plots
such as contours and percentage contours.

2. **Altering the angle of view**: Two sliders perform this
function. You can alter the viewing angle to the left and right,
or up and down. An overhead view is often quite useful and is
obtained by sliding the vertical slider all the way to its highest
position.

3. **Altering the colour scheme**: Many different colour schemes
are provided via the Colour menu. Use this in the same way as
the Data menu. If you don't like using colour, select "white".

4. **Smoothed shading**: Clicking on the check box to the left
of the word "Shading" causes the 3D plot to be smoothed.
This can be quite effective when viewed from above (see point
2 above). Clicking the box again turns off the shading.

5. **Mesh Granularity**: By clicking here you can alter the
size of the grid on which the KDE is plotted. (Type in a new
value.) We have used a default value of 32 as a compromise between
speed and realism. Increasing this value to say 64 will result
in a more pleasing image, but longer computing times. The minimum
value supported is 8.

6. **Altering the value of h**: This works in the same way
as for the univariate demo, except that as the data are bivariate,
two smoothing parameters are used, one each for the x and y directions.
Click here to alter these values. For example, consider the
Cup data. The normal scale values of the smoothing parameters
are 2.379 and 0.5091. To change these to, say, 2 and 1 respectively,
enter **[2,1]**. (The square brackets are necessary; the comma
is optional, so you can use a space instead.) Entering a single
number, e.g. 1.5, sets *both* smoothing parameters to that
value. Automatic selection of the h values is again supported
via the appropriate pop-up menu, though the approach taken is
one of calculating separate h values for each variable using univariate
techniques.

7. **Contouring**: A contour plot of the KDE can be obtained
by selecting "Contour" from the pop-up menu in the top
left corner of the figure. To return to a Surface view, select
"Surface". You can add a scatter plot of the data to
the contour plot by clicking on the "Scatter" check
box.

8. **Percentage Contouring**: Because this technique requires
estimates of the height of the KDE function at all the data points,
it can be time consuming on slower PCs or for large datasets.
When the calculation has been performed a percentage contour
plot is presented. To change the position of the contour lines,
use the Contour %'s box in the lower left of the figure. For
example, if you wanted to draw the 50 and 100% contours, enter
**[50,100]**. To draw the 10%, 20%, ..., 90% contours enter
**[10:10:90]**.
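To make the bivariate calculations more concrete, here is a minimal sketch of a product-kernel KDE with separate smoothing parameters in the x and y directions, followed by percentage contour levels. It is not the demo's own code. It assumes that the p% contour is drawn at the height exceeded by p% of the observations (the convention used by the demo is not stated here), and the data, h values and variable names are made up for illustration.

```matlab
% Made-up bivariate sample, one observation per row (NOT a demo dataset).
data = [1.0 2.0; 1.5 2.4; 2.0 1.8; 2.2 2.9; 2.8 2.5; 3.0 3.4; ...
        3.3 2.7; 3.9 3.8; 4.1 3.1; 4.6 4.2; 5.0 3.6; 7.5 1.0];
n = size(data, 1);
h = [0.8 0.5];                 % smoothing parameters for x and y
K = @(u) exp(-0.5 * u.^2) / sqrt(2*pi);

% KDE on a 32-by-32 grid (32 is the demo's default mesh granularity).
m  = 32;
gx = linspace(min(data(:,1)) - 3*h(1), max(data(:,1)) + 3*h(1), m);
gy = linspace(min(data(:,2)) - 3*h(2), max(data(:,2)) + 3*h(2), m);
Kx = K(bsxfun(@minus, gx(:), data(:,1)') / h(1));    % m-by-n
Ky = K(bsxfun(@minus, gy(:), data(:,2)') / h(2));    % m-by-n
F  = (Ky * Kx') / (n * h(1) * h(2));                 % F(j,i) is the estimate at (gx(i), gy(j))

% Height of the KDE at each data point.
dx   = bsxfun(@minus, data(:,1), data(:,1)') / h(1); % n-by-n
dy   = bsxfun(@minus, data(:,2), data(:,2)') / h(2);
fhat = sum(K(dx) .* K(dy), 2) / (n * h(1) * h(2));   % n-by-1

% Assumed convention: the p% level is the height of the observation
% ranked ceil(p*n/100) when heights are sorted from highest to lowest.
pct    = 10:10:90;
fs     = sort(fhat, 'descend');
levels = fs(ceil(pct * n / 100));
```

The levels can then be passed to MATLAB's `contour` function, for example `contour(gx, gy, F, sort(levels))`, which expects them in increasing order.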

In either the uni- or bivariate case it is possible to import your own data into the demo routines. For example, at the MATLAB prompt typing

**>> data=randn(100,1);**

**>> kdedemo1**

will invoke the uni-dimensional demo with a random sample of 100
data points from a standard normal density. If you have data
stored on disk in ASCII format, use the **load** command (type
**help load** at the MATLAB prompt for details) to import the
data into MATLAB. For use with the KDE demos your data must be
stored in a matrix structure called **data**.
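As a minimal sketch of the import step, the following writes a few made-up numbers to a temporary ASCII file (standing in for a data file you already have on disk) and reads them back into **data**. The file name and values are hypothetical.

```matlab
% Create a small ASCII file, one observation per line.
% In practice this file would already exist on disk.
fname  = [tempname '.txt'];
sample = [12.1; 9.8; 14.3; 11.0; 10.6];
fid = fopen(fname, 'w');
fprintf(fid, '%g\n', sample);
fclose(fid);

% Read the file into the variable name that the demos expect.
data = load(fname, '-ascii');

% Remove the temporary file again.
delete(fname);
```

With **data** defined in this way, typing kdedemo1 at the prompt should start the univariate demo using the imported values.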

Since MATLAB is always running in the background you have a great deal of flexibility in influencing the demo routines. The following show just a couple of the things that are possible.

__Example 1__: Changing the axis limits.

When you have a 2D image in the demo Window (this will occur only
when you're using the Bivariate demo) you can alter the axes
by clicking in the MATLAB command window and using the **axis**
command to define the *x* and *y* axis limits. This
command works like so:

**>> axis([xmin xmax ymin ymax])**

where xmin, xmax, etc. are numerical values separated by spaces.

__Example 2__: Investigating subgroups within data.

The Leicester/Mancetter dataset has been divided into two groups
based upon site of origin. These two groups have been stored in
**lmgp1** and **lmgp2**. At the MATLAB prompt enter

**>> data=lmgp1(:,1:2);**

This assigns columns 1 and 2 of the dataset **lmgp1** to **data**.
Returning to the demo Window, select "Percent Contour"
to see a contour plot of the dataset. Now click on "Hold"
and select a new "Print style" from the appropriate
menu. Returning to the MATLAB command Window type:

**>> data=lmgp2(:,1:2);**

Now return to the demo Window and click on "Normal" from the Kernels pop-up menu. This forces the routine to do a calculation based on the new dataset and the percentage contour will appear on the same axes as that for the first group of the dataset.

*Figure 34: Investigating subgroups within data*

© Internet Archaeology
URL: http://intarch.ac.uk/journal/issue1/beardah/kdehelp.html

Last updated: Tue Sep 10 1996