### 4 Site statistics

For the site-typological analysis of the Michelsberg phase, 23 culturally clean sites are available in the core region of Venray. These data are used to decide how to proceed analytically. The artefact composition of the sites, dated by the presence of Michelsberg pottery, macrolithic artefacts, pointed blades and/or three types of points with invasive surface retouch, is given in Table 2. It is difficult to recognize different types of settlement in this table at a glance; the low numbers of artefacts and the numerous empty cells contribute to this.

A suitable technique for organizing this information is seriation (Herzog & Scollar 1987). By rearranging the rows or columns, the structure of the data (for instance sites with more points, tools or flint working) can be expressed more clearly. Multivariate statistical techniques, such as cluster, factor or correspondence analysis, also seem suited to this problem; after all, several artefact types together determine the character of a site. A thorough knowledge of the applicability, preconditions and limitations of the various mathematical techniques is essential, however. In addition, the wide range of statistical techniques often leads to numerous different procedures being tried on the same data set (Houtsma et al. 1996). Of all the available results we then very often select the one that best fits the picture of the structure in the data that was already in the back of our minds, on the basis of the collection of the data, our archaeological experience, simple counts and diagrams.
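The idea of seriation as a reordering of rows and columns can be illustrated with a minimal Python sketch. It uses a simple barycentre heuristic (each row is sorted by the mean position of its non-zero cells, then each column likewise, and this is repeated); this is an assumption chosen for brevity, not the procedure of Herzog & Scollar (1987), and it is not guaranteed to find the best ordering. The function names and the small matrix are hypothetical and are not taken from Table 2.

```python
def weighted_mean_position(values, position):
    """Mean position of the non-zero entries, weighted by their counts."""
    total = sum(values)
    if total == 0:
        return float("inf")  # empty rows or columns are moved to the end
    return sum(v * position[i] for i, v in enumerate(values)) / total


def seriate(matrix, rounds=10):
    """Return (row_order, col_order) that pull non-zero cells towards the diagonal."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_order = list(range(n_rows))
    col_order = list(range(n_cols))
    for _ in range(rounds):
        # Sort rows by the average position of the columns they occupy.
        col_pos = {c: p for p, c in enumerate(col_order)}
        row_order.sort(key=lambda r: weighted_mean_position(matrix[r], col_pos))
        # Sort columns by the average position of the rows they occupy.
        row_pos = {r: p for p, r in enumerate(row_order)}
        col_order.sort(
            key=lambda c: weighted_mean_position(
                [matrix[r][c] for r in range(n_rows)], row_pos
            )
        )
    return row_order, col_order


# Hypothetical presence matrix: 3 sites (rows) by 4 artefact types (columns).
toy = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]
row_order, col_order = seriate(toy)
reordered = [[toy[r][c] for c in col_order] for r in row_order]
for row in reordered:
    print(row)
```

After reordering, the non-zero cells of the toy matrix form a band along the diagonal, which is the kind of structure one hopes to reveal in a site-by-artefact table.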

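Table 2 Michelsberg sites, artefact numbers

| SITE | LOC | Counts per artefact type present (column order: DXH, DRUP, BLAD, SPIT, MAC, KRB, BOR, NOT, RET, BYL, AARD, MAAL, KLOP, KERN, OVER) | TOT | VAR |
|---|---|---|---|---|
| 52A- 56 | | 1 | 1 | 1 |
| 52B- 48 | | 1 | 1 | 1 |
| 52B- 80 | | 1 | 1 | 1 |
| 52B- 85 | | 1 | 1 | 1 |
| 52B-108 | | 1, 1 | 2 | 2 |
| 52B-111 | | 1, 1 | 2 | 2 |
| 52B-165 | | 1, 2 | 3 | 2 |
| 52B-168 | | 1, 2, 4 | 7 | 3 |
| 52B-192 | v | 1, 28 | 29 | 2 |
| 52B-217 | | 1 | 1 | 1 |
| 52B-223 | v | 1, 3, 2 | 6 | 3 |
| 52C- 44 | | 1 | 1 | 1 |
| 52D- 5 | v | 1 | 1 | 1 |
| 52E- 31 | | 1, 14, 11, 1, 2, 12, 4, 2, 30, 820 | 897 | 10 |
| 52E-121 | | 1, 1 | 2 | 2 |
| 52E-126 | | 1, 1 | 2 | 2 |
| 52E-141 | | 1 | 1 | 1 |
| 52E-150 | | 1, 3, 4, 2, 2, 9, 14, 5, 50 | 90 | 9 |
| 52E-171 | | 1 | 1 | 1 |
| 52E-188 | | 3, 10 | 13 | 2 |
| 52E-201 | | 1 | 1 | 1 |
| 52E-217 | | 1, 1 | 2 | 2 |
| 52E-223 | | 1, 2 | 3 | 2 |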

where:
SITE=site number, LOC=location (v=field co-ordinates), DXH=triangular arrowhead with invasive surface retouch, DRUP=teardrop arrowhead, BLAD=leaf-like arrowhead, SPIT=pointed blade, MAC=macrolithic artefact, KRB=scraper, BOR=borer, NOT=notch, RET=retouched blade or flake, BYL=axe, AARD=pottery, KLOP=hammer stone, KERN=core, OVER=other artefacts, TOT=total number of artefacts, VAR=number of different artefact types

Graphical visualisation of numerical information therefore seems to be a useful alternative. Instead of reducing the variation in our archaeological data to one or two mathematical parameters, the actual spread is judged visually. Exploratory Data Analysis (EDA) provides tools for this (Tukey 1977), such as box-and-whisker plots for univariate information and Tukey stars for multivariate data. The latter seem eminently suitable for increasing our insight into the site typology.
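To make the idea of a star for multivariate data concrete, the following Python sketch computes the geometry of a star glyph as it is commonly drawn: one ray per variable, evenly spaced around a centre, with the ray length equal to the value. It only returns the ray tips and leaves the drawing to any plotting tool. The function name `star_vertices` and the example values are hypothetical, and the sketch is not claimed to reproduce the exact construction in Tukey (1977).

```python
import math


def star_vertices(values, centre=(0.0, 0.0)):
    """Return one (x, y) ray tip per variable, evenly spaced around the centre."""
    n = len(values)
    cx, cy = centre
    return [
        (
            cx + v * math.cos(2 * math.pi * k / n),
            cy + v * math.sin(2 * math.pi * k / n),
        )
        for k, v in enumerate(values)
    ]


# Hypothetical values for four artefact types at one site.
example_site = [1, 2, 0, 3]
tips = star_vertices(example_site)
for value, (x, y) in zip(example_site, tips):
    print(f"value {value}: ray tip at ({x:.2f}, {y:.2f})")
```

Connecting the ray tips of each site gives one polygon per site, so that sites with a similar artefact composition show similar shapes.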

First of all, however, we should pay some attention to the data we want to display graphically. Because the numbers of artefacts in the table are small, these data have little stability: a new visit to a site may change the numbers significantly. As a result the percentages are not very robust either, which makes it necessary to reduce the data to a lower level of measurement. Reduction is possible to the ordinal level (more production debris than scrapers, and more scrapers than macrolithic artefacts) or to the nominal level (production debris, scrapers and macrolithic artefacts present; borers and pottery absent). We feel that presence/absence data leave us too little room for a site-typological analysis. We therefore prefer to reduce the numbers to ordinal classes, with class intervals that widen as the absolute number increases. Such a progressive classification results in classes that may be referred to as accidental (1-3 finds), very few (4-15), few (16-42), and so on through numerous (422-614) and very numerous (615-857) to abundant (858-1157).
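The class limits quoted above can be turned into a small lookup. As an observation, and not something stated in the text, every class limit quoted here coincides with rounding the cube root of the count to the nearest whole number: class n ends at the largest count below (n + 0.5)^3. The Python sketch below assumes that rule in order to fill in the classes the text skips; the function name `progressive_class` is hypothetical.

```python
def progressive_class(count: int) -> int:
    """Ordinal class of an artefact count, assuming the rounded cube-root rule."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0  # artefact type absent
    n = 1
    # count stays in class n while count < (n + 0.5)**3,
    # written with integers as 8 * count < (2n + 1)**3.
    while 8 * count >= (2 * n + 1) ** 3:
        n += 1
    return n


# Class ranges quoted in the text, used here to check the assumed rule.
QUOTED_RANGES = {
    1: (1, 3), 2: (4, 15), 3: (16, 42), 4: (43, 91),
    8: (422, 614), 9: (615, 857), 10: (858, 1157),
}
for cls, (low, high) in QUOTED_RANGES.items():
    assert progressive_class(low) == cls and progressive_class(high) == cls

# Counts of sites 52B-168 and 52B-192 from Table 2.
print([progressive_class(c) for c in (1, 2, 4)])
print([progressive_class(c) for c in (1, 28)])
```

Integer arithmetic is used instead of a floating-point cube root so that counts on a class boundary cannot be misplaced by rounding error.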

On a small site, for example, scrapers and macrolithic artefacts occur accidentally and there is little production debris. Such a progressive classification provides enough stability that a new visit will not cause any dramatic changes in the class values. At the same time, rare artefact types such as scrapers become almost as important as frequently occurring artefact types such as flakes. The progressive classification used here is closely related to the use of logarithms, but archaeologically it has more appeal.
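The stability argument can be illustrated with a short Python sketch. It takes the three counts of site 52B-168 from Table 2 (1, 2 and 4), adds a single find to the first type as a purely hypothetical revisit, and compares percentages with class values. Only the first four class limits stated in the text (3, 15, 42, 91) are used; the names `UPPER_LIMITS`, `class_value` and `percentages` are assumptions made for this example.

```python
from bisect import bisect_left

# Upper limits of classes 1 to 4 as stated in the text.
UPPER_LIMITS = [3, 15, 42, 91]


def class_value(count: int) -> int:
    """Class 1-4 for counts of 1-91; 0 for an absent artefact type."""
    if count == 0:
        return 0
    if not 1 <= count <= UPPER_LIMITS[-1]:
        raise ValueError("count outside the classes covered by this sketch")
    return bisect_left(UPPER_LIMITS, count) + 1


def percentages(counts):
    """Share of each artefact type in the site total, in per cent."""
    total = sum(counts)
    return [100 * c / total for c in counts]


before = [1, 2, 4]  # site 52B-168, Table 2
after = [2, 2, 4]   # hypothetical revisit: one extra find of the first type

shares_before = percentages(before)
shares_after = percentages(after)
classes_before = [class_value(c) for c in before]
classes_after = [class_value(c) for c in after]

print([round(s, 1) for s in shares_before], classes_before)
print([round(s, 1) for s in shares_after], classes_after)
```

In this example the share of the first artefact type nearly doubles after a single extra find, while all three class values stay the same.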

A graphic analysis of the artefact counts, reduced to ordinal classes, is the methodological road we want to take. In doing so we accept that the results will be more intuitive and statistically less verifiable, with fewer significant and more indicative differences.

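Table 3 Michelsberg sites, progressive class values

| SITE | LOC | Class values per artefact type present (column order: DXH, DRUP, BLAD, SPIT, MAC, KRB, BOR, NOT, RET, BYL, AARD, MAAL, KLOP, KERN, OVER) |
|---|---|---|
| 52A- 56 | | 1 |
| 52B- 48 | | 1 |
| 52B- 80 | | 1 |
| 52B- 85 | | 1 |
| 52B-108 | | 1, 1 |
| 52B-111 | | 1, 1 |
| 52B-165 | | 1, 1 |
| 52B-168 | | 1, 1, 2 |
| 52B-192 | v | 1, 3 |
| 52B-217 | | 1 |
| 52B-223 | v | 1, 1, 1 |
| 52C- 44 | | 1 |
| 52D- 5 | v | 1 |
| 52E- 31 | | 1, 2, 2, 1, 1, 2, 2, 1, 3, 9 |
| 52E-121 | | 1, 1 |
| 52E-126 | | 1, 1 |
| 52E-141 | | 1 |
| 52E-150 | | 1, 1, 2, 1, 1, 2, 2, 2, 4 |
| 52E-171 | | 1 |
| 52E-188 | | 1, 2 |
| 52E-201 | | 1 |
| 52E-217 | | 1, 1 |
| 52E-223 | | 1, 1 |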

where:
SITE=site number, LOC=location (v=field co-ordinates), DXH=triangular arrowhead with invasive surface retouch, DRUP=teardrop arrowhead, BLAD=leaf-like arrowhead, SPIT=pointed blade, MAC=macrolithic artefact, KRB=scraper, BOR=borer, NOT=notch, RET=retouched blade or flake, BYL=axe, AARD=pottery, KLOP=hammer stone, KERN=core, OVER=other artefacts
in the following classes:
1=1-3, 2=4-15, 3=16-42, 4=43-91 and 9=615-857 artefacts