Peter Foley and Richard Kirk explain how a simple benchmarking
model can assist managers in assessing the consistency of
officer inspection rating scores.
Accurate and reliable management controls are vital to any modern
and progressive inspectorate. The model outlined in this article,
a simple statistical technique, is designed to help managers within
the environmental health profession achieve a precise and meaningful
comparison of officer inspection rating scores and can be used to
supplement existing management systems.
By applying techniques such as this, managers can be better placed
to reduce the "guess work" associated with the inspection
management and verification process, hopefully resulting in a more
transparent and proportional approach.
BACKGROUND
The inspection rating of premises following routine inspections
by officers is an issue that requires continual review and assessment
by managers in order to determine and validate both accuracy and
consistency. Numerous types of visits can be prioritised through
the application of a suitable inspection rating score to determine
the frequency of subsequent visits. As well as the nationally recognised
systems under the Food Safety Act Code of Practice 9 and HELA LAC
67/1 (Revised), many other local schemes operate for work such as
licensing and EPA visits.
Several potential mechanisms can be adopted in order to promote
the consistency of inspection rating, all of which have their place
in a modern inspection regime, including:
desk-top exercises;
accompanied inspections;
post inspection follow-up visits;
random checks on premises records; and
peer review.
There are, however, a number of limitations with all of these
controls which prevent managers from obtaining an overall picture
or from seeing how trends in inspection work may be developing.
Legitimate differences may exist between staff at different levels,
ie between newly qualified and experienced officers or between EHOs
and technical support staff, which the above controls would not
be able to examine. The initial aim of this project was to identify
how individual officers rated premises in comparison with their
peer group. This in turn highlighted the need to objectively and
critically evaluate the whole issue of inspection rating management.
METHODOLOGY
Initially, using data from routine food hygiene inspections, a graph
of each officer's annual inspection scores was constructed, which
enabled a frequency distribution curve to be plotted for each officer.
In order for the curves to be directly comparable, the distribution
took account of the percentage of scores falling within specific
rating bands, rather than merely comparing the number of inspections
in each band. The study was further expanded across two authorities
(A and B) to consider the issues with data obtained from 21 officers
who had undertaken a total of 865 inspections.
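The percentage-based distribution described above can be sketched in
a few lines of Python. This is a minimal illustration only, not the
authors' own tooling: the 10-point band width, the officer names and
the scores are all assumptions made up for the example.

```python
from collections import Counter

# Assumption: scores are grouped into bands 10 points wide.
BAND_WIDTH = 10

def band_of(score, width=BAND_WIDTH):
    """Return the lower edge of the rating band containing the score."""
    return (score // width) * width

def percentage_profile(scores, width=BAND_WIDTH):
    """Map each rating band to the percentage of inspections falling in it."""
    if not scores:
        return {}
    counts = Counter(band_of(s, width) for s in scores)
    total = len(scores)
    return {band: 100.0 * n / total for band, n in sorted(counts.items())}

# Made-up scores for two hypothetical officers with different workloads.
officer_scores = {
    "officer_x": [45, 52, 58, 61, 67, 70, 74, 88],
    "officer_y": [41, 55, 63, 66],
}

profiles = {name: percentage_profile(s) for name, s in officer_scores.items()}
```

Because each profile sums to 100 per cent, an officer with eight
inspections can be set directly against one with four, which a
comparison of raw counts per band would not allow.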
To enable a true inter-authority comparison, a profile of the
rating scores for all premises in each authority was first obtained.
Because of the diversity of premises in any one district, the distribution
curves of officers in different authorities could only be comparable
if each authority had a similar profile of food premises, ie certain
categories were not widely disproportionate. The total number of
premises considered in the district profile was 2,391 (950 at authority
A and 1,441 at authority B). The exercise was useful in its own
right to enable individual comparisons between officers at each
authority. The scores for both EHOs and technical officers at each
separate authority were then averaged to allow a meaningful inter-authority
comparison.
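One way to average a peer group is to average the officers'
percentage profiles band by band, as sketched below. The article
does not state exactly how the averaging was done, so the unweighted
average here is an assumption, and the profiles are made-up figures.

```python
def average_profile(profiles):
    """Average several percentage profiles band by band.

    A band missing from an officer's profile counts as 0 per cent.
    Assumption: each officer carries equal weight, whatever their workload.
    """
    if not profiles:
        return {}
    bands = sorted({band for p in profiles for band in p})
    return {b: sum(p.get(b, 0.0) for p in profiles) / len(profiles) for b in bands}

# Made-up profiles (band -> % of that officer's inspections).
authority_a_ehos = [
    {50: 20.0, 60: 40.0, 70: 40.0},
    {50: 40.0, 60: 40.0, 80: 20.0},
]

group_a = average_profile(authority_a_ehos)
```

Pooling every inspection in the group before calculating percentages
would be an alternative; that approach weights each officer by the
number of inspections carried out.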
RESULTS
The frequency curves largely follow a normal distribution. The general
trend of the officers' curves shows consistency within peer groups
and confirms that EHOs are inspecting the higher risk premises, as expected.
The district profiles show that there is an extremely good correlation between
the percentage of food premises within each of the rating bands
in each of the authorities. If the types of premises are largely
comparable, it follows that the pattern of rating individual premises
by the officers of each authority might also be similar. The
average rating scores for the EHOs in each authority again show
generally good correlation, except for scores between 80 and
100, which officers from authority A awarded more frequently. The
mean inspection score for EHOs was 64.
For the technical officers of both authorities, the mean
score of 52 shows that technical officers
are generally responsible for inspections at the lower end of the
risk range. This reflects the fact that technical officers without
the Higher EHORB Certificate generally inspect lower risk premises
in categories C and below. The main differences between the average
technical officer scores of the authorities are that over 30 per
cent of authority B inspections have a hazard rating of 50, compared
with only 10 per cent at authority A. Only 5 per cent of authority
B inspections have a hazard rating of 60 compared with 20 per cent
at authority A. These differences raise a number of important questions
about the way in which these technical officers are working. The
individual frequency curves created for a number of experienced
EHOs working in both authorities showed a fairly consistent pattern.
A number of atypical frequency curves can also be seen. EHO 1
was employed on a project basis in one of the authorities to inspect
a specific group of category B premises. EHO 2 was newly qualified
and had not been given authority to inspect premises in risk categories
A and B; the curve is consistent with this. The exercise was especially
useful in highlighting an anomaly with the output of one officer
in particular, whose scoring was dramatically different to that
of the peer group. The curve for EHO 3 serves to illustrate how
staff may be apportioning inaccurate scores, or selectively choosing
to inspect premises of lower risk. The curve for EHO 4, an officer
from an external agency, also peaks in the lower risk categories.
This raises the question of whether these scores truly represented
the premises inspected or whether the scoring was consistently
lower than that of other EHOs.
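Atypical curves such as these can also be picked out numerically. The
sketch below measures how far an officer's percentage profile sits
from the peer average. The measure (half the summed absolute
differences), the review threshold and all figures are assumptions
chosen for illustration; the article itself relies on visual
comparison of the curves.

```python
def profile_gap(officer, peer):
    """Percentage of inspections that would have to change band for the
    officer's profile to match the peer profile (half the summed absolute
    differences between the two percentage profiles)."""
    bands = set(officer) | set(peer)
    return 0.5 * sum(abs(officer.get(b, 0.0) - peer.get(b, 0.0)) for b in bands)

# Made-up profiles (band -> % of inspections).
peer_average = {40: 10.0, 50: 20.0, 60: 40.0, 70: 20.0, 80: 10.0}
officers = {
    "typical": {40: 10.0, 50: 25.0, 60: 35.0, 70: 20.0, 80: 10.0},
    "atypical": {30: 30.0, 40: 40.0, 50: 20.0, 60: 10.0},
}

# Hypothetical cut-off above which a manager takes a closer look.
REVIEW_THRESHOLD = 25.0

gaps = {name: profile_gap(p, peer_average) for name, p in officers.items()}
flagged = [name for name, gap in gaps.items() if gap > REVIEW_THRESHOLD]
```

A flag of this kind is a prompt for further enquiry rather than a
finding: as EHO 1 and EHO 2 show, an unusual curve may have a
perfectly legitimate explanation.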
OBSERVATIONS
Having succeeded in the original objective of comparing officer
scoring within a peer group, it then became apparent that the resultant
frequency distribution model had a wider application for a whole
range of benchmarking purposes. These include:
assessment of officers on temporary short-term contracts;
comparing the scoring of similar groups, ie EHOs and technical
officers;
giving positive feedback to staff confirming the accuracy of
their assessments;
looking at longer-term inspection trends affecting the department;
and
comparing ratings between officers at different authorities.
Such a model is dependent on the availability and accuracy of
the chosen data ranges over time. This will be dependent on the
type of computer software used and its flexibility in reporting.
Where officer discrepancies become apparent, the question arises
as to whether the score should be modified by the manager, so as
to bring one officer's scoring back into line with their peers.
This may be particularly relevant if irregularities have been taking
place for some time and are not isolated errors. The entire inspection
programme could be jeopardised by careless or malicious instances
of inconsistent scoring. There may be limitations to the use or
application of the model, which should be given consideration by
those wishing to introduce similar monitoring mechanisms, for example:
local anomalies with IT;
data anomalies or inaccuracies;
influence of transient or agency type staff;
statistical limitations and interpretation;
changes to the scoring regime in the codes of practice could
make former data incompatible over time; and
differences in authorities where a "figures driven"
approach to inspection is favoured above a qualitative style.
EXPANDING THE MODEL
The frequency distribution principle illustrated above can be easily
adapted to produce a range of other assessment profiles such as:
district profile, ie hazard by Ward or Parish;
risk category profile, ie expected or desired curve for particular
operations;
specific types of premises (including those subject to formal
action), specific MAFF and HSW categories;
by linking to outcomes, for example confidence in management;
and
longer-term assessment of management scores, illustrating "success"
or "failure".
Many of these examples could be developed as a national or county-wide
benchmark standard by which more accurate and meaningful auditing
could take place locally, for example comparing the rating profile
of all school kitchens across a county area. Over time it may be
possible to see trends developing; a greater frequency of scores
towards the left of the X-axis, ie the lower risk inspection scores,
could be used to demonstrate an overall improvement in compliance.
The various permutations can be developed into a comparison exercise
to further assess the authority's inspection profile, for example
the comparison of an individual officer's curve against that of
a general premises curve or with specific parishes. This would serve
to identify "hot-spots" or areas worthy of greater attention.
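The longer-term trend mentioned above, a shift of scores towards the
lower risk end of the X-axis, could be tracked with a single figure
per period. The sketch below is illustrative only: the cut-off of 60,
the period labels and the scores are all assumptions.

```python
def share_below(scores, cutoff):
    """Percentage of scores strictly below the cutoff."""
    if not scores:
        return 0.0
    return 100.0 * sum(1 for s in scores if s < cutoff) / len(scores)

# Made-up inspection scores for two successive periods.
scores_by_period = {
    "year_1": [45, 55, 62, 68, 75, 82, 64, 71],
    "year_2": [42, 51, 58, 63, 69, 74, 55, 66],
}

# Hypothetical dividing line between lower and higher risk scores.
CUTOFF = 60

trend = {
    period: share_below(scores, CUTOFF)
    for period, scores in sorted(scores_by_period.items())
}
```

A rising share of scores below the cut-off would be consistent with
improving compliance, provided the scoring regime and the mix of
premises inspected have not changed between periods.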
FUTURE DEVELOPMENTS
The various models could be used in the development of national
benchmarking standards and this could even form part of inter-authority
comparison as exists presently in other forms. The model cannot
replace sound management practice. However, it can be used as a
tool that contributes towards a more consistent enforcement regime,
capable of standing up to the scrutiny of audit. The findings of
such exercises, coupled with sound management practice and the pragmatic
application of government advice, should give managers greater confidence
in the work being undertaken by their staff in a challenging and
demanding climate.
This is an original piece of work copyrighted by Peter Foley and
Richard Kirk. No part of this work may be copied or used without
the written consent of the authors who can be contacted at Derbyshire
Dales District Council and North East Derbyshire District Council
respectively. The opinions expressed in this article are those of
the authors, and not necessarily those of their employing authorities.
The authors would like to acknowledge Andrew Hopkin, Derby City
Council.