Archive - November 2000
A MODEL APPROACH

Peter Foley and Richard Kirk explain how a simple benchmarking model can assist managers in assessing the consistency of officer inspection rating scores

 

Accurate and reliable management controls are vital to any modern and progressive inspectorate. The model outlined in this article, a simple statistical technique, is designed to help managers within the environmental health profession achieve a precise and meaningful comparison of officer inspection rating scores and can be used to supplement existing management systems.

By applying techniques such as this, managers can be better placed to reduce the "guesswork" associated with the inspection management and verification process, which should result in a more transparent and proportionate approach.

BACKGROUND
The inspection rating of premises following routine inspections by officers is an issue that requires continual review and assessment by managers in order to determine and validate both accuracy and consistency. Numerous types of visits can be prioritised through the application of a suitable inspection rating score to determine the frequency of subsequent visits. As well as the nationally recognised systems under the Food Safety Act Code of Practice 9 and HELA LAC 67/1 (Revised), many other local schemes operate for work such as licensing and EPA visits.

Several potential mechanisms can be adopted in order to promote the consistency of inspection rating, all of which have their place in a modern inspection regime, including:

  • desk-top exercises;
  • accompanied inspections;
  • post inspection follow-up visits;
  • random checks on premises records; and
  • peer review.

There are, however, a number of limitations with all of these controls which prevent managers from obtaining an overall picture or from seeing how trends in inspection work may be developing. Legitimate differences may exist between staff at different levels, ie between newly qualified and experienced officers or between EHOs and technical support staff, which the above controls would not be able to examine. The initial aim of this project was to identify how individual officers rated premises in comparison with their peer group. This in turn highlighted the need to objectively and critically evaluate the whole issue of inspection rating management.

METHODOLOGY
Initially, using data from routine food hygiene inspections, a graph of each officer's annual inspection scores was constructed, which enabled a frequency distribution curve to be plotted for each officer. In order for the curves to be directly comparable, the distribution took account of the percentage of scores falling within specific rating bands, rather than merely comparing the number of inspections in each band. The study was then expanded across two authorities (A and B), using data from 21 officers who had undertaken a total of 865 inspections.
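The percentage-based distribution described above can be sketched in a few lines of Python. This is an illustration only: the band width of 10 points, the function name and the sample scores are all assumptions, not figures from the study.

```python
from collections import Counter


def band_percentages(scores, band_width=10):
    """Return {band_start: percentage of inspections} for one officer.

    band_width=10 is an assumed band size, not one taken from the article.
    """
    if not scores:
        return {}
    # Map each score to the lower edge of its band, eg 64 -> 60.
    counts = Counter((score // band_width) * band_width for score in scores)
    total = len(scores)
    return {band: 100.0 * n / total for band, n in sorted(counts.items())}


# Hypothetical scores for two officers with different workloads.
officer_x = [48, 50, 55, 62]
officer_y = [45, 52, 52, 58, 58, 61, 67, 73]

profile_x = band_percentages(officer_x)
profile_y = band_percentages(officer_y)
```

Because each profile sums to 100 per cent, an officer with four inspections and one with eight can be plotted on the same axes, which a comparison of raw counts would not allow.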

To enable a true inter-authority comparison, a profile of the rating scores for all premises in each authority was first obtained. Because of the diversity of premises in any one district, the distribution curves of officers in different authorities could only be comparable if each authority had a similar profile of food premises, ie no category was widely disproportionate. The total number of premises considered in the district profile was 2,391 (950 at authority A and 1,441 at authority B). The exercise was useful in its own right to enable individual comparisons between officers at each authority. The scores for both EHOs and technical officers at each separate authority were then averaged to allow a meaningful inter-authority comparison.
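One possible reading of the averaging step is to take each officer's percentage profile and average it band by band across the peer group. The sketch below assumes that reading, gives every officer equal weight regardless of how many inspections they carried out, and uses invented figures; the study itself may have pooled the scores differently.

```python
def average_profile(profiles):
    """Average several {band: percentage} profiles band by band."""
    if not profiles:
        raise ValueError("at least one profile is required")
    bands = sorted({band for profile in profiles for band in profile})
    # A band missing from an officer's profile counts as 0 per cent.
    return {
        band: sum(profile.get(band, 0.0) for profile in profiles) / len(profiles)
        for band in bands
    }


# Hypothetical percentage profiles for three EHOs at one authority.
eho_profiles = [
    {50: 20.0, 60: 50.0, 70: 30.0},
    {50: 40.0, 60: 40.0, 70: 20.0},
    {60: 60.0, 70: 30.0, 80: 10.0},
]
authority_eho_profile = average_profile(eho_profiles)
```

Equal weighting keeps a single high-volume officer from dominating the group curve. Pooling all scores before banding would be the alternative if the workload itself is of interest.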

RESULTS
The frequency curves largely follow a normal distribution. The general trend of the officers' curves shows consistency within peer groups, and that EHOs are inspecting the higher risk premises, as expected. The district profiles show an extremely good correlation between the two authorities in the percentage of food premises within each rating band. If the types of premises are largely comparable, it follows that the pattern of rating individual premises by the officers of each authority might also be similar. The average rating scores for the EHOs in each authority again show a generally good correlation, except for scores between 80 and 100, which officers from authority A awarded more frequently. The mean inspection score for EHOs was 64.
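The results above describe the correlation between profiles in qualitative terms. Where a figure is wanted, one option is a Pearson correlation coefficient calculated across the rating bands. The sketch below is an assumption about how that could be done, not the method used in the study, and the two profiles are invented.

```python
from math import sqrt


def profile_correlation(profile_p, profile_q):
    """Pearson correlation between two {band: percentage} profiles."""
    bands = sorted(set(profile_p) | set(profile_q))
    x = [profile_p.get(band, 0.0) for band in bands]
    y = [profile_q.get(band, 0.0) for band in bands]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a flat profile")
    return covariance / sqrt(var_x * var_y)


# Hypothetical district profiles (percentage of premises per band).
authority_a = {40: 10.0, 50: 30.0, 60: 40.0, 70: 20.0}
authority_b = {40: 12.0, 50: 28.0, 60: 42.0, 70: 18.0}
r = profile_correlation(authority_a, authority_b)
```

A value close to 1 indicates that the two curves rise and fall together. With only a handful of bands the coefficient is a rough guide and should be read alongside the plotted curves.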

With the average rating scores for technical officers of both authorities, the mean score of 52 shows that technical officers are generally responsible for inspections at the lower end of the risk range. This reflects the fact that technical officers without the Higher EHORB Certificate generally inspect lower risk premises in categories C and below. The main differences between the average technical officer scores of the authorities are that over 30 per cent of authority B inspections have a hazard rating of 50, compared with only 10 per cent at authority A. Only 5 per cent of authority B inspections have a hazard rating of 60 compared with 20 per cent at authority A. These differences raise a number of important questions about the way in which these technical officers are working. The individual frequency curves created for a number of experienced EHOs working in both authorities showed a fairly consistent pattern.

A number of atypical frequency curves can also be seen. EHO 1 was employed on a project basis in one of the authorities to inspect a specific group of category B premises. EHO 2 was newly qualified and had not been given authority to inspect premises in risk categories A and B; the curve is consistent with this. The exercise was especially useful in highlighting an anomaly with the output of one officer in particular, whose scoring was dramatically different to that of the peer group. The curve for EHO 3 serves to illustrate how staff may be apportioning inaccurate scores, or selectively choosing to inspect premises of lower risk. The curve for EHO 4, an officer from an external agency, also peaks in the lower risk categories. This raises the question as to whether these scores truly represented those premises inspected or whether the scoring was consistently lower than that of other EHOs.
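No numerical test for an atypical curve is set out here. As a hedged sketch, one simple measure is the share of an officer's inspections that would have to move to a different band for the curve to match the peer group average. The function name, the sample figures and the 25-point threshold are all hypothetical, and a high value is only a prompt for further enquiry, since EHO 1 and EHO 2 show that an unusual curve can have a legitimate explanation.

```python
def profile_distance(officer, peer_average):
    """Percentage points of inspections that differ from the peer profile.

    Half the sum of absolute band differences: 0 means identical curves,
    100 means no overlap at all.
    """
    bands = set(officer) | set(peer_average)
    return 0.5 * sum(
        abs(officer.get(band, 0.0) - peer_average.get(band, 0.0))
        for band in bands
    )


# Hypothetical profiles: the officer's curve peaks one band lower.
peer_average = {40: 10.0, 50: 20.0, 60: 40.0, 70: 30.0}
officer = {40: 40.0, 50: 40.0, 60: 20.0}

distance = profile_distance(officer, peer_average)
REVIEW_THRESHOLD = 25.0  # assumed cut-off, to be set locally
needs_review = distance > REVIEW_THRESHOLD
```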

OBSERVATIONS
Having succeeded in the original objective of comparing officer scoring within a peer group, it then became apparent that the resultant frequency distribution model had a wider application for a whole range of benchmarking purposes. These include:

  • assessment of officers on temporary short-term contracts;
  • comparing the scoring of similar groups, ie EHOs and technical officers;
  • giving positive feedback to staff confirming the accuracy of their assessments;
  • looking at longer-term inspection trends affecting the department; and
  • comparing ratings between officers at different authorities.

Such a model depends on the availability and accuracy of the chosen data ranges over time, which in turn depend on the type of computer software used and its flexibility in reporting. Where officer discrepancies become apparent, the question arises as to whether the score should be modified by the manager, so as to bring one officer's scoring back into line with that of their peers. This may be particularly relevant if irregularities have been taking place for some time and are not isolated errors. The entire inspection programme could be jeopardised by careless or malicious instances of inconsistent scoring. There may be limitations to the use or application of the model, which should be given consideration by those wishing to introduce similar monitoring mechanisms, for example:

  • local anomalies with IT;
  • data anomalies or inaccuracies;
  • influence of transient or agency type staff;
  • statistical limitations and interpretation;
  • changes to the scoring regime in the codes of practice could make former data incompatible over time; and
  • differences in authorities where a "figures driven" approach to inspection is favoured above a qualitative style.

EXPANDING THE MODEL
The frequency distribution principle illustrated above can be easily adapted to produce a range of other assessment profiles such as:

  • district profile, ie hazard by Ward or Parish;
  • risk category profile, ie expected or desired curve for particular operations;
  • specific types of premises (including those subject to formal action), specific MAFF and HSW categories;
  • by linking to outcomes, for example confidence in management; and
  • longer-term assessment of management scores, illustrating "success" or "failure".

Many of these examples could be developed as a national or county-wide benchmark standard by which more accurate and meaningful auditing could take place locally, for example comparing the rating profile of all school kitchens across a county area. Over time it may be possible to see trends developing; a greater frequency of scores towards the left of the X-axis, ie the lower risk inspection scores, could be used to demonstrate an overall improvement in compliance. The various permutations can be developed into a comparison exercise to further assess the authority's inspection profile, for example the comparison of an individual officer's curve against that of a general premises curve or with specific parishes. This would serve to identify "hot-spots" or areas worthy of greater attention.
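A shift of the curve towards the left of the X-axis can be summarised, as a first approximation, by tracking the mean inspection score year by year. The sketch below assumes scores are held per year in a simple dictionary; the years and values are invented for illustration.

```python
from statistics import mean


def yearly_means(scores_by_year):
    """Return {year: mean inspection score}, ordered by year."""
    return {year: mean(scores) for year, scores in sorted(scores_by_year.items())}


# Hypothetical scores for one group of premises over three years.
scores_by_year = {
    1998: [70, 65, 60, 75],
    1999: [65, 60, 60, 55],
    2000: [55, 60, 50, 55],
}
means = yearly_means(scores_by_year)

# A falling mean is consistent with the curve moving towards lower risk scores.
years = list(means)
falling = all(means[a] > means[b] for a, b in zip(years, years[1:]))
```

A falling mean alone does not show improved compliance, as it could equally reflect a change in which premises were inspected, so it is best read alongside the full distribution.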

FUTURE DEVELOPMENTS
The various models could be used in the development of national benchmarking standards and this could even form part of inter-authority comparison as exists presently in other forms. The model cannot replace sound management practice. However, it can be used as a tool that contributes towards a more consistent enforcement regime, capable of standing up to the scrutiny of audit. The findings of such exercises, coupled with sound management practice and the pragmatic application of government advice, should give managers greater confidence in the work being undertaken by their staff in a challenging and demanding climate.

This is an original piece of work copyrighted by Peter Foley and Richard Kirk. No part of this work may be copied or used without the written consent of the authors who can be contacted at Derbyshire Dales District Council and North East Derbyshire District Council respectively. The opinions expressed in this article are those of the authors, and not necessarily those of their employing authorities. The authors would like to acknowledge Andrew Hopkin, Derby City Council.