1st Edition, January 1, 1990

Studies toward the unification of picture assessment methodology

Active, Most Current

Recommendation 500 has been prepared, and is regularly reviewed, to provide instructions on what seem to be the best available methods for the assessment of picture quality in a controlled laboratory environment. The methods need to be reviewed at intervals, to reflect the evolution of studies in new systems, and to reflect the evolution of methodology itself.

Although the methods outlined in Sections 2 and 3 of Rec. 500 have been carefully considered and designed with the knowledge available, they are not free of shortcomings. If new alternative methods are designed and shown to be free of these shortcomings, they must be considered as candidates to supersede the existing methods.

The main drawbacks of the methods currently given in Sections 2 and 3 are as follows:

— The conceptual differences between the meanings of the quality scale descriptors are not necessarily uniform. They are known to vary between linguistic groups, between cultural groups, and between individuals, to a non-negligible extent. Processing of results is currently based on the approximation that the conceptual difference is uniform; so, interpreting results to indicate a consistent measure of absolute quality or impairment is also an approximation. In fact, results could even misrepresent the magnitudes of differences by as much as +50%.

— For reasons which may include the differences in meanings associated with descriptors mentioned above, the correlation of results between laboratories is not considered good enough for alternative systems with small impairments or high quality to be reliably evaluated in different laboratories and the absolute results compared. Rank order is consistent, however.

— The stability of the methods in Sections 2 and 3 of Rec. 500 derives in part from the systematic use of a high-quality reference. There are circumstances where a high-quality reference is not available; and, in these cases, the methods cannot be used.

— Double stimulus methods take more than twice as much time as single stimulus methods and are accordingly more expensive to conduct.
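The first drawback above can be illustrated with a small numerical sketch. It is not taken from Rec. 500: the votes, the five-grade coding, and in particular the non-uniform descriptor positions are all invented for illustration. It only shows how the measured difference between two systems changes when the equal-interval approximation is replaced by some other assumed spacing of the descriptors.

```python
from statistics import mean

# Hypothetical votes from five observers on a five-grade quality scale
# (5 = best grade, 1 = worst grade). These numbers are invented.
votes_a = [5, 4, 4, 4, 3]
votes_b = [4, 3, 3, 3, 2]

# Equal-interval approximation: every step between adjacent descriptors
# is treated as the same conceptual distance.
uniform = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0}

# Assumed, purely illustrative positions for a group of observers who
# perceive the outer descriptors as lying close to their neighbours.
perceived = {1: 1.0, 2: 1.6, 3: 3.0, 4: 4.4, 5: 5.0}


def mean_score(votes, positions):
    """Average the scale positions that correspond to the given votes."""
    return mean(positions[v] for v in votes)


def difference(positions):
    """Mean-score difference between system A and system B."""
    return mean_score(votes_a, positions) - mean_score(votes_b, positions)


if __name__ == "__main__":
    print("uniform spacing:  ", round(difference(uniform), 2))
    print("perceived spacing:", round(difference(perceived), 2))
```

With these made-up numbers the same votes give a difference of 1.0 under the equal-interval approximation and about 1.24 under the assumed spacing, while the rank order of the two systems is unchanged.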

This report describes studies aimed at developing new information and at overcoming or circumventing the shortcomings mentioned above. The general areas being studied are as follows:

— ratio scaling (numerical magnitude estimation of quality)

— graphic scaling (evaluation of conceptual differences in descriptors)

— numerical category scaling

— multi-dimensional scaling

— pair comparison

— visibility threshold measurement.

To become candidates for inclusion in Recommendation 500, methods must be fully developed and provide significant advantages compared to currently recommended methods.

This report also describes recent work examining whether impairments such as noise can be assessed using graphic scaling.