I think your draft is an ambitious and very good start for the discussions. However, I have difficulty understanding some parts, especially the examples, though that is probably due to my lack of experience in the field of application. Perhaps some parts that are confusing (for a statistician) result from an attempt to achieve agreement with the GUM. I have developed a somewhat different strategy based on my approach to the validation of methods in analytical chemistry. However, I don't think my approach is suitable for general applications and, in any case, I don't think there are any contradictions.
Below I have listed some specific comments and possible typographical errors:
Something is missing between "The" and "covers".
The number of measured objects is not the number of independent observations from the population of performance conditions and, thus, the standard deviation of the variability of the measurement process over time is estimated with only 1 degree of freedom (not Q). The same mistake is often made when two methods are compared.
It should perhaps be mentioned that a further source of errors is influence quantities belonging to the measurement objects, for instance matrix effects in analytical chemistry. These influence quantities may give rise to object-related systematic errors which can't be eliminated by calibration.
See comment to 5.3.
Is the same bias assumed for all levels?
Perhaps I don't understand the example but is the observed bias a reliable estimate of the maximum bias?
Page 6, paragraph 2: Should it be 5 and not 6 wafers?
Is a sample of size 2 sufficient to estimate maximum bias?
Why N-1 and not N in the formula on page 7?
sz instead of sy in the list of notations.
In the list of examples where covariance terms may be omitted, number 2 is a situation where it is not acceptable to omit the covariance.
In the last two formulas in the table I think the covariance term should have a plus sign.
In the last formula the left hand side should be divided by Y.
The text refers to procedures in section 5.3, but I can't find any procedures presented there.
How is run defined? In analytical chemistry a run is usually defined as a series of measurements evaluated with the same calibration curve.
Why must J be > or = N and K be > or = M?
In the first formula u must be squared.
For the examples (worst case, assumed distribution, not documented) where the B estimates of uncertainty are assumed to have infinite degrees of freedom, the uncertainty of the uncertainty estimates is probably very large, i.e. they correspond to A estimates with few degrees of freedom. In that case it doesn't make sense to assume infinite degrees of freedom for B estimates.
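
To illustrate the last comment, here is a minimal sketch in Python. It assumes the Welch-Satterthwaite formula for the effective degrees of freedom and the GUM approximation (Annex G.4.2) that a B estimate whose standard uncertainty is known only to a relative uncertainty r corresponds to roughly 1/(2 r^2) degrees of freedom. The function names and the numbers are hypothetical and are not taken from the draft.

```python
import math


def dof_from_relative_uncertainty(rel_unc):
    """Approximate degrees of freedom of a B estimate whose standard
    uncertainty is itself only known to within rel_unc (0.5 means 50 %).
    Uses nu ~ 1 / (2 * rel_unc**2); rel_unc <= 0 is treated as exact."""
    if rel_unc <= 0:
        return math.inf
    return 1.0 / (2.0 * rel_unc ** 2)


def effective_dof(components):
    """Welch-Satterthwaite effective degrees of freedom.

    components: list of (standard_uncertainty, degrees_of_freedom) pairs,
    assumed uncorrelated with unit sensitivity coefficients.
    """
    u_c2 = sum(u ** 2 for u, _ in components)
    # A component with infinite degrees of freedom contributes 0 here.
    denom = sum(u ** 4 / nu for u, nu in components)
    return math.inf if denom == 0 else u_c2 ** 2 / denom


# Hypothetical budget: one A estimate (u = 1.0, 4 degrees of freedom)
# and one dominant B estimate (u = 2.0).
type_a = (1.0, 4)
b_infinite = (2.0, math.inf)
b_uncertain = (2.0, dof_from_relative_uncertainty(0.5))  # u known to 50 %

print(effective_dof([type_a, b_infinite]))   # B treated as exactly known
print(effective_dof([type_a, b_uncertain]))  # B treated as poorly known
```

With these made-up numbers the effective degrees of freedom fall from 100 to about 3 once the B estimate is treated as known only to within 50 %, so the coverage factor, and hence the expanded uncertainty, would change noticeably.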