

Section III
Thyroid Tests for the Clinical Biochemist and Physician


A. Total Thyroxine (TT4) and Total Triiodothyronine (TT3) methods

B. Free Thyroxine (FT4) and Free Triiodothyronine (FT3) tests

C. Thyrotropin/Thyroid Stimulating Hormone (TSH) measurement

D. Thyroid Autoantibodies:

Thyroid Peroxidase Antibodies (TPOAb)

Thyroglobulin Antibodies (TgAb)

Thyrotropin Receptor Antibodies (TRAb)

E. Thyroglobulin (Tg) Measurement

F. Urinary Iodine Measurement

G. Thyroid Fine Needle Aspiration Biopsy and Cytology

H. Screening for Congenital Hypothyroidism


A. Total Thyroxine (TT4) and Total Triiodothyronine (TT3) Methods

Thyroxine is the principal hormone secreted by the thyroid gland. In contrast, most (~80%) of the T3 in blood is derived outside the thyroid by 5'-monodeiodination of T4 in tissues. In the circulation, most (~99.98%) T4 molecules are bound to the plasma proteins, thyroxine-binding globulin (TBG), TTR/TBPA (transthyretin) and albumin (4). Protein-bound hormone is considered to be biologically inert. In contrast, the more metabolically active thyroid hormone, T3 is ten-fold more weakly protein-bound than T4, with ~99.7% being protein-bound, primarily to TBG (4). Technically, it has been easier to develop methods to measure total (free + protein-bound) thyroid hormone concentrations, where the bound hormones are released from binding proteins by a displacing or blocking agent before measurement. This is because total hormone concentrations (TT4 and TT3) are measured in the nanomolar range, whereas free hormone concentrations (FT4 and FT3) are measured at picomolar levels.
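The nanomolar versus picomolar contrast described above follows directly from the bound percentages. The short sketch below works through that arithmetic; the total concentrations used in the example calls are illustrative mid-range values, not figures taken from this section, and the function name is hypothetical.

```python
def free_pmol_per_l(total_nmol_per_l: float, bound_percent: float) -> float:
    """Estimate free hormone (pmol/L) from total hormone (nmol/L).

    bound_percent is the percentage of hormone that is protein-bound,
    e.g. ~99.98 for T4 and ~99.7 for T3 as quoted in the text.
    """
    free_fraction = (100.0 - bound_percent) / 100.0
    # 1 nmol/L = 1000 pmol/L
    return total_nmol_per_l * free_fraction * 1000.0


# Illustrative totals (assumed): TT4 = 100 nmol/L, TT3 = 2 nmol/L
ft4 = free_pmol_per_l(100.0, 99.98)  # about 20 pmol/L
ft3 = free_pmol_per_l(2.0, 99.7)     # about 6 pmol/L
```

Both free concentrations come out in the low picomolar range, roughly three to four orders of magnitude below the corresponding totals.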

(a) Methods for measuring Total Thyroid Hormones (TT4 and TT3)

Serum total T4 and total T3 measurements (TT4 and TT3) have evolved through a variety of technologies over four decades. The PBI tests of the 1950s that estimated the TT4 concentration were replaced first by competitive protein binding methods in the 1960s, which were later superseded by RIA methods in the 1970s. Currently, serum TT4 and TT3 concentrations are measured either by competitive immunoassay or non-competitive immunometric assay (IMA) methods that use radioactive iodine, enzyme, fluorescence or chemiluminescence as the signal. Total hormone measurements require the inclusion of an inhibitor (displacing or blocking agent) such as 8-anilino-1-naphthalene-sulphonic acid (ANS) or salicylate. These agents block hormone-binding to serum proteins and thereby facilitate the binding of the hormone to the antibody reagent. The ten-fold lower TT3 concentration, compared with TT4, presents both a sensitivity and precision challenge, despite the use of higher specimen volumes. High-range serum TT3 measurements are used most frequently to diagnose unusual presentations of hyperthyroidism. However, reliable normal-range measurement is also important for detecting hyperthyroidism in sick hospitalized patients, in whom a paradoxically normal T3 value is a useful indicator of hyperthyroidism. As shown in Figures 3 and 4, there are significant biases between different TT4 and TT3 methods. These biases are greater than the maximum allowable bias calculated from the within and between-person variability of these hormones (Appendix B). A number of factors are likely responsible for these large inter-method biases. Firstly, although highly purified preparations of crystalline L-thyroxine and L-triiodothyronine are readily obtained (e.g. from the United States Pharmacopoeia, 16201 Twinbrook Parkway, Rockville, MD 20852), the hygroscopic nature of the crystalline preparations can affect the accuracy of gravimetric weighing.
Secondly, the diluents used to reconstitute L-T4 and L-T3 preparations for calibrators are either modified protein matrices or human serum pools that have been stripped of hormone by various means. In either case, the protein composition of the matrix used for the calibrators is not identical to patient sera, such that the protein binding inhibitor reagent (e.g. ANS) may release different quantities of hormone from calibrator matrix proteins than from the TBG in patient specimens.

TT4 and TT3 Method Differences Relate to:
  • The lack of international L-T4 and L-T3 reference preparations.
  • Instrument sensitivity to differences between human serum and the calibrator matrix.
  • Differences in efficiency of thyroid hormone release from serum proteins, versus release from the proteins in the calibrator diluent.


Figure 3. Serum TT4 Measurement by Different Manufacturers' Methods

(b) Diagnostic Accuracy of Total Hormone Measurements

The diagnostic accuracy of total hormone measurements would equal that of free hormone if all patients had identical levels of binding proteins with similar affinities for the thyroid hormones. Unfortunately, abnormal serum TT4 and TT3 concentrations are commonly encountered as a result of binding protein abnormalities and not thyroid dysfunction. Patients with serum TBG abnormalities secondary to pregnancy or oral contraceptive use, as well as genetic abnormalities in T4-binding proteins are frequently encountered in clinical practice. Abnormal TBG concentrations and/or affinity for thyroid hormone will distort the relationship between total and free hormone measurements. Additionally, some patient sera contain other abnormal binding proteins, such as autoantibodies to thyroid hormones that render total hormone measurements diagnostically unreliable. These binding protein abnormalities compromise the use of TT4 and TT3 measurements as stand-alone thyroid tests. Instead, serum TT4 and TT3 measurements are usually made as part of a two-test panel that includes an assessment of binding protein status, made either directly by TBG immunoassay or by an indirect "uptake" test. A mathematical relationship between the total hormone concentration and the "uptake" result is used as a free hormone "index" [see Section III B (b)] (33). Free hormone indexes (FT4I and FT3I) have been used as free hormone estimate tests for three decades.

Warning: Abnormal serum TT4 and TT3 concentrations are commonly encountered as a result of binding protein abnormalities and not thyroid dysfunction.

Figure 4. Serum TT3 Measurement by Different Manufacturers' Methods

(c) Serum TT4 and TT3 Normal Reference Ranges

As shown in Figure 3, serum TT4 values vary to some extent between methods. Reference ranges approximate to 58-160 nmol/L (4.5-12.6 µg/dL). Likewise, as shown in Figure 4, serum TT3 values are also method dependent, with normal reference ranges approximating to 1.2-2.7 nmol/L (80-180 ng/dL).
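The molar and conventional mass units quoted above are related through the molar masses of the hormones. The sketch below shows the conversion. The molar masses are standard chemical values rather than figures given in this document, and the function names are hypothetical. Because the quoted ranges are rounded approximations, converted limits agree closely but not exactly.

```python
# Molar masses in g/mol (standard values; an assumption, not from the text)
T4_G_PER_MOL = 776.87
T3_G_PER_MOL = 650.97


def tt4_nmol_per_l_to_ug_per_dl(nmol_per_l: float) -> float:
    # nmol/L x g/mol = ng/L; divide by 1000 for ug/L, then by 10 for ug/dL
    return nmol_per_l * T4_G_PER_MOL / 10_000.0


def tt3_nmol_per_l_to_ng_per_dl(nmol_per_l: float) -> float:
    # nmol/L x g/mol = ng/L; divide by 10 for ng/dL
    return nmol_per_l * T3_G_PER_MOL / 10.0


tt4_low = tt4_nmol_per_l_to_ug_per_dl(58.0)   # about 4.5 ug/dL
tt3_low = tt3_nmol_per_l_to_ng_per_dl(1.2)    # about 78 ng/dL
```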


B. Free Thyroxine (FT4) and Free Triiodothyronine (FT3) Tests

Thyroxine in the circulation is more tightly bound to serum proteins than is T3, so consequently, less is free (0.02% versus 0.2%, respectively). Unfortunately, the techniques for physically separating the minute free hormone fraction from the dominant protein-bound moieties are too technically demanding, inconvenient and expensive for routine clinical laboratory use. Thus, the direct absolute methods that employ physical separation of free from bound hormone (i.e. equilibrium dialysis, ultrafiltration and gel filtration) are usually only available in reference laboratories. In contrast, routine clinical laboratories use a variety of free hormone tests that can only estimate free hormone concentration in the presence of protein-bound hormone. These free hormone estimate tests employ either a two-test strategy for calculating a free hormone "index" [see Section III B(b)], or use a variety of immunoextraction approaches (6). In reality, despite manufacturers' claims, most if not all FT4 and FT3 estimate tests are binding-protein dependent to some extent (34). Some methods are sensitive to in-vivo or in-vitro effects of certain drugs, high free fatty acid (FFA) levels or endogenous inhibitors of hormone binding to proteins that are present in certain pathological conditions, and may cause inappropriately abnormal results in sick euthyroid patients with NTI [see Section IIB].

(a) Nomenclature of FT4 and FT3 Methods

Considerable confusion surrounds the nomenclature of the free hormone tests and controversy continues regarding the technical validity of the measurements themselves and their clinical utility in conditions associated with severe binding protein abnormalities, especially NTI. Free hormone tests (FT4 and FT3) are made either by indirect index methods that require two separate tests, or direct immunoassay techniques. The direct immunoassay methods can be sub-classified on the basis of their standardization, which is either absolute, i.e. use standard solutions containing a gravimetrically established concentration of hormone, or comparative, i.e. use calibrators with values assigned by an absolute method. Direct absolute methods are the manual, technically demanding reference techniques that physically separate free from protein-bound hormone and are too expensive for routine clinical use. Index and comparative FT4 and FT3 tests are typically used in the routine clinical laboratory setting, where they are usually automated on immunoassay analyzers. Unfortunately, a confusing plethora of terms have been used to distinguish the different free hormone methodologies and the literature abounds with inconsistencies in the nomenclature of these tests. Currently, there is no clear methodologic distinction between terms such as "T7", "effective thyroxine ratio", "one-step", "analog", "two step", "backtitration", "sequential" and "immunoextraction", because manufacturers have modified the original techniques or adapted them for automation. Following the launch of the original one-step "analog" tests in the 1970s, the term "analog" became mired in confusion. This first generation of hormone-analog tests were shown to be severely binding-protein dependent and have since been replaced by a new generation of labeled-antibody "analog" tests which are more resistant to the presence of abnormal binding proteins. 
Unfortunately, manufacturers rarely disclose all assay constituents or the number of steps involved in an automated procedure so that it is not possible to use the method's nomenclature (two-step, analog etc) to predict its diagnostic accuracy for assessing patients with binding protein abnormalities.

(b) Indirect Index Methods (FT4I and FT3I)

Indexes are free hormone estimate tests that require two separate measurements (33). One test is a total hormone measurement (TT4 or TT3). The other is either an assessment of binding protein concentration, using a TBG immunoassay or a Thyroid Hormone Binding Ratio (THBR) or "uptake" test, or an estimate of the free hormone fraction determined by isotopic dialysis or ultrafiltration. Tracer purity critically impacts the diagnostic accuracy of indexes calculated using an isotopic signal.

Indexes using TBG Measurement

Free T4 indexes calculated with direct TBG measurements are only more diagnostically accurate than total hormone if the TT4 concentration is abnormal as a result of a TBG abnormality. Further, the TT4/TBG index approach is not fully TBG independent, nor does it correct for non TBG-related binding protein abnormalities. Despite the theoretical advantages of using direct TBG measurements, TT4/TBG indexes are rarely used because they are not diagnostically superior to indexes employing indirect THBR tests.

Indexes using Thyroid Hormone Binding Ratio (THBR) or "Uptake" Tests

The first "T3 uptake" tests developed in the 1950s used the partitioning of T3 tracer between the plasma proteins in the specimen and an inert scavenger (red cell membranes, talc, charcoal, ion-exchange resin or antibody). The scavenger "uptake" of T3 tracer that did not become bound to the binding proteins in the specimen, was an indirect, reciprocal estimate of the TBG concentration of the specimen. Initially, T3 uptake tests were reported as percent tracer uptake (free/total counts). Normal serum specimens containing normal TBG concentrations typically exhibited T3 tracer uptakes varying from 25 to 40%, depending on the manufacturer. Traditionally, uptakes measured unsaturated binding sites and calculated the free T4 index (FT4I) from the product of TT4 and the uptake test. Current uptake tests use free/total minus scavenger-bound counts and sometimes measure the total binding capacity of the serum sample and calculate the index from TT4/Uptake (33). This approach is said to provide a better correction for overall T4 protein-binding in the specimen.

Historically, the use of a 125I-T3, as opposed to a 125I-T4 tracer, was made for practical reasons, namely that the lower T3-protein binding resulted in a greater percentage of free T3 tracer being available for the scavenger to pick up, and consequently, shorter gamma-counting times. With the advent of non-isotopic technology, labeled T4 has become the preferred signal for THBR tests since it is a better indicator of T4 binding protein effects. Current THBR tests usually produce normal FT4I and FT3I values when TBG abnormalities are mild (i.e. pregnancy). However, these tests may still produce inappropriately abnormal FT4I and FT3I values when patients have grossly abnormal binding proteins (congenital TBG extremes, familial dysalbuminemic hyperthyroxinemia (FDH), thyroid hormone autoantibodies and NTI).
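The two index calculations described above (the traditional product of TT4 and the uptake result, and the TT4/uptake ratio used when the test measures total binding capacity) can be sketched as follows. The function names, the choice to express the uptake as a fraction, and the example values are assumptions made for illustration; commercial methods define and normalize their uptake results differently.

```python
def ft4_index_product(tt4_nmol_per_l: float, tracer_uptake_fraction: float) -> float:
    """Traditional index: TT4 multiplied by the scavenger uptake.

    tracer_uptake_fraction is the uptake expressed as a fraction
    (e.g. 0.30 for 30%). It falls as unsaturated binding sites increase.
    """
    return tt4_nmol_per_l * tracer_uptake_fraction


def ft4_index_ratio(tt4_nmol_per_l: float, binding_capacity_ratio: float) -> float:
    """Alternative index: TT4 divided by a binding-capacity result.

    binding_capacity_ratio is assumed to be expressed relative to a
    normal serum pool (1.0 = normal binding capacity).
    """
    if binding_capacity_ratio <= 0:
        raise ValueError("binding capacity ratio must be positive")
    return tt4_nmol_per_l / binding_capacity_ratio


# Illustrative values only: a high-TBG specimen has a raised TT4 but a
# reduced uptake, so the product index matches that of a normal specimen.
normal_index = ft4_index_product(120.0, 0.30)
high_tbg_index = ft4_index_product(180.0, 0.20)
```

In this constructed example both specimens give the same index, which is the correction the two-test strategy is intended to provide when the TBG abnormality is mild.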

Indexes using Measurements of Free Hormone Fraction

The first free hormone tests developed in the 1960s were indexes calculated from the product of the free hormone fraction of a dialysate, and a TT4 measurement (made by PBI and later RIA). The free fraction index approach was later extended to measure the rate of transfer of isotopically-labeled hormone across a membrane separating two chambers containing the same undiluted specimen. These techniques eliminated the dilution effects that had been shown to influence tracer dialysis values. The free hormone indexes calculated with isotopic free fractions are not completely independent of TBG concentration and furthermore, are influenced by tracer purity and the buffer matrix employed.


(c) Direct FT4 and FT3 Methods

These methods fall into two categories – absolute and comparative methods -- the distinction being based on the method's standardization (35).

Absolute FT4 and FT3 Methods

Absolute methods require a physical isolation of free from protein-bound hormone before employing a sensitive immunoassay to measure the free hormone concentration relative to gravimetrically prepared standard solutions. The physical isolation of free from protein-bound hormone is accomplished with either a semi-permeable membrane using a dialysis chamber, an ultrafiltration technique, or alternatively, by a Sephadex LH-20 resin adsorption column. An exceedingly sensitive T4 RIA method is needed to measure the picomolar concentrations of FT4 in dialysates or free fraction isolates, as compared with total hormone measurements that are made in nanomolar range. Although there are no officially acknowledged "gold standard" free hormone methods, it is generally considered that direct absolute methods are least influenced by binding proteins, and by inference, provide free hormone values that best reflect circulating free hormone status (27). However, these methods are too labor intensive and expensive for use in the routine clinical laboratory setting. When it is critical to determine the thyroid status of a patient in whom routine FT4 estimate tests and serum TSH values are diagnostically discordant, the specimen may be sent to a reference laboratory which offers a direct absolute FT4 test. Direct absolute FT3 methods are only available in some specialized research laboratories (28).

Comparative FT4 and FT3 Immunoassay Methods

Direct comparative immunoassays use a specific hormone antibody to sequester a small amount of the total hormone. The antibody occupancy, which is usually inversely proportional to the free hormone concentration, is quantified using labeled hormone carrying a radioactive, fluorescence-generating or chemiluminescence-generating label. The assay output is then converted to a free hormone concentration using calibrators with free hormone values assigned by a direct absolute method. The actual proportion of total hormone sequestered (up to 5%) varies with the methodologic design, but greatly exceeds the actual free hormone concentration. The active sequestering of hormone by the anti-thyroid hormone antibody reagent in the assay results in a continuous re-equilibration of the bound/free equilibrium. The key to the validity of these methods is twofold. Firstly, it is necessary to use conditions that maintain the free to protein-bound hormone equilibrium, and minimize dilution effects that weaken the influence of any endogenous inhibitors present in the specimen. Secondly, it is important to use serum calibrators containing known free hormone concentrations that behave in the assay in an identical manner to the patient specimens. Three general approaches have been used to develop comparative FT4 and FT3 immunoassay methods: (i) two-step labeled-hormone; (ii) one-step labeled-analog; and (iii) labeled antibody.

(i) Two-Step, Labeled-Hormone/Back-Titration Methods

This approach was first developed in the late 1970s and subsequently adapted to commercial FT4 and FT3 methods. Two-step methods typically employ anti-hormone antibody bound to a solid support (ultrafine Sephadex, tube or particles). The solid-phase, high affinity (>1 x 10^11 L/mol) antibody sequesters a small proportion of total hormone from a diluted serum specimen during the first incubation step. Unbound serum constituents are washed away before an addition of labeled hormone that is taken up by the unoccupied antibody-binding sites during a second incubation step. After washing, the amount of labeled hormone bound to the solid-phase antibody is quantified relative to calibrators that have free hormone values assigned by a direct absolute method. Following the introduction of the less labor-intensive, one-step labeled hormone-analog methods in the 1970s, two-step techniques lost popularity despite comparative studies showing that they were less affected by the binding protein abnormalities that plagued most one-step analog methods (36). Currently, some manufacturers claim that their methods are based on the two-step approach.

(ii) One-Step, Labeled Hormone-Analog Methods

The physicochemical validity of the one-step labeled hormone-analog approach hinged upon the development of a hormone analog with a molecular structure that was totally non-reactive with serum proteins, but could react with unoccupied hormone antibody sites. If these conditions are met, the hormone-analog, which is chemically coupled to a signal molecule such as an isotope or enzyme, can compete with free hormone for a limited number of anti-hormone antibody binding sites in a classic competitive immunoassay format. Though conceptually attractive, this approach is technically difficult to achieve in practice, despite early claims of success. The hormone-analog methods were principally engineered to give normal FT4 values in high TBG states (e.g. pregnancy). They were found to have poor diagnostic accuracy in the presence of abnormal albumin concentrations, FDH, NTI, high FFA concentrations or thyroid hormone autoantibodies. Considerable efforts were made during the 1980s to correct these problems by the addition of proprietary chemicals to block analog binding to albumin or by empirically adjusting calibrator values to correct for protein-dependent biases. However, after a decade of criticism, most hormone-analog methods have been abandoned because these problems could not be resolved (36).

(iii) Labeled Antibody Methods

Labeled antibody methods also measure free hormone as a function of the fractional occupancy of hormone-antibody binding sites. This competitive approach uses specific immunoabsorbants to assess the unoccupied antibody binding sites in the reaction mixture. A related approach has been the use of solid-phase unlabeled hormone/protein complexes that do not react significantly with serum proteins (i.e. unlabeled hormone "analogs") to quantify unoccupied binding sites on anti-hormone antibody in the liquid-phase. The physicochemical theory of these analog-based labeled-antibody methods suggests that they would be susceptible to the same errors as the older labeled-hormone analog methods. However, physicochemical differences arising from the binding of analog to the solid support confer kinetic differences that result in decreased analog affinity for endogenous binding proteins and more reliable free hormone measurements. The labeled antibody approach is becoming the favored technique for automated free hormone testing.

Figure 5. Serum FT4 Measurement by Different Manufacturers' Methods

(d) Clinical Performance of FT4 and FT3 Tests

The only reason to select a free hormone method (FT4 or FT3) in preference to a total hormone test (TT4 or TT3) is to improve diagnostic accuracy for detecting hypo- and hyperthyroidism in patients with thyroid hormone binding abnormalities that compromise the diagnostic utility of total hormone measurements. Unfortunately, the diagnostic accuracy of current free hormone methods cannot be predicted from either their methodologic classification (one-step, two-step, labeled antibody etc) or an in vitro test of technical validity, such as specimen dilution studies. The indirect index tests (FT4I and FT3I) as well as current direct comparative methods, are all protein dependent to some extent, and many are prone to give unreliable values in hospitalized patients with NTI. Unfortunately, manufacturers rarely extend their method validations beyond a study of ambulatory hypo- and hyperthyroid, pregnant patients and a catchall category of "NTI/hospitalized patients". Ideally, the diagnostic accuracy of all FT4 methods should be tested using pedigreed specimens from patients with the clinical conditions known to be associated with binding protein disturbances. In the ambulatory patient setting these groups should include: a) TBG abnormalities (pregnancy, oral contraceptive therapy, and congenital TBG excess and deficiency); b) Familial Dysalbuminemic Hyperthyroxinemia (FDH); c) T4 and T3 autoantibodies and d), interfering substances such as rheumatoid factor and heterophilic antibodies (HAMA). Accurate FT4 tests are needed especially in the hospital setting where NTI and drug therapies (i.e. glucocorticoids and dopamine) weaken the diagnostic value of serum TSH measurement. Three classes of hospitalized patients should be tested: a) patients without thyroid dysfunction but with low or high TT4 due to NTI; b) patients with documented hypothyroidism associated with severe NTI and, c) patients with documented hyperthyroidism associated with NTI. 
Since few, if any, manufacturers have tested their methods with these patient groups, it is difficult for a physician to know the diagnostic reliability of an FT4 test made in such patients. Although the thyroid status of most ambulatory and hospitalized patients is usually appropriately reflected by the serum TSH concentration, in rare cases the assessment of thyroid status may require that a reference laboratory measure FT4 by a direct absolute technique, such as equilibrium dialysis or ultrafiltration.

Figure 6. Serum FT3 Measurement by Different Manufacturers' Methods

(e) Interferences with Thyroid Hormone Tests

Ideally, a thyroid hormone test should display zero interference by any compound in any sera at any concentration. Studies available from manufacturers vary widely in the number of compounds studied and in the concentrations used. Usually the laboratory can only proactively detect interference by performing a "sanity check" of the relationship between the FT4 and TSH result. When both tests are not measured on the specimen, interference is first suspected by the physician from an inconsistency between the reported test value and the clinical status of the patient. Classic laboratory checks of analyte identity, such as dilution, may not always detect interference. Most interferences with TT4 and TT3 measurements, as well as the FT4 and FT3 estimate tests, cause inappropriately abnormal values in the presence of a normal serum TSH concentration. (See section IIIC for a discussion of TSH/FT4 discordances). Interferences in competitive immunoassays as well as non-competitive IMAs fall into four classes: (i) cross-reactivity problems, (ii) endogenous analyte antibodies, (iii) heterophilic antibodies and (iv) drug interactions (37).

(i) Cross-reactivity problems result from the inability of the antibody reagent to discriminate flawlessly between analyte and a structurally related molecule (38). Thyroid hormone assays are less susceptible to this type of interference than TSH, because iodothyronine antibody reagents are selected for specificity by screening with purified preparations. The availability of monoclonal and affinity-purified polyclonal antibodies has reduced the cross-reactivity of current T4 and T3 tests to less than 0.1% for all studied iodinated precursors and metabolites of L-T4.

(ii) Endogenous autoantibodies to both T4 and T3 have been frequently found in sera from patients with autoimmune thyroid and non-thyroid disorders. Despite the high prevalence, interference caused by such autoantibodies is relatively rare. Such interferences are characterized by either falsely low or falsely high values, depending on the type and composition of the assay used (39).

(iii) Heterophilic antibodies fall into two classes (40). The first class comprises relatively weak, multispecific, polyreactive antibodies that are frequently IgM rheumatoid factor; these are often called human anti-mouse antibodies (HAMA). The second class comprises specific human anti-animal immunoglobulins (HAAA) that are produced against well-defined specific antigens following exposure to a therapeutic agent containing animal antigens (e.g. murine antibody) or by coincidental immunization through workplace exposure (e.g. animal handlers). Both HAMA and HAAA affect IMA methodology more than competitive immunoassays by forming a bridge between the capture and signal antibodies, thereby creating a false signal, resulting in an inappropriately high value (41).

(iv) Drug interferences can result from the in-vitro presence of therapeutic or diagnostic agents in the serum specimen in sufficient concentrations to interfere with the thyroid test (19). Alternatively, a therapeutic or diagnostic agent may have an in-vivo effect that disturbs the binding equilibrium of the thyroid hormones to the carrier proteins. In this case, a free hormone assay will detect and record this disturbance, first as an initial elevation of free hormone level due to less available carrier protein, then a drop to a normal FT4 level (with a low TT4) as the body re-establishes the "normal" equilibrium. Of course, if the drug is withdrawn, the FT4 falls because more carrier protein is available and TT4 is low. Again, the FT4 will normalize as the equilibrium is re-established through release of more hormone from the thyroid. Alternatively, in the case of heparin, in-vitro stimulation of lipoprotein lipase can liberate FFA, which inhibits T4 binding to serum proteins and may raise the T4 value.
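The FT4/TSH "sanity check" mentioned at the start of this subsection can be sketched as a simple discordance flag: an abnormal FT4 result accompanied by a normal TSH. The limits below are the approximate reference ranges quoted in this section and are method dependent, so they are placeholders only; the function names are hypothetical. A flag indicates that interference should be considered, not that it is present, since genuine conditions such as central hypothyroidism also produce discordant results.

```python
# Approximate, method-dependent reference limits (placeholders only)
FT4_RANGE_PMOL_L = (9.0, 23.0)
TSH_RANGE_MU_L = (0.3, 4.0)


def within(value: float, limits: tuple[float, float]) -> bool:
    low, high = limits
    return low <= value <= high


def flag_possible_interference(ft4_pmol_l: float, tsh_mu_l: float) -> bool:
    """Return True when FT4 is outside its range although TSH is normal."""
    ft4_abnormal = not within(ft4_pmol_l, FT4_RANGE_PMOL_L)
    tsh_normal = within(tsh_mu_l, TSH_RANGE_MU_L)
    return ft4_abnormal and tsh_normal


# Illustrative results: a raised FT4 with a mid-range TSH is flagged,
# whereas a raised FT4 with a suppressed TSH is a concordant pattern.
discordant = flag_possible_interference(35.0, 1.8)
concordant = flag_possible_interference(35.0, 0.005)
```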

(f) Serum FT4 and FT3 Normal Reference Ranges

Direct absolute methods are used to assign values to the calibrators used for the comparative tests. There is closer agreement between the reference ranges of the various comparative immunosequestration methods used by clinical laboratories than there is between direct absolute methods. As shown in Figure 5, reference ranges for comparative FT4 methods approximate to 9-23 pmol/L (0.7-1.8 ng/dL). In contrast, the upper FT4 limit for direct absolute equilibrium dialysis tests extends above 30 pmol/L (2.5 ng/dL). As shown in Figure 6, reference ranges for comparative FT3 methods approximate to 3.5-7.7 pmol/L (2.3-5.0 pg/mL).


For Manufacturers:

Ambulatory patients:

Hospitalized patients:

For Laboratories:

For Physicians:


C. Thyrotropin/Thyroid Stimulating Hormone (TSH)

For more than twenty-five years, TSH methods have been able to detect the elevated levels characteristic of primary hypothyroidism. What characterizes modern-day methods is their enhanced sensitivity, which now allows detection of hyperthyroid conditions. Most current TSH methods are based on non-isotopic IMA methodology that is automated on a variety of instrument platforms, most of which achieve a functional sensitivity of 0.02 mU/L or less. Such methods can distinguish the profound TSH suppression typical of severe Graves' thyrotoxicosis (TSH < 0.01 mU/L) from minor TSH suppression (0.01-0.1 mU/L) characteristic of mild (subclinical) hyperthyroidism and acute non-thyroidal illness (NTI). The diagnostic strategy for using TSH measurement has changed as a result of these sensitivity improvements. It is now recognized that TSH is more sensitive than FT4 for detecting both hypo- and hyperthyroidism. As a result, some countries now promote a TSH-first strategy for detecting thyroid dysfunction in ambulatory patients, provided that the TSH method has a functional sensitivity of 0.02 mU/L or less. Other countries still favor the TSH + FT4 panel approach, because the TSH-first strategy can miss patients with central hypothyroidism. Additionally, a TSH-centered strategy does not allow the TSH/FT4 relationship to be used as a "sanity check" for interferences or unusual conditions characterized by TSH/FT4 discordances [see below].

(a) Specificity

TSH is a heterogeneous molecule and there are differences between TSH isoforms in blood, the pituitary extracts used for assay standardization (MRC 80/558) and recombinant human TSH (rhTSH) preparations that may be phased in as a standard in the future (42). Current TSH IMA methods use TSH monoclonal antibodies to eliminate the cross-reactivity with other glycoprotein hormones. These methods differ in epitope specificity for the abnormal TSH isoforms secreted by some normal euthyroid subjects, as well as patients with some disease conditions. For example, patients with central hypothyroidism caused by pituitary or hypothalamic dysfunction, secrete TSH isoforms with abnormal glycosylation and reduced biological activity that are measured as paradoxically normal serum TSH concentrations by most methods (43). Likewise, paradoxically normal serum TSH concentrations may be seen when patients have hyperthyroidism due to TSH-secreting pituitary tumors, which appear to secrete TSH isoforms with enhanced biologic activity (44).

Most interferences with TSH IMA measurements arise when the interfering substance forms a bridge between the capture and signal antibodies, thereby creating a false signal that results in a falsely high value being reported. The erroneous value may not necessarily be abnormal, but may be inappropriately normal. Heparin contamination of the specimen can result in an erroneously low serum TSH value, whereas heterophilic antibodies (HAMA) are the most common cause of erroneously high serum TSH values. Such antibodies cross the placenta and have the potential to influence a neonatal screening result. Conventional laboratory verification of analyte identity such as dilution may not always detect an interference problem. Methods vary in their susceptibility to most interfering substances. A practical way to test for interference is to measure TSH in the specimen by a different manufacturer's method, and check for a significant discordance between the TSH values of the different methods. Figure 7 shows the expected variability between methods. When the variability of TSH measurements made in the same serum with different methods exceeds these expectations, interference may be present. In difficult cases, the physician can perform a TRH-stimulation test or a thyroid hormone suppression test (1 mg L-T4 or 200 µg L-T3, po) as a biological check on the validity of the TSH result. Patients with sera containing interfering substances that produce erroneously high values will typically show less than the expected three-fold rise in TSH following TRH, and a blunted suppression response following thyroid hormone administration (<90% suppression of basal TSH by 48 hours) (45).

Recommendations for Physicians:

Discordance between a serum TSH result and the clinical status of the patient may indicate a specificity problem (due to an interfering substance like HAMA or an unusual TSH isoform). The physician can:

  • Ask lab to confirm specimen identity.
  • Remeasure specimen at dilution to check parallelism.
  • Request laboratory to send specimen for analysis by a different manufacturer's method.
  • Perform a TRH stimulation or thyroid hormone suppression test.
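The two biological checks described above reduce to simple ratios. The following is a minimal Python sketch of that arithmetic, for illustration only and not a clinical decision tool. The function and parameter names are hypothetical; the thresholds (a three-fold rise after TRH, 90% suppression of basal TSH by 48 hours) are those quoted in the text.

```python
def trh_fold_rise(basal_tsh, stimulated_tsh):
    """Fold increase in serum TSH after TRH (stimulated / basal)."""
    if basal_tsh <= 0:
        raise ValueError("basal TSH must be positive")
    return stimulated_tsh / basal_tsh


def percent_suppression(basal_tsh, tsh_at_48h):
    """Percent fall in serum TSH from basal, 48 h after thyroid hormone."""
    if basal_tsh <= 0:
        raise ValueError("basal TSH must be positive")
    return 100.0 * (basal_tsh - tsh_at_48h) / basal_tsh


def blunted_responses(basal_tsh, stimulated_tsh=None, tsh_at_48h=None):
    """List which of the two checks fall short of the thresholds in the text.

    Either test may be omitted by leaving its argument as None.
    """
    flags = []
    if stimulated_tsh is not None and trh_fold_rise(basal_tsh, stimulated_tsh) < 3.0:
        flags.append("TRH rise less than three-fold")
    if tsh_at_48h is not None and percent_suppression(basal_tsh, tsh_at_48h) < 90.0:
        flags.append("suppression less than 90% at 48 h")
    return flags
```

An empty list means both responses met the quoted thresholds. A non-empty list is consistent with an erroneous TSH value but does not by itself identify the cause.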

(b) Sensitivity

Historically, the "quality" of a serum TSH assay has been judged from a clinical benchmark - the assay's ability to discriminate euthyroid concentrations (~ 0.3 to 4.0 mU/L) from the profoundly low (<0.01 mU/L) TSH concentrations typical of overt Graves' thyrotoxicosis. Most TSH methods now claim a detection limit of 0.02 mU/L or less ("third generation"). Manufacturers have largely abandoned the "analytical sensitivity" parameter that is calculated from the within-run precision of the zero calibrator. Instead, a "functional sensitivity" parameter has been adopted. Functional sensitivity is calculated from the 20% between-run coefficient of variation (CV) and is used to establish the lower reporting limit for the test (46). Functional sensitivity should be determined by using the recommended protocol described below. An assay detection limit based on functional sensitivity is more clinically relevant than a detection limit based on analytical sensitivity. The use of functional sensitivity is the conservative approach for setting the assay detection limit, since it ensures that any TSH result reported is not merely assay "noise". Further, the 20% between-run CV approximates to the maximum imprecision needed for diagnostic testing, as calculated in Appendix B.

Recommendation for Manufacturers and Laboratories:

Use functional sensitivity (20 % between-run CV, established by the protocol below) to determine the lower detection limit for TSH reporting.

Functional Sensitivity

It is necessary to follow a strict protocol for determining functional sensitivity to ensure that the parameter realistically represents the lower detection limit. This protocol is designed to take into account the variety of factors that can influence TSH methodologic precision in clinical practice such as:


Measure low, medium and high human serum pools in ten different assay runs. The low and high values should be representative of the thyroid disease states encountered in clinical practice (i.e. low TSH between 0.02 and 0.05 mU/L, high TSH > 30 mU/L); the medium value should be close to the normal reference mean (1.5 mU/L).

  • Analyze pools in order: medium then high followed by low, to detect any carryover.
  • Use same test mode as used for patients (i.e. singlicate or duplicate).
  • The instrument operator should not know that pool materials are included in the run.
  • Runs should be spaced over a clinically representative interval for the test in question (6 to 8 weeks for TSH in an outpatient setting).
  • Use at least two different lots of reagents and two different instrument calibrations during the test period.
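Once the pools have been measured across the runs, the calculation itself is short. The sketch below is a minimal Python illustration under stated assumptions: one result per pool per run, and functional sensitivity taken as the lowest pool mean whose between-run CV does not exceed 20%. The function names and the dictionary layout are hypothetical. Choosing the lowest passing pool (rather than interpolating a precision profile) is a simplification, and the estimate can be no finer than the spacing of the pools actually tested.

```python
from statistics import mean, stdev


def between_run_cv(values):
    """Between-run CV (%) from one result per run for a single pool."""
    if len(values) < 2:
        raise ValueError("need results from at least two runs")
    return 100.0 * stdev(values) / mean(values)


def functional_sensitivity(pools, cv_limit=20.0):
    """Lowest pool mean (mU/L) whose between-run CV is within cv_limit.

    `pools` maps a pool label to its list of per-run results.
    Returns None if no pool meets the limit.
    """
    passing = [mean(v) for v in pools.values() if between_run_cv(v) <= cv_limit]
    return min(passing) if passing else None
```

For example, `functional_sensitivity({"low": [...], "medium": [...], "high": [...]})` would be called with the ten per-run results for each pool.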

(c) Reference Range

Serum TSH exhibits a diurnal variation with the peak occurring during the night and the nadir, which approximates to 50% of the peak value, occurring between 1000 and 1600 hours. This biologic variability does not, however, influence the diagnostic use of the test, since most clinical TSH measurements are made on ambulatory patients between 0800 and 1800 hours. Serum TSH reference ranges should be established using specimens from TPOAb-negative ambulatory euthyroid subjects who have no personal or family history of thyroid dysfunction and no visible goiter. As shown in Figure 7, the between-method variability close to the upper reference limit (~4.0 mU/L) approximates to 12%, whereas both between-method variability and within-method imprecision are much higher when measuring subnormal values (CV 65% for values < 0.1 mU/L). Variability in normal reference limits for different methods reflects both inter-method bias and the rigor applied to the selection of normal subjects.

Ambulatory Subjects. Normal serum TSH concentrations are log-normally distributed, so that arithmetic values are skewed with a relatively long "tail" toward the higher values of the distribution. This distribution can be changed to a normal distribution by converting the values to their logarithms, thus permitting calculation of a 95% reference range (typical population mean value ~1.5 mU/L, range 0.3 to 4.0 mU/L) (47). Although adult populations show slight age-related and ethnic-related differences in TSH (NHANES III US survey), it is not considered necessary to adjust the reference range for these factors in clinical practice (48).
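The logarithmic conversion described above can be sketched in a few lines of Python. This is an illustrative example only: the function name is hypothetical, and the use of the mean ± 1.96 standard deviations of the log values to define the central 95% interval is an assumption (the text does not specify how the limits are derived).

```python
import math
from statistics import mean, stdev


def log_reference_range(tsh_values, z=1.96):
    """Central reference interval computed on log-transformed TSH values.

    Returns (lower limit, geometric mean, upper limit) in the original
    units. All input values must be positive.
    """
    logs = [math.log(v) for v in tsh_values]
    m, s = mean(logs), stdev(logs)
    # Back-transform so the limits are reported in mU/L, not log units.
    return math.exp(m - z * s), math.exp(m), math.exp(m + z * s)
```

Because the limits are symmetric on the log scale, the back-transformed interval is asymmetric around its center, with a longer span above it than below.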

Hospitalized Patients. Transient abnormalities are commonly encountered in hospitalized patients with NTI. The adoption of a wider TSH reference range (0.02 – 20 mU/L) for hospitalized patients leads to a slight decrease in sensitivity but improves the specificity of the test. TSH should be used in conjunction with FT4 measurement in this setting.
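The effect of the wider hospital range can be illustrated with a small Python sketch. The limits are the ones quoted in this section (0.3 to 4.0 mU/L for ambulatory subjects, 0.02 to 20 mU/L for hospitalized patients); the dictionary, function name and return labels are hypothetical and the example is not a substitute for a laboratory's own verified ranges.

```python
# Reference limits in mU/L, as quoted in the text for each setting.
REFERENCE_RANGES = {
    "ambulatory": (0.3, 4.0),
    "hospitalized": (0.02, 20.0),
}


def classify_tsh(tsh, setting="ambulatory"):
    """Label a TSH result against the range for the given setting."""
    low, high = REFERENCE_RANGES[setting]
    if tsh < low:
        return "low"
    if tsh > high:
        return "high"
    return "within range"
```

A result of 0.1 mU/L is flagged as low against the ambulatory range but falls within the hospitalized range, which is the gain in specificity (and loss in sensitivity) that the text describes.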

Approximately 2 % of elderly subjects with no apparent thyroid dysfunction have a low or high serum TSH (49). This may be due to non-thyroidal factors influencing the TSH/free T4 set-point. When TPO antibodies are negative and thyroid dysfunction (including goiter) is ruled out, such TSH abnormalities should not indicate the need for thyroid treatment. Serum TSH is typically higher in neonates and pre-pubertal children; appropriate age-related reference ranges should be used for these patients (50-52). The recent follow-up study of the Whickham cohort suggests that the current TSH upper limit may be an overestimate. Specifically, this study found that individuals who had a serum TSH >2.0 mU/L at their primary evaluation had an increased odds ratio of developing hypothyroidism over the next 20 years, especially if TPO antibodies were elevated (53). Although methods cite lower TSH reference limits between 0.2 and 0.4 mU/L, mildly suppressed TSH values (i.e. 0.1 to 0.4 mU/L) are usually not clinically important when patients are asymptomatic and not taking thyroid medications. Thyroid hormone excess usually results in serum TSH concentrations that are consistently below 0.1 mU/L.

Figure 7. Serum TSH Measurement by Different Manufacturers' Methods


For Manufacturers:

For Laboratories:

Functional sensitivity is the most important performance criterion that should influence TSH method selection. Practical factors such as instrumentation, incubation time, cost, and technical support, though important, should be secondary considerations. Laboratories should use calibration intervals that optimize functional sensitivity, even if re-calibration needs to be more frequent than recommended by the manufacturer. Specifically:

Normal Reference Ranges should be established independent of the manufacturer with a cohort of at least 100 TPOAb-negative euthyroid subjects, with no visible evidence of a goiter and no personal or family history of thyroid disease.

For Physicians:

TSH/FT4 Discordances. TSH assays are less prone to interferences than FT4 methods.

Elderly patients with no apparent thyroid dysfunction may have abnormal TSH, possibly due to a shift in TSH / free T4 setpoint. This is not an indication for thyroid treatment, if the patient has no goiter, negative TPO antibodies and is not taking medications likely to affect thyroid test results.


TSH/FT4 discordances where FT4 is likely to be the misleading test

FT4 normal/TSH high (ambulatory patient). Likely cause – mild hypothyroidism. Suggestion: Measure TPOAb as an indicator of underlying autoimmune thyroid disease. If TPOAb is negative, measure TSH by a different manufacturer's method to rule out interference. If the high TSH is confirmed, repeat TSH in 6 to 8 weeks.

FT4 normal/TSH low (ambulatory patient). Likely cause – mild hyperthyroidism.
When serum TSH is persistently below 0.1 mU/L, measure both T4 and T3 to exclude hyperthyroidism. A normal serum FT3 rules out T3-toxicosis, which occurs in ~5% of hyperthyroid patients in iodine-replete areas and may have a greater prevalence in areas of endemic goiter. Note: if the patient is elderly, a low serum TSH may not indicate thyroid dysfunction if all causes of hyperthyroidism have been excluded.

FT4 abnormal/TSH normal (ambulatory patient). Likely cause – abnormal T4-binding protein. Suggestion: Check FT4 by a different manufacturer's method. If the abnormality is confirmed, check TSH by a different manufacturer's method to rule out HAMA interference.

FT4 abnormal /TSH normal (hospitalized patient). Likely cause - NTI effect on T4-binding.
Review medications and hypothalamic-pituitary status. Dopamine can suppress the high TSH of hypothyroidism to normal. If hypothalamic-pituitary function is normal, including medications affecting pituitary TSH, suspect that the FT4 abnormality is secondary to binding-protein changes. Check FT

FT4 normal /TSH abnormal (hospitalized patient). Likely cause - NTI effects on TSH.
Many euthyroid hospitalized patients have a transient TSH abnormality [~7% mildly subnormal TSH (0.02 - 0.3 mU/L) and ~6% mildly elevated TSH (4 - 20 mU/L)]. Mild nonthyroidal TSH abnormalities are transient, if not secondary to dopamine or glucocorticoid treatment. Hyperthyroidism should be excluded when TSH is profoundly suppressed (< 0.02 mU/L). A paradoxically normal FT3 may be a useful indicator of hyperthyroidism in a severely sick patient.

It is important that physicians also be aware of the opposing situation – serum TSH/FT4 discordances where TSH is the misleading test and FT4 is a more reliable indicator of thyroid status.

TSH/FT4 discordances where TSH is likely to be the misleading test

FT4 low/ TSH inappropriately low (i.e. < 10 mU/L). Possible cause - central hypothyroidism.
Look for non-thyroid signs of pituitary insufficiency. Check the TRH-stimulated TSH response- a less than three-fold increase in serum TSH above basal levels is consistent with pituitary dysfunction.

FT4 high/ TSH inappropriately high (i.e. > 0.1 mU/L). Likely cause - HAMA interference. Rare cause -TSH-secreting pituitary tumor.
Measure TSH by another manufacturer's method to check for interferences with the TSH measurement.
To exclude TSH-producing pituitary tumors TSH-alpha subunit studies, imaging and thyroid hormone suppression tests (described above) may be helpful.

FT4 normal / TSH abnormal post thyroid treatment. Likely cause - unstable thyroid status.
Rely on FT4 and FT3 measurements to indicate efficacy of therapy until the TSH/FT4 equilibrium has had time to reset (6-8 weeks following treatment for hypothyroidism; can be > 3 months following treatment for hyperthyroidism).

FT4 high/ TSH inappropriately normal. Possible cause – interference with TSH measurement. Rare cause - thyroid hormone resistance (genetic).
Measure TSH by another manufacturer's method to check for interferences with the TSH measurement (i.e. HAMA).
Investigate thyroid hormone resistance by genetic studies and/or an oral thyroid hormone suppression test (described above). A less than expected 90% suppression of serum TSH below basal by 48 hours, is often seen with thyroid hormone resistance.


D. Thyroid Autoantibodies (TPOAb, TgAb and TRAb)

Autoimmune thyroid diseases (AITD) cause cellular damage and alter thyroid gland function by humoral and cell-mediated mechanisms. Cellular damage occurs when sensitized T-lymphocytes and/or autoantibodies bind to thyroid cell membranes causing cell lysis and inflammatory reactions. Alterations in thyroid function result from the action of stimulating or blocking autoantibodies on cell membrane receptors. Three principal thyroid autoantigens are involved in AITD. These are thyroperoxidase (TPO), thyroglobulin (Tg) and the TSH receptor. Other autoantigens, such as the Na/I symporter, have also been described, but as yet no diagnostic role in thyroid autoimmunity has been established (54). TSH receptor autoantibodies (TRAb) can either mimic TSH action and cause hyperthyroidism, as observed in Graves' disease, or antagonize the action of TSH and cause hypothyroidism. The latter occurs most notably in the neonate as a result of a mother with AITD. TPO antibodies (TPOAb) have been implicated in the tissue destructive processes associated with the hypothyroidism observed in Hashimoto's and atrophic thyroiditis. The appearance of TPOAb usually precedes the development of thyroid dysfunction. Some studies suggest that TPOAb may be cytotoxic to the thyroid. The pathologic role of TgAb remains unclear. TgAb is primarily measured as an adjunct test to serum Tg measurement, because TgAb can interfere with Tg methodology. Laboratory tests to assess the cell-mediated aspect of the autoimmune process are not currently available. However, tests of the humoral response, i.e. the thyroid autoantibodies, can be assessed in the clinical laboratory. Unfortunately, the diagnostic and prognostic use of thyroid autoantibody measurements is hampered by methodological problems that are discussed below. Although autoantibody tests have inherent clinical utility in a number of clinical situations, these tests should be selectively employed.

(a) Clinical Significance

TPOAb and/or TgAb are frequently present in the sera of patients with autoimmune thyroid disease (AITD) (55). However, patients with AITD occasionally have negative thyroid autoantibody test results. TRAb are mostly present in patients with previous or present Graves' disease. During pregnancy, the presence of TRAb is a risk factor for fetal or neonatal dysfunction due to the transplacental passage of maternal TRAb (56, 57). The prevalence of thyroid autoantibodies is increased in patients with non-thyroid autoimmune diseases such as type 1 diabetes and pernicious anemia (58). Aging is also associated with the appearance of thyroid autoantibodies (59). The clinical significance of low levels of thyroid autoantibodies in euthyroid subjects is still unknown (60). However, longitudinal studies suggest that TPOAb may be a risk factor for future thyroid dysfunction, including post-partum thyroiditis (PPT) as well as the development of autoimmune complications of some treatments (61, 62). These include amiodarone therapy for cardiopathy, interferon therapy for chronic hepatitis C and lithium therapy for psychiatric disorders (63, 64). The use of thyroid autoantibody measurements for monitoring patients treated for AITD is generally not recommended (65). This is not surprising since treatment of AITD addresses the consequence (thyroid dysfunction) and not the cause (autoimmunity) of the disease.

(b) Nomenclature

There has been a proliferation of nomenclature used for thyroid autoantibodies, particularly in the case of TSH receptor antibodies (LATS, TSI, TBII, TSH-R and TRAb). The terms used in this monograph, TgAb, TPOAb and TRAb are those recommended internationally. These terms correspond to the molecular entities (immunoglobulins) which react with the specified autoantigen recognized by the laboratory test. Method differences may bias the measurement of these molecular entities, e.g.: methods may detect only IgG or IgG plus IgM; TPOAb or Ab directed to TPO + other membrane autoantigens; TSH inhibiting and/or TSH stimulating TRAb.

(c) Specificity

The use of thyroid autoantibody measurements has been hampered by specificity problems. Studies show that results vary widely depending on the method used. This is due to differences in both the sensitivity and specificity of the methods and the absence of adequate standardization. In the past few years, studies at the molecular level have shown that autoantibodies react with their target autoantigens by binding to "conformational" domains or epitopes. The term conformational refers to the requirement for a specific three-dimensional structure for each of the epitopes recognized by the autoantibodies. Accordingly, assay results critically depend on the molecular structure of the autoantigen used in the test. Small changes in the structure of a given epitope may result in a decrease or a loss in autoantigen recognition by the autoantibodies targeted to this epitope. Recently, dual specificity TGPO antibodies, that recognize both Tg and TPO, have been demonstrated in patients with AITD (66). It has been known for years that autoantibodies are directed against few epitopes as compared to heteroantibodies. Current methods differ widely in epitope recognition. Specificity differences can result from misrecognition of an epitope that leads to a bias regarding the autoantibody population tested. This results in vastly different reference ranges, even when methods are standardized to the same international reference preparation. Whatever the targeted autoantigen, thyroid autoantibodies are clearly not unique molecular entities but, rather, mixtures of immunoglobulins that only have in common their ability to interact with Tg, TPO or the TSH receptor.

Thyroid Autoantibody Methods Differ in Specificity Because of:

  • Differences in epitope recognition sites.
  • Contamination of the antigen reagent with other autoantigens.
  • The assay design and signal used.
  • The use of different secondary standards.

(d) Sensitivity

Differences in the sensitivity of autoantibody tests may arise from the design of the assay (e.g. competitive RIA versus two-site IMA) as well as the physical method used for the signal (e.g. radioisotope counts versus chemiluminescence relative light units). Differences in specificity may stem from contamination of the autoantigen preparation by other autoantigens (e.g. microsomes versus purified TPO). Further, misrecognition of an epitope may lead to an underestimation of the total amount of circulating autoantibody present, resulting in decreased sensitivity. Functional sensitivity should be determined with human serum pools containing a low autoantibody concentration, using the same protocol as described above for TSH [Section IIIC(b)]. However, the between-run precision of autoantibody tests should be assessed across a longer time-period (6 to 12 months), consistent with the longer-term monitoring interval more often used for patients needing these tests.


Functional sensitivity assessments for autoantibody tests should be made with human serum pools containing a low autoantibody level, determined using the protocol described for TSH, but using between-run precision assessments made over a 6 to 12 month time period.

(e) Standardization

Standardization of thyroid autoantibody tests is currently inadequate. International reference preparations, MRC 65/93 for TgAb and MRC 66/387 for TPOAb, are available from the National Institute for Biological Standards and Control in the UK. These preparations were made from a pool of sera from patients with autoimmune thyroid disease, which were prepared and lyophilized 35 years ago! It is well known that lyophilized antibodies are prone to degradation with time. Degradation could have introduced a bias in the binding activity of these reference preparations towards the less stable thyroid antibodies of unknown clinical relevance. Due to the scarcity of these preparations, they are only used as primary standards for calibrating assay methods. Commercial kits contain secondary standards that differ for each method. Currently, assay calibrations vary with the experimental conditions as well as the antigen preparation used by the manufacturer. This may introduce another bias in detecting the heterogeneous antibodies present in patient specimens. In the case of TRAb, no reference preparation is available.


  • An International Reference Preparation should be made available for TRAb assays and new standards should be prepared for TgAb and TPOAb.
  • Secondary standards should be thoroughly characterized to avoid bias in thyroid autoantibody tests.
  • Reference preparations of antigens should be used when available.


Thyroid Peroxidase Antibodies (TPOAb)

Thyroid Peroxidase (TPO) is a 110 kD membrane bound hemo-glycoprotein with a large extracellular domain, and short transmembrane and intracellular domains. TPO is involved in thyroid hormone synthesis at the apical pole of the thyrocyte. Several isoforms related to differential splicing of TPO RNA have been described. TPO molecules may also differ regarding their three-dimensional structure, glycosylation and heme binding. Most of the TPO molecules do not reach the apical membrane and are degraded intracellularly.

(a) TPOAb Methods

TPO autoantibodies were initially known as anti-microsomal autoantibodies (AMA) since they were found to react with crude preparations of thyrocyte membranes. The microsomal antigen was later identified as TPO (67). Older AMA immunofluorescence assays as well as passive tanned red cell hemagglutination tests are still widely used in addition to the newer, more sensitive radioimmunoassays (RIA) and immunometric assays (IMA). The new methods are replacing the older AMA tests because they are quantitative, more sensitive and can be automated. There is, however, wide variability between the new TPOAb immunoassay methods. Some variability stems from differences in the TPO preparations used in the assay kits. When extracted from human thyroid tissues, TPO may be used as a crude membrane preparation or may be purified by various methods. The assay specificity may differ because of contamination by other thyroid autoantigens (notably Tg) and/or a variation in the three-dimensional structure of TPO. The use of recombinant human TPO (rhTPO) eliminates the risk of contamination but does not solve the problem of the differences in TPO structure that depend upon the technique used to express TPO. Most current TPOAb assays are quantitated in international units using the MRC 66/387 reference preparation. Unfortunately, the use of this primary standard does not alleviate inter-method variations, as is evident from the wide variability in the sensitivities and reference ranges of the different methods (range <0.3 to >20 kIU/L).


Sensitive, specific TPOAb immunoassays, ideally using rhTPO as antigen, should replace the older insensitive, semi-quantitative anti-microsomal antibody (AMA) tests.

(b) TPOAb Prevalence & Reference Range

TPOAb prevalence estimates depend on the sensitivity and specificity of the method employed. The recent NHANES III United States survey of ~17,000 subjects without apparent thyroid disease reported that 12 % of subjects without thyroid dysfunction had detectable TPOAb levels measured by a sensitive RIA method (48). Whether very low levels of TPOAb detected in healthy individuals and/or patients with non-thyroid autoimmune diseases reflect normal physiology, the prodrome of AITD, or an assay specificity problem, remains unclear. Normal threshold values for TPOAb assays are highly variable and often appear to be arbitrarily established, so that a large majority of patients with AITD test positive, and most subjects without clinical evidence of AITD test negative. Threshold values also relate to technical factors. Specifically, assays characterized by a low detection limit (<10 kIU/L) typically report undetectable TPOAb levels in rigorously selected normal subjects. Such methods suggest that the presence of TPOAb is a pathologic finding. This view is in accord with the recent 20 year follow-up study of the Whickham cohort which found that a detectable TPOAb level (measured as AMA) was not only a risk factor for hypothyroidism but that the TPOAb abnormality preceded the development of an elevated TSH (Figure 8) (53). In contrast, TPOAb assays reporting higher detection limits (>10 kIU/L) typically cite a TPOAb "normal range". These methods appear to have no enhanced sensitivity for detecting AITD, suggesting that the "normal range" values may represent non-specific assay "noise" and are not pathologically meaningful. Clearly, the criteria used to select subjects for inclusion in the normal cohort are critical. A cohort composed of young biochemically euthyroid (TSH 0.5 to 2.0 mU/L) male subjects with no goiter and no family history of AITD would be least likely to include subjects with a predisposition to AITD.
Whether individuals with low levels of TPOAb and/or TgAb should be considered normal remains in question until long-term follow-up studies on such individuals show that they do not have increased risk for developing thyroid dysfunction.

(c) Clinical Uses of TPOAb Measurements

TPOAb is the most sensitive test for autoimmune thyroid diseases (68). As shown schematically in Figure 8, TPOAb is typically the first abnormality to appear in the course of developing hypothyroidism secondary to Hashimoto's thyroiditis. In fact, when TPOAb is measured by a sensitive immunoassay, >95% of subjects with Hashimoto's thyroiditis have detectable levels of TPOAb. TPOAb is also detected in most (~85%) patients with Graves' disease (58). Patients with TPOAb detected in early pregnancy are at risk for developing post-partum thyroiditis (62). Recent reports have suggested that the IQ of children born to mothers with increased TSH and/or detectable TPOAb during pregnancy may be compromised (69-71). This has prompted recommendations that all pregnant women should have TSH and TPOAb levels measured in the first trimester of their pregnancy. Further, TPOAb measurements may have a role in infertility, since high TPOAb levels are associated with a high risk of miscarriage and in-vitro fertilization failures (72).

Figure 8. Changes in TPOAb in Developing Autoimmune Thyroid Disease

Clinical Uses of TPOAb Measurement

  • Risk factor for Autoimmune Thyroid Diseases (AITD)
  • Risk factor for post-partum thyroiditis
  • Risk factor for miscarriage and in-vitro fertilization failure


Thyroglobulin Autoantibodies (TgAb)

Thyroglobulin (Tg), the prothyroid hormone, is a high molecular weight (660 000 Da) soluble glycoprotein formed of two identical subunits. Tg presents with a high degree of heterogeneity due to variation in post-translational modifications (glycosylation, iodination, sulfation etc.). During the process of thyroid hormone synthesis and release, Tg is polymerized and degraded. Consequently, the immunological structure of Tg is extremely complex. The characteristics of Tg preparations may vary widely depending on the starting human thyroid tissue and the purification process. This is one reason why TgAb assays (as well as Tg assays) are so difficult to standardize.

(a) TgAb Methodology

As with TPOAb methods, the design of TgAb assays has evolved from immunofluorescence of thyroid tissue sections, to passive tanned red cell agglutination, RIA and, more recently, IMA methodology. This technical evolution has improved the sensitivity and specificity of serum TgAb measurement. However, because both older and newer methods are still concurrently used in clinical laboratories, the sensitivity and specificity of current methods vary widely. Assays are calibrated with purified or raw preparations of TgAb by pooling patient sera or blood donor material. These various secondary standards are often, but not always, calibrated against the MRC 65/93 reference preparation (primary standard). Standardization with MRC 65/93 does not ensure that different methods are quantitatively or qualitatively similar. Other reasons for method differences relate to the heterogeneity of TgAb. In patients with AITD, the heterogeneity of TgAb is restricted as compared to heteroantibodies produced in rabbits and mice. In patients with other thyroid diseases such as differentiated thyroid carcinomas (DTC), the heterogeneity of the autoantibodies may be less restricted. This reflects differences in the expression of the autoantibody repertoire that may be normally expressed at very low levels in healthy subjects (73). The inter-method variability of serum TgAb values may also reflect qualitative differences in TgAb affinity and epitope specificity in different serum samples from patients with different underlying thyroid and immunological defects. Another reason for inter-method differences is that assay designs are prone to interference by high levels of circulating Tg, as could be the case in Graves' disease and metastatic DTC (74).


The epitope specificity of TgAb methods used for patients with DTC should be broader than the restricted epitope specificity typically associated with autoimmune thyroid disease.

(b) TgAb Prevalence & Reference Range

As with TPO antibodies, the prevalence and normal cut-off values for thyroglobulin antibodies depend on the sensitivity and specificity of the assay method (75). The NHANES III survey reported a TgAb prevalence of ~10% for the general population, which is approximately half that of patients with DTC (~20 %) (48,75). The clinical significance of low TgAb levels is unclear. Suggestions are that low levels represent "natural" antibody in normals and "scavenger" antibody after thyroid surgery and radioactive iodide therapy in DTC patients; or underlying silent AITD (60). Different TgAb methods report different normal threshold values, as discussed above for TPOAb. Some TgAb methods find that "normals" should have values below the assay detection level; other methods report a "normal range". Since the primary use of TgAb measurements is as an adjunct test for serum Tg measurement, the significance of low TgAb levels relates less to the pathophysiology and more to the potential of low TgAb concentrations interfering with serum Tg measurements.

(c) Clinical Uses of TgAb Measurement

The NHANES III study found that 3 % of subjects with no risk factors for thyroid disease had detectable TgAb without associated TPOAb. Since this cohort had no associated TSH elevation, TgAb measurement does not appear to be a diagnostic test for AITD (75, 76). TgAb measurements are primarily used as an adjunct test to serum Tg measurements in the follow-up of patients with DTC. This is because TgAb can interfere with Tg methodology causing falsely low or high serum Tg values (74). The sensitivity and between-assay precision of TgAb tests used for this purpose are critical. Since even very low levels of TgAb interfere, it is essential that tests be highly sensitive. Further, serial TgAb measurements have been shown to be a useful tumor marker test for TgAb-positive DTC patients (75, 77). Specifically, TgAb-positive patients who are rendered athyreotic by their initial treatment, typically display progressively declining TgAb concentrations during their early post-operative years. In fact, most patients who are TgAb-positive at the time of thyroidectomy become TgAb-negative after several years. In contrast, a rise or appearance of serum TgAb concentrations is often the first indication of recurrence. The use of TgAb measurements for serial monitoring of TgAb-positive patients necessitates that the assay have excellent between-run precision across a 6 to 12 month period (ideally <10% CV) (75).

Clinical Uses of TgAb Measurement

  • TgAb is not useful for diagnosing AITD, because subjects with detectable TgAb and without TPOAb, rarely have evidence of thyroid dysfunction.
  • TgAb measurement is primarily used as an adjunct test to serum Tg measurements because low levels of TgAb can interfere with serum Tg measurements.
  • Serial TgAb measurements have prognostic value for monitoring disease status of TgAb-positive DTC patients.


TSH Receptor Autoantibodies (TRAb)

The TSH receptor is a member of the superfamily of receptors with seven transmembrane domains linked to G proteins. The extracellular domain (397 amino acids) and extracellular loops of the transmembrane domain (206 amino acids) participate in ligand binding activity. Interaction with G proteins implicates the intracellular domain (80 amino acids) and intracellular loops of the transmembrane domains. Activation of G proteins by the hormone receptor complex results in stimulation of cAMP production by adenylate cyclase and of inositol phosphate turnover by phospholipases (78). At variance with the LH/CG receptor and the FSH receptor, the extracellular portion of the TSH receptor is post-translationally cleaved into two subunits. The complex structure of the TSH receptor makes it difficult to localize the epitopes targeted by autoantibodies. Studies carried out by several laboratories suggest that TRAb epitopes, as well as TSH binding domains, are distributed throughout the extracellular parts of the receptor. Attempts to distinguish stimulatory from inhibitory TRAb epitopes have not provided clear results (79). To date, TRAb appear to differ with respect to epitope binding, stimulating or inhibiting activity and mechanism of action. The lack of correlation between TRAb levels and the clinical status of patients suggests that circulating TRAb are heterogeneous and that stimulating and inhibiting TRAb may coexist in some patients (80).

(a) TRAb Methodology

The first observation of a thyroid stimulator differing from TSH in its longer half-life (Long Acting Thyroid Stimulator or LATS) was provided in 1956 using an in vivo bioassay. LATS was later identified as an immunoglobulin. Subsequently, in vitro methods were developed which improved the practicability as well as the sensitivity of measurements. TRAb assays fall into two classes:

Bioassays

These measure the overall biological activity of patients' circulating TRAb at various end-points (81). Bioassays remain beyond the means of routine clinical testing and are only performed in specialized laboratories because they rely on the use of cell cultures and on the measurement of the effect of TRAb on a cell function, most often but not only cAMP generation. Cloning of the TSH receptor has greatly benefited bioassay development, as TSH receptor transfected cell lines are more useful tools than thyroid cells.

Receptor assays

These estimate the inhibition of labeled TSH binding to the TSH receptor by TRAb (82). Only recently have direct binding assays of TRAb been proposed (83). Receptor assays are the only commercially available methods. With these tests, TRAb levels are expressed as the percent inhibition of radiolabeled bovine TSH binding to preparations of TSH receptor, mostly of porcine origin. Recombinant TSH receptor preparations will soon become available (83). Current receptor methods for measuring TRAb vary widely and produce different values.

Clinical laboratories typically use TRAb receptor assays that measure the inhibition of binding of radiolabelled TSH to a TSH receptor preparation. These tests do not distinguish stimulating from blocking TRAb.

(b) TRAb Reference Ranges

No international reference preparation currently exists and the values obtained thus depend on the individual methods and the reference population used to determine the cut-off level for a positive result. This cut-off is generally defined as a difference of two standard deviations from the mean of normal subjects.
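
The cut-off rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the percent-inhibition values are hypothetical, and taking the upper side (mean plus two standard deviations) is an assumption based on a positive receptor assay result being a higher percent inhibition.

```python
from statistics import mean, stdev

def trab_cutoff(normal_values, n_sd=2.0):
    """Return mean + n_sd * sample SD of results from normal subjects."""
    return mean(normal_values) + n_sd * stdev(normal_values)

# Hypothetical percent-inhibition results from normal subjects.
normals = [2.0, 4.0, 6.0, 8.0, 10.0]
cutoff = trab_cutoff(normals)

# A hypothetical patient result is called positive if it exceeds the cut-off.
is_positive = 15.0 > cutoff
```

Because the cut-off depends on the method and on the reference population, a value computed this way applies only to the assay and cohort from which it was derived.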

(c) Clinical Uses of TRAb Measurement

The clinical use of TRAb measurements in the diagnosis and follow-up of AITD remains a matter of controversy. The differential diagnosis of hyperthyroidism can be resolved in most patients without resorting to TRAb testing. Nevertheless, the presence of TRAb may distinguish Graves' disease from factitious thyrotoxicosis and unusual manifestations of hyperthyroidism in patients with subacute or post-partum thyroiditis and toxic goiter. TRAb measurements have also been proposed as a means for predicting the course of Graves' disease. A declining TRAb level is often seen in hyperthyroid patients in clinical remission after treatment with antithyroid drugs (ATD). After ATD withdrawal, very high levels of TRAb correlate rather well with prompt relapse, but this situation concerns only a few patients. Conversely, a significant number of patients with negative TRAb will relapse. A meta-analysis of the relationship between TRAb levels and the risk of relapse shows that 25% of patients are misclassified by TRAb assays (65). This suggests that, after ATD therapy, follow-up of patients is necessary whatever the TRAb level at the time of ATD withdrawal, so that TRAb measurements are not cost-effective for this purpose. In contrast, TRAb measurements are mandatory to predict fetal and/or neonatal thyroid dysfunction in pregnant women with a past or present history of AITD (56). High levels of TRAb in the mother during the third trimester of pregnancy are strongly predictive of thyroid dysfunction in the offspring. It is worth noting that the receptor assays are favored for this purpose since they measure both stimulating and blocking TRAb. Both stimulating and inhibiting activities should be tested since the expression of thyroid dysfunction may be different in the mother and the infant (57).

Clinical Uses of TRAb Measurement:

  • To determine etiology of unusual presentations of hyperthyroidism.
  • To determine risk of neonatal thyroid dysfunction in mothers with past or present history of AITD.

(d) Future Directions

It is important that a well-structured comparative study of the commercially available thyroid autoantibody assays be performed. This would provide irrefutable evidence of differences in the performance of current assay methods. It would also help to convince clinical biochemists to avoid using assays that have poor clinical utility and encourage manufacturers to improve their products or drop them from the market.



For Manufacturers

For Laboratories

Laboratories should be aware that the clinical utility of thyroid autoantibody testing depends on the assay methods. The problem of specificity cannot be addressed directly at the level of a clinical laboratory. Hints of poor specificity may appear at the laboratory-clinical interface and should raise the question of changing the assay method. When selecting methods, it is important to recognize that the sensitivity of the TPOAb assay is less important than that of the TgAb assay. The clinical significance of low levels of TPOAb remains to be established by longitudinal studies. In contrast, very low levels of TgAb may interfere with serum Tg measurement. It is critical to make a realistic determination of the functional sensitivity of the TPOAb and TgAb methods, since serial measurements may be used to track the progression of a thyroid condition or evaluate a response to therapy.


E. Thyroglobulin (Tg)

(a) Introduction

Thyroglobulin (Tg), the precursor protein for thyroid hormone synthesis, can be detected in the serum of all normal individuals. The serum Tg concentration integrates three principal factors: (1) the mass of differentiated thyroid tissue present; (2) any inflammation or injury of thyroid tissue which causes the release of Tg; and (3) the degree of stimulation of the TSH receptor (by TSH, hCG or TRAb). An elevated serum Tg concentration is a non-specific indicator of thyroid dysfunction, analogous to a "sedimentation rate" for the thyroid gland. Serum Tg measurements are primarily used as a tumor marker after a diagnosis of differentiated thyroid cancer (DTC) has been established. A pre-operative serum Tg measurement reflects the tumor's ability to secrete Tg, and validates the use of serum Tg measurement as a post-operative tumor marker. Post-operative serum Tg changes represent changes in tumor mass, provided that TSH is maintained constant by L-T4 therapy. Serum Tg measured during TSH stimulation [endogenous or recombinant human TSH (rhTSH)] is more sensitive for detecting disease than basal Tg measurements (during L-T4 treatment) (Figure 9) (84). The magnitude of the serum Tg increase in response to TSH provides a gauge of the TSH sensitivity of the tumor.

Figure 9. Tg Responses to rhTSH and T3 Withdrawal

(b) Current Status of Tg methods

The measurement of Tg in serum is technically challenging. Currently, immunometric assays (IMA) are gaining popularity over radioimmunoassay (RIA) methods. This is because IMA methodology offers the practical advantages of a shorter incubation time, a wider working range and a more stable labeled antibody reagent that is less prone to labeling damage than RIA (46). Laboratories can choose from a range of both isotopic (immunoradiometric, IRMA) and nonisotopic (primarily immunochemiluminescent, ICMA) IMA methods. However, since IMA methodology appears to be more prone to thyroglobulin antibody interference that can cause underestimation of serum Tg concentrations, there is a trend for laboratories to retain RIA methodology for measuring serum Tg in TgAb-positive patients and limit the use of IMA methodology to TgAb-negative patients. In addition to the problem of TgAb interference, current Tg methods are plagued by differences in standardization, poor sensitivity, sub-optimal between-run precision and high dose "hook" effects (46).

(c) Standardization

Serum Tg concentrations measured by either RIA or IMA methods, vary widely (46, 85). A recent collaborative effort sponsored by the Community Bureau of Reference of the Commission of the European Communities has developed a new international Tg reference preparation, CRM-457 (86). This can be obtained from Dr. Christos Profilis, BCR, Rue de la Loi 200, B 1049 Brussels, Belgium. The universal adoption of this standard was projected to reduce method-to-method differences which would facilitate the comparison of scientific publications as well as improve the clinical utility of serial Tg monitoring of DTC patients who may have serum Tg measurements made by different laboratories. Unfortunately, the use of the CRM-457 standard has not decreased between-method variability as much as expected. Currently, the relative bias between CRM-457 standardized methods ranges between –50 and +225%! These method-to-method differences preclude the use of different Tg methods for monitoring thyroid cancer patients (see Appendix B). The bias between different Tg methods may result from differences between the Tg-free matrix used to dilute standards and patient sera or differences in the epitope specificities of the Tg antibody reagents used by different manufacturers. Ideally, the diluent used for standards should be Tg-free/TgAb-free human serum or alternatively, a non-serum matrix that has been selected to produce a signal (radioactive counts, relative light units etc) that is identical to Tg-free/TgAb-free human serum. It is critical that physicians be informed before the laboratory changes the Tg method to allow re-baselining of critical patients.

The expected normal reference range for CRM-457-standardized Tg methods is ~3 to 40 ng/ml.

(d) Sensitivity

Some Tg methods are too insensitive to detect the lower limit of the normal euthyroid reference range, which approximates to 3.0 ng/ml. Methods that are unable to detect Tg in the sera of all normal subjects have suboptimal sensitivity for monitoring DTC patients for recurrence. The functional sensitivity of a Tg assay is the lowest concentration that can be measured with a between-run precision of 20% CV. It should be assessed by the same protocol described for TSH in Section IIIC(b), with the following stipulations:


Functional sensitivity should be established using the same protocol as described for TSH with three important differences:

  • Use human serum pools that have no TgAb detected by sensitive immunoassay.
  • Functional sensitivity should be assessed from a TgAb-negative serum pool with a Tg value between 1 and 2 ng/ml.
  • The interval used to assess between-run precision should be at least 6 months. This represents the typical monitoring interval used to follow DTC patients (in contrast with the 6 to 8 weeks for TSH in an outpatient setting).
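
As an illustration of the 20% CV criterion, the following Python sketch estimates functional sensitivity from a between-run precision profile. The function name, the use of linear interpolation between adjacent pools and the example data are all assumptions made for this sketch; they are not part of the recommended protocol.

```python
def functional_sensitivity(profile, cv_limit=20.0):
    """Estimate the lowest concentration with between-run CV <= cv_limit.

    profile: list of (mean_concentration, between_run_cv_percent) pairs,
    one pair per TgAb-negative serum pool measured over the full interval.
    Returns None if the CV never falls to the limit.
    """
    pts = sorted(profile)
    if not pts:
        return None
    # The lowest pool already meets the criterion.
    if pts[0][1] <= cv_limit:
        return pts[0][0]
    # Otherwise interpolate linearly where the CV crosses the limit.
    for (c_lo, cv_lo), (c_hi, cv_hi) in zip(pts, pts[1:]):
        if cv_lo > cv_limit >= cv_hi:
            frac = (cv_lo - cv_limit) / (cv_lo - cv_hi)
            return c_lo + frac * (c_hi - c_lo)
    return None

# Hypothetical pools: (mean Tg in ng/ml, between-run %CV over 6 months).
pools = [(0.5, 35.0), (1.0, 25.0), (2.0, 15.0), (5.0, 8.0)]
fs = functional_sensitivity(pools)
```

With these made-up pools the CV crosses 20% midway between the 1.0 and 2.0 ng/ml pools, which is why a pool in the 1 to 2 ng/ml range is informative for this assessment.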

(e) Precision

The within-run and between-run precision, expressed as percent coefficient of variation (% CV) are both important parameters for validating the performance of a Tg assay. Precision should be established at three levels -- low (1-2 ng/ml), medium (~ 10 ng/ml = mid-normal range) and high (90% of upper range limit of the method). Typically, within-run precision has a lower % CV than between-run precision (see Figure 10). This is because measurements made within a single run are not subject to the variability introduced by using different batches of reagents and different instrument calibrations. As shown in Figure 10, the longer the interval between runs, the more technical variability and the worse the between-run precision. Further, precision estimates made in the low-range that are used to determine functional sensitivity may be too optimistic if based on a non-human matrix, instead of a TgAb-free serum pool. It is important to establish the precision of the method in runs made 6 to 12 months apart, since this is a typical clinical interval used for monitoring DTC patients. The between-run precision across this time-span establishes the reliability of the method for long-term monitoring of an individual (see Appendix B). In contrast, within-run precision is the more relevant parameter when assessing the serum Tg response to rhTSH stimulation. In this setting, the basal and rhTSH-stimulated specimens are drawn 3 to 5 days apart and usually measured either in the same run or in runs made within the same week (Figure 9) (84, 87).
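
The percent coefficient of variation used throughout this section is the standard deviation divided by the mean, multiplied by 100. A minimal Python sketch follows; the replicate values are hypothetical and chosen only to show the typical pattern of tighter within-run than between-run precision.

```python
from statistics import mean, stdev

def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical Tg results (ng/ml) for one serum pool.
within_run = [10.0, 10.2, 9.8, 10.1, 9.9]    # replicates in a single run
between_run = [10.0, 11.0, 9.0, 10.5, 9.5]   # one result per run over months

cv_within = percent_cv(within_run)
cv_between = percent_cv(between_run)
```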

(f) High dose "hook" effect

A high dose hook effect affects primarily IMA methods. Falsely low values due to "hooks" are especially problematic for tumor-marker tests like Tg, because it is not unusual to encounter very high values when patients have metastatic disease (88). A hook effect occurs when an excessive amount of antigen overwhelms the binding capacity of the capture antibody. This results in an inappropriately low signal that translates into an inappropriately low or paradoxically normal range result for a patient with an excessively elevated serum Tg concentration (>1000 ng/ml) (46). Manufacturers of IMA methods try to overcome the hook effect problem by one of two approaches:

A "hook" is suspected when the dilution tube has a higher signal than the undiluted specimen. Further dilutions are made until the signal in the dilution tube decreases and the serum Tg concentrations of two dilutions are in agreement.
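
The dilution check described above can be expressed as a short Python sketch. The function names and the 20% agreement tolerance are assumptions for illustration only; a laboratory would set its own criterion for when two dilutions are considered to agree.

```python
def hook_suspected(signal_undiluted, signal_diluted):
    """A hook is suspected when the diluted tube gives the higher signal."""
    return signal_diluted > signal_undiluted

def dilutions_agree(conc_a, conc_b, tolerance=0.20):
    """Compare two dilution-corrected Tg concentrations (ng/ml).

    The fractional tolerance is a hypothetical agreement criterion.
    """
    return abs(conc_a - conc_b) <= tolerance * max(conc_a, conc_b)

# Hypothetical signals (relative light units) for one specimen.
suspect = hook_suspected(signal_undiluted=1000, signal_diluted=5000)

# Hypothetical dilution-corrected results from two further dilutions.
agree = dilutions_agree(1200.0, 1100.0)
```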

Figure 10. Erosion of Tg Assay Between-run Precision Over Time

(g) Thyroglobulin Autoantibody (TgAb) Interference

Thyroglobulin autoantibodies (TgAb) are detected in a higher percentage of DTC patients compared with the general population (~20 versus ~10 %, respectively) (75). Serial serum TgAb measurements may be an independent prognostic indicator of the efficacy of treatment or the recurrence of disease in TgAb-positive patients (75,77). Any TgAb in a serum specimen has the potential to interfere with a serum Tg measurement (89). Because TgAb is heterogeneous, neither the measured TgAb concentration nor an exogenous Tg recovery test can be used to reliably predict whether the TgAb in a specimen will interfere (75). IMA methods appear to be more prone to TgAb interference than RIA methods, as judged by the finding of undetectable Tg values in normal euthyroid subjects with an isolated TgAb abnormality. It appears that the IMA assay design fails to quantify Tg complexed with TgAb and this can result in an underestimation of the total Tg concentration in the specimen. In contrast, RIA methods generally appear to be able to quantify both the free and TgAb-bound Tg in the specimen, and typically produce higher values for TgAb-positive specimens than IMA methods (75,87). The underestimation of serum Tg typical of IMA measurement of TgAb-positive sera is the most serious type of interference encountered, since underestimation of a serum Tg value has the potential to mask metastatic disease. When sera containing TgAb are measured by both an RIA and IMA method, an RIA:IMA discordance (Tg >=2 ng/ml by RIA: Tg undetectable by IMA) is often observed. Such a discordance appears to be a characteristic of TgAb interference. Since the current rhTSH threshold for a positive response is 2 ng/ml, this magnitude of method-related discordance has the potential to impact clinical decision-making (84). Some RIA methods appear to produce clinically valid serum Tg data for TgAb-positive patients (75, 90). 
However, the influence of TgAb on different RIA methods is variable and relates to the assay design, the specificity of the Tg polyclonal antibody reagent and the quality of the 125I-Tg tracer. There is now a trend for laboratories to restrict the use of IMA methods to TgAb-negative patients while retaining older RIA methodology for TgAb-positive patients.

Warning: Immunometric assay (IMA) Tg methods are prone to TgAb interference that can cause an underestimation of serum Tg concentrations. In general, RIA methods are less prone to TgAb interference, but some may overestimate serum Tg concentrations.

(h) Reference Ranges

Normal Euthyroid Subjects

Serum Tg concentrations are log-normally distributed in normal euthyroid individuals. Exclusion criteria for normal subjects to use for establishing a reference range are:

Normal Subject Selection Criteria:

The Tg reference range should be established from the log transformed values of normal, non-smoking, euthyroid subjects (TSH 0.5 to 2.0 mU/L) with no personal or family history of thyroid disease and with no evidence of serum thyroid autoantibodies (TgAb and/or TPOAb).
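
Because serum Tg is log-normally distributed, reference limits are calculated on log-transformed values and then converted back to ng/ml. The Python sketch below shows one way to do this. The choice of a central 95% interval (mean ± 1.96 SD of the logs), the function name and the example values are assumptions for illustration, not a prescribed procedure.

```python
import math
from statistics import mean, stdev

def lognormal_reference_range(values, n_sd=1.96):
    """Return (lower, upper) limits from log-transformed values."""
    logs = [math.log(v) for v in values]
    m, s = mean(logs), stdev(logs)
    return math.exp(m - n_sd * s), math.exp(m + n_sd * s)

# Hypothetical Tg values (ng/ml) from qualifying normal subjects.
tg_values = [4.0, 8.0, 16.0, 32.0]
lower, upper = lognormal_reference_range(tg_values)
```

The limits are symmetric around the geometric mean on a ratio scale, not on an arithmetic scale, which is the practical consequence of working with log-transformed values.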

Serum Tg Values after Thyroid Surgery

It is important for physicians to recognize that the normal Tg reference range cited on laboratory reports does not apply to patients who have had thyroid surgery! The serum Tg values of DTC patients should be interpreted on an individual basis relative to the completeness of the surgery, radioiodine ablation of normal thyroid remnant, the presence of TgAb and the serum TSH concentration. The serum Tg pattern is a more significant indicator of changes in the tumor burden of the patient than any single serum Tg value. Serum Tg values measured on and off L-T4 treatment (low or high TSH, respectively) provide different information. The pattern of serum Tg change during L-T4 treatment is a more direct indicator of tumor mass than serum Tg measured when TSH is high (L-T4 withdrawal or rhTSH administration) prior to RAI scanning. This is because the magnitude of the TSH-stimulated serum Tg rise is influenced by the extent and chronicity of the TSH elevation, which may vary from scan to scan. However, as shown in Figure 9, serum Tg measured at high TSH is more sensitive for detecting disease confined to the neck than serum Tg measured when TSH is low during L-T4 treatment. Specifically, the serum Tg response to TSH stimulation is on average ten-fold higher than the basal serum Tg (on L-T4) (84, 87). The magnitude of the TSH-stimulated serum Tg response provides a gauge of the TSH sensitivity of the tumor. Poorly differentiated metastatic tumors that are RAI-scan negative have blunted (less than three-fold) TSH-stimulated serum Tg responses (91).
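
The fold response mentioned above is simply the ratio of the TSH-stimulated to the basal serum Tg. A minimal Python sketch, with hypothetical function names and example values, and with the three-fold figure from the text used as the threshold for a blunted response:

```python
def tg_fold_response(basal_tg, stimulated_tg):
    """Ratio of TSH-stimulated to basal (on L-T4) serum Tg, both in ng/ml."""
    if basal_tg <= 0:
        raise ValueError("basal Tg must be a positive, detectable value")
    return stimulated_tg / basal_tg

def is_blunted(fold, threshold=3.0):
    """True when the stimulated response is less than threshold-fold."""
    return fold < threshold

# Hypothetical paired results for one patient.
fold = tg_fold_response(basal_tg=1.5, stimulated_tg=15.0)
blunted = is_blunted(fold)
```

This arithmetic is only meaningful when both specimens are measured by the same method, ideally in the same run, and when the basal Tg is above the functional sensitivity of the assay.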


For Manufacturers

The Tg method package insert should cite realistic performance characteristics for the method, that can be reproduced across a range of clinical laboratories.

For Laboratories

Select the Tg method on the basis of its performance characteristics. Check the assay performance independent of the manufacturer's data.

For Physicians

Tg and TgAb methods are highly variable. It is important to use the same methods, preferably run by the same laboratory, for serial monitoring of differentiated thyroid cancer patients. Use split samples to re-baseline the serum Tg level of critical patients before switching to a different Tg method. Select a laboratory that understands the technical limitations of serum Tg measurements and has selected the Tg and TgAb methods on the basis of their performance criteria, not cost or expediency. Optimal performance criteria include:


Physicians can use the following questions to assess the characteristics of the laboratory Tg measurement service:

Question 1: Ask what type of method is used to screen for TgAb interference?
Significance: Even low TgAb concentrations can interfere with serum Tg measurement. TgAb should be measured quantitatively in every specimen using a specific immunoassay, not an insensitive qualitative agglutination test. Immunoassays are reported in IU/ml; agglutination is reported as a titer (1:100, 1:400 etc.). Recovery tests are not a reliable way to detect TgAb.

Question 2: Ask what type of Tg method is used, RIA or IMA?
Significance: TgAb can cause underestimation of serum Tg measurements made by IMA methods. This can lead to falsely undetectable Tg results for patients with metastatic disease. RIA methods provide more clinically relevant serum Tg measurements in the presence of TgAb.

Question 3: For laboratories using Tg IMA methods, ask how "hook" effects are eliminated.
Significance: Some IMA methods suffer from hook effects which result in inappropriately low serum Tg values for patients with metastatic disease and exceedingly high serum Tg concentrations (>1000 ng/ml).

Question 4: Ask about Tg method standardization and check that the normal reference range was independently established.
Significance: Even CRM-457-standardized methods vary. Standardization bias may be evident from the normal reference limits of the method, which should approximate to 3 to 40 ng/ml. The lower limit of the normal reference range should be easily detected above the sensitivity limit, otherwise the method will be too insensitive to detect recurrences in DTC patients.

Question 5: Check whether the detection limit of the method (its functional sensitivity) was determined by the recommended protocol.
Significance: Laboratories tend to adopt an unrealistic sensitivity limit based on the manufacturer's recommendation rather than independently establish their assay detection limit. The functional sensitivity should be determined from the 20% CV between-run precision of a TgAb-free human serum pool, measured in runs covering a 6 to 12 month time period.


F. Urinary Iodine Measurement

(a) Introduction

Since an adequate dietary intake of iodine is necessary for thyroid hormonogenesis and thus for the maintenance of a euthyroid state, the measurement of iodine intake from foodstuffs or medicines has clinical relevance. In the clinical laboratory, iodine measurements are used primarily as epidemiological or research tools. To date, the major application of iodine analysis is as a means of assessing the dietary iodine intake of a population (92, 93). This is an issue of considerable importance since it has been estimated that iodine deficiency disorders (IDD) potentially affect 2.2 billion people worldwide. Even in developed countries such as the USA and Australia, a decline in dietary iodine intake has been demonstrated, while borderline dietary intake has long been a feature of much of Europe (94, 95). As the majority of ingested iodine is excreted in the urine, a measurement of urinary iodine excretion (UI) provides an accurate approximation of dietary iodine intake (95). In most circumstances UI estimation provides little useful information on the long-term iodine status of an individual, as the value obtained merely reflects the recent dietary iodine intake of that person. However, measuring UI in a representative cohort from a population provides a useful index of the iodine status of that population (95). Besides UI estimation, other applications include the measurement of iodine in milk, other foodstuffs and drinking water (96, 97). Iodine measurements in thyroid or breast tissues have research applications (98). The low concentration of inorganic iodide in serum (~1 μg/dl), in the presence of relatively abundant hormonal iodine, has restricted the measurement of plasma inorganic iodide (PII) to research studies in pregnancy (99).

(b) Expression of UI excretion

The UI excretion of a cohort from a population can provide an accurate impression of the dietary status of that population. Iodine intake is best estimated from a 24-hour urine sample, but logistical considerations make it impractical to use such measurements in epidemiological studies. Dilution differences in un-timed urine specimens can be compensated by expressing results as μg of iodine excreted/g of creatinine, which has become the norm (100). However, UI estimation of iodine intake is most important in developing countries, where lower creatinine excretion secondary to varying degrees of undernourishment would render this index less satisfactory (101). It has also been shown that urinary iodine excretion is variable even in healthy subjects. For these reasons, and also to avoid errors introduced in the performance of different creatinine assays, the WHO has recommended that for epidemiological studies UI may be expressed as μg of iodine per volume (μg/dl or μg/l) of urine. Differences in dilution inherent in utilizing casual urine samples can be balanced if sufficient numbers (~50) are assessed in each study population. The mode of expressing UI was revisited in a recent publication, which suggested that age- and sex-adjusted (UI/Cr) ratios approximate more closely to the true (24 hour) iodine excretion (102). Temporal factors, both time of day and time of year, may also influence UI results. Although seasonal variations may not apply in warmer climates, they do affect findings in Northern Europe, where dairy milk provides the major source of dietary iodine. In such populations, the practice of indoor feeding of cattle with mineral-rich supplements results in higher UI excretion during the winter months. More recently it has been suggested that UI has a diurnal variation, with values reaching a median in early morning or 8-12 hours after the last meal, suggesting that samples might be collected at these times (103).
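
The two ways of expressing UI discussed above differ only in whether the iodine concentration is divided by the creatinine concentration of the same specimen. A minimal Python sketch, with a hypothetical function name and example values:

```python
def ui_per_g_creatinine(ui_ug_per_l, creatinine_g_per_l):
    """Express urinary iodine as ug of iodine per g of creatinine."""
    if creatinine_g_per_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return ui_ug_per_l / creatinine_g_per_l

# Hypothetical casual urine specimen.
ui_concentration = 120.0   # ug iodine per litre of urine
creatinine = 1.5           # g creatinine per litre of urine
ratio = ui_per_g_creatinine(ui_concentration, creatinine)
```

A specimen with a low creatinine concentration, as may occur with undernourishment, yields a higher ratio for the same iodine concentration, which is the limitation of the creatinine index noted in the text.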

(c) Dietary Iodine

In many countries adequate dietary iodine intake is achieved through iodization of salt, although the availability of iodized salt is mandatory in some countries and voluntary in others. There is also evidence of a decline in iodine consumption in some industrial countries (95). Diminished iodine intake can result from vegetarian diets, particularly if the fruits and vegetables consumed are grown in iodine-deficient soil (104).

(d) Units

For epidemiological purposes, iodine excretion is normally expressed as μg of iodine excreted. Conversion to equivalent SI units:

1.0 μg/dl = 0.07874 μmol/L

1.0 μmol/L = 12.7 μg/dl
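
The conversion between mass and molar units follows from the atomic mass of iodine. The Python sketch below takes that mass as 126.9 g/mol; the function names are hypothetical.

```python
IODINE_ATOMIC_MASS = 126.9  # g/mol, i.e. ug per umol

def ug_per_dl_to_umol_per_l(ug_per_dl):
    """Convert iodine concentration from ug/dl to umol/L."""
    return ug_per_dl * 10.0 / IODINE_ATOMIC_MASS   # 10 dl per litre

def umol_per_l_to_ug_per_dl(umol_per_l):
    """Convert iodine concentration from umol/L to ug/dl."""
    return umol_per_l * IODINE_ATOMIC_MASS / 10.0

one_ug_per_dl = ug_per_dl_to_umol_per_l(1.0)
one_umol_per_l = umol_per_l_to_ug_per_dl(1.0)
```

The two factors are reciprocals of each other, so converting a value in one direction and back returns the starting value.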

(e) Applications of iodine measurement

The major application of iodine measurement is the measurement of UI in epidemiological surveys. Recommended daily iodine intakes are 90 μg for children, 150 μg for adults and 200 μg for pregnant or lactating mothers. Suggested norms for UI excretion as an index of the severity of iodine deficiency are shown in Table 2 (93).

Table 2. Severity of public health problem*
Iodine Deficiency    None    Mild         Moderate    Severe
Median UI (μg/L)     >100    50-99        20-49       <20
Goiter prevalence    <5%     5.0-19.9%    20-29.9%    >30%
*IDD Newsletter Aug 1999 15: 33-48 (with permission)
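
The median UI bands of Table 2 can be applied to a cohort as in the following Python sketch. The function name and sample values are hypothetical, and a real survey would use far more specimens than shown here. The table does not assign a median of exactly 100 to a band; treating it as "None" is an assumption made here.

```python
from statistics import median

def iodine_deficiency_severity(median_ui_ug_per_l):
    """Classify a population by median urinary iodine (Table 2 bands)."""
    if median_ui_ug_per_l < 20:
        return "Severe"
    if median_ui_ug_per_l < 50:
        return "Moderate"
    if median_ui_ug_per_l < 100:
        return "Mild"
    return "None"  # assumption: a median of exactly 100 is grouped here

# Hypothetical casual urine results (ug iodine per litre) from a cohort.
samples = [35.0, 60.0, 72.0, 88.0, 140.0]
cohort_median = median(samples)
severity = iodine_deficiency_severity(cohort_median)
```

The classification applies to the population median, not to any individual result, since a single casual specimen reflects only recent intake.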


(f) Pregnancy and the Neonate

Thankfully, the occurrence of severe iodine deficiency leading to endemic cretinism is no longer widespread. However, iodine deficiency persists in large areas of the globe. Dietary iodine deficiency may have its most serious consequences when maternal iodine deficiency compromises thyroid status in the fetus and neonate (105). Reports on variation in UI excretion during pregnancy differ: some show a decline or no change, others show an increase (106-108). These differences may reflect variations in dietary iodine supply (109). It has been demonstrated that if dietary iodine intake is inadequate during pregnancy, there can be evidence of thyroidal stress, with increased thyroid volume and serum Tg and a relative decrease in FT4 (106). Administration of iodide to pregnant mothers resulted in increased UI and the reversal of the observed thyroidal changes. The need to avoid any thyroid hypofunction during pregnancy was emphasized by the recent report that children of even mildly hypothyroid mothers had defects in neuropsychological development (70, 71). This finding supported earlier reports that plasma inorganic iodide (PII) declined during pregnancy. Early methods of measuring PII were based on the administration of a tracer dose of 131-I to patients and measuring the specific activity of radioisotope in serum and urine (100). Other methods depended on the ratio of iodide to creatinine in serum and urine (100, 110). A recent study using perchlorate digestion and the formula PII = Total Serum Iodine - Protein Bound Iodide concluded that, at least in iodine-sufficient areas, there was no trend for PII values to be depressed during pregnancy (99).

(g) Excess Iodine Intake

It is well established that excessive iodine intake may, in susceptible individuals, lead to inhibition of thyroid hormonogenesis (the Wolff-Chaikoff effect) and this can be of iatrogenic origin (111). A similar excess of iodine intake by previously iodide-deficient individuals with thyroid autonomy may result in hyperthyroidism (Jod-Basedow) (112). Thus, it can be important to measure UI where excess iodine intake is suspected. Iodine excess can result from the use of iodine-rich medications or antiseptics, most commonly ingestion of the iodine-rich cardiac anti-arrhythmic amiodarone (21, 112). The thyroidal consequences of amiodarone ingestion may depend on the underlying dietary iodine status of the population in which the patient resides. Hypothyroidism is more frequent where dietary iodine intake is high (e.g. in the USA) and hyperthyroidism more frequent where intake is low (e.g. parts of Europe). Excess dietary iodine intake has also been implicated in increased autoimmune thyroiditis, perhaps as a result of increased antigenicity of more highly iodinated thyroglobulin. For assessment of iodine excess in an individual, a 24-hour urine collection is desirable. However, if a large excess is suspected, a casual sample will usually suffice. It should be remembered that the organic iodine present in radiological contrast media may be taken up in body fat, and a slow release of iodine with an associated high UI may persist for several months following ingestion (113).

(h) Iodine Measurement

Methods of measuring iodine content in biological specimens have traditionally relied on converting organic iodinated compounds to inorganic iodide and removing substances (e.g. thiocyanate) that may interfere with colorimetric measurement (114). This has involved a preliminary digestion step followed by colorimetric estimation of iodide by its catalytic action in the Sandell-Kolthoff (SK) reaction. In this reaction, Ce4+ (ceric ions) are reduced to Ce3+ (cerous ions) in the presence of As3+ (arsenious ions), which are oxidised to As5+ (arsenic ions), with a change in color from yellow to colorless. This color change can be measured colorimetrically (at 420 nm) following a 20 minute incubation at 37°C. As this reaction is strictly time dependent, some reports suggested stopping the reaction by adding 1% brucine sulphate or ferrous ammonium sulphate, permitting colorimetric readings to be made at a later time. Further modifications of the SK reaction produced a kinetic assay that increased sensitivity by altering the ratio of Ce/As ions and introducing kinetic measurement (115). The problem of identifying and removing substances such as thiocyanate that interfere with the SK reaction has been mentioned above, and a report comparing 6 methods for iodine analysis attributed much of this interference to inadequate digestion procedures (114). Two major methods of sample digestion, dry ashing and wet ashing, are routinely employed.

Dry Ashing

The dry-ashing technique was first introduced in 1944 and was subsequently modified. It involves preliminary drying of specimens in an oven at 100°C. The dried residue is then incinerated in the presence of strong alkali (KOH/K2CO3) for approximately 3 hours at 600°C. The ash is then reconstituted in distilled H2O and the iodide content measured colorimetrically as described above. This is a somewhat time-consuming and expensive method, requiring thick-walled Pyrex test tubes to withstand the high temperatures and a muffle furnace, ideally equipped with microchip temperature control. However, it yields excellent results in urine samples and is also suitable for measuring the iodine content of foodstuffs and tissue samples that require complete digestion. Strict temperature control is particularly useful in preventing iodine loss should the temperature drift above 600°C or if the time of incineration is extended (116, 117). It is also necessary that the iodine standards be subjected to incineration, as added KOH is known to reduce the sensitivity of the SK assay. These methods were developed for the determination of protein bound iodine (PBI), used as a measurement of thyroid hormone before the advent of specific radioimmunoassays for T4 and T3. As samples are incinerated together in a muffle furnace, the dry-ashing procedure is particularly susceptible to cross-contamination by a high iodine-containing specimen. To overcome this possibility, some reports have suggested prior screening of samples to detect such specimens. The problem of cross-contamination is particularly acute in the dry-ashing procedure but has the potential to affect all iodine quantitation methodologies. It is therefore desirable that the iodine measurement area be kept as far apart as possible from other laboratory activities, particularly any which might involve use of iodine-containing substances.
The aesthetics of handling and volatilising large volumes of urine resulting from epidemiological studies also makes such relative isolation desirable.

Wet Ashing

The most widely used method of digestion is the wet-ashing technique first proposed in 1951. In this method the urine specimens are digested using perchloric acid. This method has been automated using the Technicon AutoAnalyzer. While this automated analyser has found widespread usage, it depends on the use of acid digestion and a dialysis module. The latter has been shown to be prone to significant interference from substances such as thiocyanate (114). Many variations of the wet-ashing method of iodine measurement have been published, principally aimed at simplifying the methodology, reducing the cost and rendering the method more suitable for on-site use in epidemiological studies. Various methods have been described that yield similar results to established methodologies (118). For one such method the authors reported that a single technician could perform 150 tests per day at a cost of less than $0.50 each (118). More recently, simplified methods using either acid digestion or UV irradiation of samples have been proposed (119). The wet-ashing technique has drawbacks in that chloric acid and potassium chlorate are potentially explosive and their use requires provision of an expensive fume hood. For this reason a less hazardous method of digesting the urine samples, using ammonium persulphate as the oxidizing agent, was proposed. A further modification involved incorporating the digestion and reaction process into a microplate (120). More recently an assay has been developed that allows rapid quantitative measurement of UI after charcoal purification, and this is now available in kit form (Urojod, Merck KGaA, Darmstadt, Germany). This method is simple to perform and has potential for field use in epidemiological studies or for occasional use in the assessment of excess iodide ingestion (121).

(i) Sensitivity and Specificity

Assays using the SK reaction yield a sensitivity of between 10 and 40 µg/L, which is more than adequate for UI measurement. Greater sensitivity has been reported using a kinetic assay (0.01 pg/L) (115). Reported sensitivities using the inductively coupled plasma mass spectrometry (ICP-MS) technique are in the region of 2 µg/L (108,122). Provided initial digestion is complete, the SK assay is specific for iodide. However, incomplete digestion can lead to interference by substances such as iodine-containing medications, thiocyanates, ascorbic acid or heavy metals such as Hg or Ag (116). In expert hands the SK reaction yields excellent intra- and inter-assay precision, with values of <5% CV being routinely achieved, provided that digestion is adequately controlled so that the recovery of the iodide standard is in the region of 90 to 100% (115,116,119).
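
The precision and recovery figures quoted above can be checked for an individual run with simple arithmetic. The sketch below is a minimal illustration only: the function names are hypothetical, the replicate values are invented, and the acceptance limits (CV below 5%, recovery between 90 and 100%) are taken from the text as nominal targets rather than as formal acceptance criteria.

```python
import statistics

def percent_cv(values):
    """Coefficient of variation (%): sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

def percent_recovery(measured_spiked, measured_unspiked, amount_added):
    """Recovery (%) of a known amount of iodide added to a specimen."""
    return (measured_spiked - measured_unspiked) / amount_added * 100.0

def run_acceptable(cv_pct, recovery_pct, cv_limit=5.0,
                   recovery_range=(90.0, 100.0)):
    """True if CV is under the limit and recovery lies in the nominal range."""
    low, high = recovery_range
    return cv_pct < cv_limit and low <= recovery_pct <= high

# Invented replicate results (ug/L) for one control specimen.
replicates = [98.0, 100.0, 102.0]
cv = percent_cv(replicates)

# Invented spike experiment: 100 ug/L added to a 50 ug/L specimen.
recovery = percent_recovery(145.0, 50.0, 100.0)

print(f"CV = {cv:.1f}%, recovery = {recovery:.0f}%, "
      f"acceptable = {run_acceptable(cv, recovery)}")
```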

(j) Non Incineration Assays

In addition to alkaline and acid digestion methodologies, other published methods for iodine determination have included the use of bromine in acid conditions as a digesting agent, or ultraviolet irradiation (117). The iodide-selective electrode has been used to measure iodine in various fluids including urine (123). In this case iodide activity is measured, which approximates to iodine concentration. Drawbacks to this method are that the electrode becomes coated and requires frequent polishing, and that it is subject to interference from other ions such as sulphite. It is therefore not ideally suited for measurements in urine but can be used to measure iodide in other fluids and extracts of foodstuffs. Although not suitable for routine UI measurement, the technique can be usefully applied to the assessment of iodide overload in urine from patients treated with amiodarone or other iodine-rich compounds (123). As the electrode responds only to iodide and not to iodinated compounds, it can provide a useful means of specifically measuring iodide in the presence of other iodinated compounds. Other techniques, clearly unsuitable for routine clinical use, include nuclear activation analysis and HPLC. One method that has been widely reported is ICP-MS (119, 124). This method provides good agreement with conventional digestion techniques using SK quantitation (119, 120). However, the necessary equipment is expensive and not readily available. Isotope dilution analysis has been applied to the analysis of both urine and drinking water (97). In vivo measurement of intrathyroidal iodine content has been achieved using X-ray fluorescence, which may have relevance in the assessment of amiodarone-induced hyperthyroidism (112).

(k) Summary

Measurement of iodine in tissues and biological fluids is unlikely to play a part in routine clinical biochemistry in the immediate future. However, in view of the reported extent of IDD worldwide (2.2 billion affected) and the recent reports that dietary iodine intake is declining in the United States and Australia, the assessment of UI as part of epidemiological studies will continue to be of considerable importance. Reference laboratories will no doubt continue to use dry- or wet-ashing techniques, depending on custom and availability of apparatus and space. Recent recommendations that laboratories "have several different methods available to allow the user to select the one best suited to specific needs" would seem a prudent course for centres specialising in iodine measurement.


For Clinical Biochemists


G. Thyroid Fine Needle Aspiration Biopsy and Cytology

(a) Introduction

The prevalence of palpable thyroid nodules in adults increases with age (average 4-7% for the US population), with nodules being more common in women (127, 128). Fortunately, 95% of these nodules are benign. Common methods available for assessing thyroid nodules include fine needle aspiration biopsy (FNAB), thyroid scanning and ultrasound. Studies evaluating solitary or dominant thyroid nodules in euthyroid patients (normal TSH) have compared FNAB with the other modalities and report that the use of FNAB as the initial procedure is both more diagnostically valuable and more cost-effective (129). This is in part because, although isotopically "cold" thyroid nodules are considered suspicious for carcinoma, most benign thyroid nodules (cysts, colloid nodules, benign follicular lesions, hyperplastic nodules and nodules of Hashimoto's thyroiditis) also present as "cold" nodules. In addition, "warm" or isofunctioning nodules that do not completely suppress the surrounding normal thyroid tissue can be malignant. Likewise, thyroid ultrasound does not differentiate between benign and malignant lesions. In addition, ultrasound may detect non-palpable lesions that may be biologically unimportant. In general, ultrasound is useful for evaluating complex cysts and nodules that are difficult to palpate. Ultrasound is also used to determine the size of nodules and monitor nodule growth, as well as to verify the presence of non-palpable nodules that have been incidentally detected by other imaging procedures. Ultrasound-guided FNAB should be reserved for individuals at high risk for differentiated thyroid carcinoma (DTC) in whom aspiration fails to yield adequate cellular material.

Since the risk of DTC in any given solitary nodule is low (~5%), it is more cost-effective to limit the FNAB procedure to those high-risk patients with solitary or dominant nodules of 1 cm or more in diameter. Since FNAB became available in the 1970s, the rate of cancer confirmation for patients undergoing surgery for thyroid nodules has increased from 10-15% to 20-50%. When FNAB is performed by a skilled operator, and the aspirates are read by an experienced cytopathologist, the procedure typically has a false-negative rate of <5% and a false-positive rate of approximately 1% (130).
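
The practical meaning of these error rates depends on how common malignancy is among the nodules aspirated. The sketch below works through the arithmetic under explicitly stated assumptions that are not made in the text: the false-negative rate is read as the fraction of malignant nodules reported as benign (so sensitivity is 95% at the 5% upper bound), the false-positive rate is read as the fraction of benign nodules reported as malignant (so specificity is 99%), the prevalence of malignancy is 5%, and nondiagnostic or indeterminate aspirates are ignored. Other definitions of these rates are in use and would give different figures.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Assumed inputs: 5% prevalence, 95% sensitivity, 99% specificity.
ppv, npv = predictive_values(0.05, 0.95, 0.99)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under these assumptions, roughly five of every six malignant readings would be confirmed, and well over 99% of benign readings would be correct. Raising the prevalence by restricting FNAB to higher-risk patients raises the positive predictive value further.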


Since only ~5% of thyroid nodules are malignant, FNAB should be restricted to patients at higher risk for DTC.

(b) Risk Factors for Thyroid Cancer

A number of factors suggest that a thyroid nodule is suspicious for carcinoma. These are:

Some of these risk factors are included in tumor risk-assessment protocols. The TNM protocol (tumor size, presence of lymph nodes, distant metastases) is a general tumor risk assessment scheme. A number of thyroid-specific staging protocols have been developed (4). These protocols are used to provide objective information necessary for establishing an appropriate treatment plan for the projected outcome. Although TNM is in general use, it can be misleading when applied to thyroid tumors. Specifically, with non-thyroid carcinomas the presence of lymph node metastases is a heavily weighted factor that negatively impacts on mortality. In contrast, differentiated thyroid cancers often arise in young women in whom the presence of lymph node metastases has little effect on mortality, but increases the risk of recurrence.

Recommendation for Physicians:

It is important that the endocrinologist, surgeon and cytopathologist act in concert to integrate the staging information into a long-term treatment plan, in order to assure continuity of care.

(c) Low risk for Thyroid Cancer

(d) Follow-up of patients with deferred FNAB

Follow-up frequency (i.e. 6 to 24 months) should be appropriate for the degree of diagnostic certainty that the nodule is benign. Efficacy of L-T4 therapy to suppress TSH is variable (134). The goal of follow-up is to identify patients with undiagnosed or subsequent malignancy, i.e.:

-- Place tape over the nodule and outline the borders with a pen. Stick the tape into the patient's chart

-- Use a ruler to record the nodule diameter in two dimensions

-- Progressive nodule or goiter enlargement

-- Local compression and invasive symptoms (i.e. dysphagia, dyspnea, cough, pain, hoarseness)

-- Tracheal deviation

-- Regional lymphadenopathy

(e) Guidelines for who should perform FNAB

Ideally, physicians with clinical experience in FNAB of the thyroid and in the management of patients with thyroid nodules should perform FNAB. It is important that these physicians be able to review the slides with the cytopathologist and understand the results in order to recommend appropriate therapy based on the cytologic diagnosis. Ideally, the physician performing the FNAB should also be the physician responsible for the long-term management of the patient in order to assure continuity of care.

Recommendations for Physicians performing FNAB:

Thyroid gland aspirations should be performed by physicians who:

  • Are skilled in the technique.
  • Can review the slides with the cytopathologist and understand the interpretation.
  • Are able to render appropriate therapy depending on the results of the aspiration.

(f) Technical Aspects of FNAB

FNAB is typically performed by a variety of techniques usually employing 22 to 25 gauge needles and 10 or 20 ml syringes that may, or may not be attached to a "pistol-grip" device. Some physicians favor administering topical local anesthetic (1% lidocaine) while others do not. It is recommended that a minimum of three passes be made into various portions of the nodule. Slides are typically fixed in Papanicolaou's fixative and stained. It is imperative that this fixation be immediately applied to preserve nuclear detail. It is also useful to use a rapid stain, such as Diff-Quik and examine the slides at the time of aspiration to assess adequacy of specimen for cytologic evaluation. Other slides may be air-dried for alcohol fixation and subsequent staining (excellent for detecting colloid). Any additional material can be combined with material rinsed from the needle and spun down to form a cell block which can be embedded in agar. It is important to adequately protect the slides for transport to the laboratory. Slides should be submitted to the cytopathologist with the pertinent clinical details together with the size, location and consistency of the nodule. Firm nodules are suspicious for carcinoma whereas fluctuating or soft nodules, as well as cysts, suggest a benign process. When cyst fluid is aspirated the volume, color and presence of blood should be recorded together with a record of any residual mass left after aspiration. Clear fluid suggests a parathyroid cyst, whereas yellow fluid is more typical of a cyst of thyroid follicular origin. After aspiration, local pressure should be applied to the site of the aspiration for 10-15 minutes to minimize the likelihood of swelling. The patient can be discharged with a small bandage over the aspiration site with instructions to apply ice should discomfort occur later.

(g) Cytologic Evaluation

Thyroid cytologic interpretation can be difficult and challenging. The evaluation should assess:

If an experienced cytopathologist is not available locally, slides should be sent to an outside expert for review. In the future, electronic reviewing of cytopathologic specimens will increasingly become available as tele-cytopathology technology develops.

(h) Special Tissue Stains

Special tissue stains can be helpful in the following situations:

Recommendation for Selecting a Cytopathologist:

The cytopathologist should have an interest and experience in reading thyroid cytology. Alternatively, slides should be sent for review to an outside cytopathologist with thyroid expertise.

(i) Diagnostic Categories

Some cytopathologists believe that there must be at least six clusters of follicular cells on each of at least two slides in order to accurately diagnose a thyroid lesion as benign (129). A cytologic diagnosis of malignancy can be made with fewer cells, provided that the characteristic cytologic features of malignancy are present. The classifications described below and their clinical relevance should be easily understood.

Recommendation for Physicians managing DTC patients:

It is critical that physicians responsible for the long-term management of the patient review the slides with the cytopathologist and understand the cytopathology interpretation in order to assess the risk of recurrence and to establish a meaningful long-term management plan for the patient.

Benign (~ 70% of cases)

Physical presentations suggesting a benign condition (but not excluding malignancy):

Cytologic and/or Laboratory Analyses suggesting a benign condition include:

Benign Conditions

This diagnostic category should include, but not be limited to, the following:

Recommendations for Follow-up of Patients with Benign Disease:

Annual follow-up should include physical exam, measurement of nodule size with ultrasound, tape or ruler. It is recommended that enlarging lesions be re-aspirated.

Malignant Conditions (~ 5-10% of cases)

Malignancies are ideally treated by near-total thyroidectomy performed by an experienced surgeon.

Papillary Carcinoma (~ 80% of malignancies)

Includes variants:

Cytologic/Histologic Features suggesting a papillary malignancy include:

Follicular or Hurthle Cell Neoplasms (~20% of malignancies)

Lesions in this diagnostic category display cytologic evidence that may be compatible with malignancy but is not diagnostic. Definitive diagnosis requires histological examination of the nodule to demonstrate the presence of capsular or vascular invasion. Re-aspiration may be considered for diagnoses in this category, but its usefulness in giving a definitive answer is questioned. There are currently no genetic, histological or biochemical tests that can differentiate between benign and malignant lesions in this category. Many surgeons consider that an intra-operative frozen section offers minimal value in differentiating malignant from benign lesions. Sometimes a staged lobectomy is performed followed by a completion thyroidectomy within 4 weeks if capsular or vascular invasion in the histologic specimen indicates malignancy.

Cytologic/Histologic Features suggesting Follicular or Hurthle malignancy include:

Cytology/Histology Reports. These lesions may be reported as:

Medullary Carcinoma (1-5% of thyroid malignancies)

This type of thyroid cancer should be suspected in patients with a family history of medullary cancer or multiple endocrine neoplasia (MEN) Type 2.

Cytologic/Histologic Features suggesting this malignancy include:

Anaplastic Carcinoma (<1% of thyroid malignancies)

This type of thyroid cancer usually only occurs in elderly patients who present with a rapidly growing thyroid mass. It is necessary to differentiate between anaplastic carcinoma and thyroid lymphoma for which treatments are available.

Cytologic/Histologic Features suggesting this malignancy include:

Inadequate/Nondiagnostic (~ 5-15 % of FNAB)

A cytologic diagnosis cannot be reached if there is poor specimen handling and preparation or inadequate cellular material obtained at the time of FNAB. Reasons for insufficient material for diagnosis may be inexperience on the part of the physician performing the procedure, insufficient number of aspirations done during the procedure or the presence of a cystic lesion. Repeat FNAB of the nodule should be undertaken as it can often yield adequate cellular material in order to make a diagnosis. In some cases, ultrasound-guided FNAB may be indicated.

Recommendations for patients with inadequate or non-diagnostic FNAB:

Repeat FNAB of the nodule often yields adequate cellular material for a diagnosis. Rarely, ultrasound-guided FNAB may be indicated for small nodules.


H. Screening for Congenital Hypothyroidism

(a) Introduction

The prevalence of primary congenital hypothyroidism (CH) (approximately 1:3500 births) is greater than that of central CH (approximately 1:100,000) and is even higher in iodine-deficient regions of the world. Over the last 25 years, screening for CH by either TT4-TSH or TSH measurements made in whole blood spotted on filter paper has become established practice in the developed world as part of the screening for a variety of genetic conditions. In order to maximize efficiency, screening programs are frequently centralized or regionalized and operated according to strict guidelines and licensure requirements. Guidelines for screening for CH were published by the American Academy of Pediatrics in 1993 and by the European Society for Pediatric Endocrinology in 1993, and updated in 1999 (133-135). Participating laboratories may be either private or governmental, but must be involved in a quality assurance program and pass performance challenges.

Thyroid dysgenesis resulting in either aplasia, hypoplasia or an ectopic thyroid gland is the most common cause of CH and accounts for approximately 85% of cases (4). Inactivating mutations in the TSH receptor have been reported from a number of screening centres, but the prevalence is still unknown. The phenotype associated with TSH resistance is variable but appears to be of two types, partial or severe. Those with a TSH elevation due to partial TSH resistance are euthyroid, have normal TT4 and may not require L-T4 replacement. Another rare cause of CH (six patients) is a mutation of one of the genes encoding the thyroid transcription factors TTF-1, TTF-2 and PAX-8. These factors play a key role in controlling thyroid gland morphogenesis, differentiation and the normal development of the thyroid gland in the fetus. They bind Tg and TPO promoters to regulate thyroid hormone production.

(b) Criteria needed for CH screening laboratories

Only laboratories with experience of automated immunoassay procedures, information technology and computer back-up, and with appropriately trained staff, should undertake screening for CH. Screening should take place on a daily basis so that results are immediately available and can be acted upon. Treatment is most effective if started by 21 days of age. The minimum number of newborns that should be screened per year is debatable; it rests on the fact that analytical proficiency is best achieved when a reasonable number of positive cases are encountered, and that cost efficiency is usually best realized at higher testing volumes. The screening program should ensure that follow-up testing is done on infants with positive screening results and that access to experienced diagnostic expertise is available. A consultant pediatric endocrinologist should be available for referral for follow-up testing to ensure that the correct diagnosis and treatment are achieved.

(c) Screening methods

All screening programs for CH rely on tests performed on blood eluted from filter paper spots, collected from infants by heel prick. Two different world-wide strategies have evolved.

Primary T4 with reflex TSH measurement

Most North American screening programs use an initial TT4 measurement, with mandated TSH testing of specimens with a low T4 concentration (usually less than the 10th percentile). Historically, this approach was adopted because the turnaround time of the early TT4 assays was shorter, the test kits were more reliable, screening was performed earlier in these programs (usually at 1-2 days of age), and the cost was less than that for TSH testing. Although the measurement of FT4 in serum is readily available commercially, FT4 methods are not usually employed for screening because of sensitivity limitations due to the small sample taken from filter paper blood spots and the dilution resulting from the elution of specimen (136). The TT4-first screening approach has some advantages, particularly in programs where samples are collected early. It is less influenced by the TSH surge that follows the cutting of the umbilical cord and lasts for the first 24 hours when TT4 screening may result in fewer false-positives. Further, TT4-first detects the rare cases of central hypothyroidism that would be missed with a TSH-first approach. The disadvantages of TT4 screening relate to the difficulties in setting the TT4 cut-off value low enough to minimize false-positives, but high enough to detect CH in infants with ectopic thyroid glands who may have TT4 concentrations above the 10th percentile. In addition, a low TT4 and normal TSH is encountered in a number of other conditions: (a) hypothalamic-pituitary hypothyroidism (b) thyroxine binding globulin (TBG) deficiency (c) prematurity (d) illness or (e) a delayed TSH rise. In programs where the follow up of infants with secondary or tertiary hypothyroidism has been carried out, only 8 of 19 cases were detected by TT4 screening, seven were diagnosed clinically before screening and four, although having low TT4 concentrations on screening, were not followed up (137-139). 
TBG deficiency has no clinical consequence, so that the detection and treatment of this condition is contraindicated. TT4 may also be useful in very low birth weight infants (<1500 g), in whom TSH is normal at the usual time of screening and only begins to rise weeks later.
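
The primary TT4 with reflex TSH strategy described above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any program's actual software: the specimen identifiers and TT4 values are invented, the 10th percentile is computed from the batch itself using the "inclusive" method of Python's `statistics.quantiles` (programs may instead use a fixed or population-derived cut-off), and "low" is taken to mean strictly below the cut-off.

```python
import statistics

def tenth_percentile(tt4_values):
    """10th percentile of a batch (inclusive method; other definitions exist)."""
    return statistics.quantiles(tt4_values, n=10, method="inclusive")[0]

def select_for_reflex_tsh(batch, cutoff):
    """Return IDs of specimens whose TT4 is strictly below the cut-off."""
    return [sid for sid, tt4 in batch.items() if tt4 < cutoff]

# Invented blood spot TT4 results (nmol/L) keyed by hypothetical specimen ID.
batch = {
    "S01": 40.0, "S02": 95.0, "S03": 100.0, "S04": 110.0,
    "S05": 120.0, "S06": 130.0, "S07": 140.0, "S08": 150.0,
    "S09": 160.0, "S10": 170.0, "S11": 180.0,
}

cutoff = tenth_percentile(list(batch.values()))
flagged = select_for_reflex_tsh(batch, cutoff)
print(f"Cut-off = {cutoff} nmol/L; reflex TSH on: {flagged}")
```

The choice of cut-off embodies the trade-off noted in the text: a lower value reduces false-positives, while a higher value is needed to detect infants with ectopic glands whose TT4 may lie above the 10th percentile.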

The TT4-first approach has advantages in very low birth weight infants and when CH screening must be made before an early discharge.

Primary TSH Measurement

Europe and much of the rest of the world have adopted TSH measurement as the CH screening assay. Primary TSH screening has advantages over TT4 screening in areas of the world with iodine deficiency since neonates are more susceptible to the effects of iodine deficiency than adults and such infants have an increased frequency of high blood spot TSH concentrations. This makes it possible to monitor the iodine supply in the newborn population, especially since many European countries are still iodine deficient (140). Additionally, there is little cost difference now between TSH testing kits and those for TT4. The TSH concentration used for recall varies between programs. However, in one program a two-tier approach has been adopted (141). Specifically, if the initial blood spot TSH is <10 mU/L, no further follow up is done. If the TSH is between 10 and 20 mU/L, a second blood spot is collected from the infant. TSH is normal in most of these repeat specimens. However, if the TSH is >20 mU/L the infant is recalled to be assessed by a consultant pediatrician and thyroid function tests are performed on a serum sample. This approach ensures that the mildest forms of hypothyroidism with only a modest increase in TSH are followed up, although it produces a higher number of false positives that must be followed through the system. Although most results above 20 mU/L are due to congenital hypothyroidism, it is important to rule out the maternal ingestion of antithyroid drugs or the use of iodine antiseptic solutions at delivery as a cause of transient TSH elevations.
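
The two-tier recall scheme described for one program (141) amounts to a simple decision rule, sketched below. The function name and the wording of the returned actions are hypothetical, and the handling of results exactly equal to 10 or 20 mU/L (here treated as falling in the intermediate band) is an assumption, since the text does not specify it.

```python
def triage_blood_spot_tsh(tsh_mu_per_l):
    """Map an initial blood spot TSH (mU/L) to a follow-up action."""
    if tsh_mu_per_l < 10:
        return "no further follow-up"
    if tsh_mu_per_l <= 20:
        # Intermediate band: most repeat specimens prove to be normal.
        return "collect second blood spot"
    return "recall for pediatric assessment and serum thyroid function tests"

for result in (4.2, 14.0, 35.0):
    print(result, "->", triage_blood_spot_tsh(result))
```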

Primary TSH testing is recommended in preference to primary TT4 + mandated TSH in countries that have iodine deficiency.

(d) Blood Spot Assays for TSH

Screening assays for CH require TSH to be measured in blood spots as small as 3-4 mm in diameter. The new "third generation" TSH IMAs with functional sensitivities down to 0.02 mU/L are well suited to modification for this purpose. However, not all manufacturers have developed blood spot TSH assays, since this is considered a specialty and limited market. Microtitre-plate assays using non-isotopic signals, such as time-resolved fluorescence, are well suited to being adapted for blood spot specimens and are in widespread use. An advantage of these systems is that, as elution of the blood spot is done in the microtitre well, all of the TSH in the sample is available for binding to the monoclonal capture antibody on the wall of the well. Other automated systems that do not use a microtitre well format can also be used successfully for blood spot TSH assays. These usually require an off-line elution of the TSH from the blood spot and sampling of an aliquot of the eluate by the automated immunoassay analyzer. Some of these systems have the advantage that results are available 20 minutes after sampling and are produced at a rate of 180 per hour. Additionally, these systems incorporate positive identification of the sample, making the matching of an increased blood spot result to the correct patient more secure. The automated punch has been designed so that bar-coded labels with a unique number, placed on the elution tubes, are read before punching; the same identification is then printed on the patient's filter paper card. The automated immunoassay analyzer reads the same bar-coded labels on the elution tubes, and results are printed or downloaded to laboratory host computers against the unique identification number and the patient demographics, if these have been previously entered. For laboratories without automation, TSH assays using antibody-coated tubes are suitable, but are not amenable to full automation.

Recommended Criteria for Blood Spot TSH Methods:

  • Functional sensitivity = 0.02 mU/L.
  • Between run CV ideally <10% and not more than 15%.
  • Internal quality control samples included in every run.
  • Participation in National, European or International external quality control programs (see Appendix C).

(e) Sample Collection

The technique for taking blood samples by heel prick onto filter paper cards is of the utmost importance. This requires a continuous training program, well-written protocols and establishing of criteria for adequacy of specimens.

The decision as to when to take a sample will be determined by the requirements of other newborn screening programs and whether the sample is taken in the hospital or at home. Most European screening programs take samples at home between 3 and 6 days after birth. In many other screening programs most specimens are drawn before discharge. Increasing economic pressure is prompting early discharge. Sample collection time affects the TSH-first strategy more than the TT4-first strategy because a TSH surge occurs once the umbilical cord is cut. In the majority of infants this increase in TSH returns to normal within 24 hours, but in some it remains increased for up to 3 days. For pre-term infants, a second sample, collected 2 to 4 weeks after the first, is advisable since in some cases there is a delayed TSH rise, perhaps due to immaturity of the pituitary-thyroid feedback mechanism (142).

(f) Confirmation Testing

An abnormal primary screening result (TT4 or TSH) requires confirmation. Confirmatory blood samples should be drawn by venipuncture. A blood sample should also be collected from the mother at the same time to check her thyroid function, since TSH-blocking antibodies (TRAb) present in a mother carrying a diagnosis of hypothyroidism (even when receiving adequate L-T4 replacement) can cause transient hypothyroidism in the neonate. As follow-up testing, serum FT4, TSH and TPOAb should be measured in both mother and infant. It is important to note that serum FT4 and TT4 are higher in the neonatal period, so that borderline results in infants with mild hypothyroidism should be compared with age-related reference intervals for the particular assay used. Whilst the aim of CH screening is to detect CH and start treatment as early as possible (within 14 days), additional tests to determine the etiology of CH should also be carried out to determine whether the condition is transient, permanent or genetic (needed for genetic counseling) (Table 3). Some of these tests need to be done before L-T4 treatment begins, whilst others can be done during therapy. In cases of transient hypothyroidism due to transplacental passage of TRAb from mother to infant, treatment with L-T4 is indicated, since the presence of blocking antibody in the neonate inhibits the action of TSH, resulting in a lowered FT4 concentration. Once the TRAb have been degraded, over a period of three to six months depending on the amount of antibody present, L-T4 treatment can be gradually discontinued. The mother's antibody status should be monitored in any subsequent pregnancies, as thyroid antibodies can persist for many years (143).

In many cases, at the time of a diagnosis of CH, it is impossible to decide whether the hypothyroidism is permanent or transient, especially if circumstances render it impractical or impossible to carry out some of the procedures recommended above. Even with clues that are associated with transience, such as TSH below 100 mU/L, male sex, pseudohypoparathyroidism, prematurity, iodine exposure, or dopamine administration, the diagnosis may be in doubt (138). In such instances it is best to manage the patient as if she/he had permanent hypothyroidism (144). If the diagnosis has not become apparent by the age of 3 or 4 years, L-T4 therapy should be discontinued and the child monitored with serial determinations of FT4 and TSH.

(g) Tests for CH Etiology

The tests that can be used to establish the diagnosis of CH and its etiology are shown in Table 3. The ordering of such tests is usually the responsibility of the consultant pediatrician and not the screening program. Thyroid scintigraphy is useful to document the presence of any thyroid tissue present and its location. Serum Tg is more sensitive than scintigraphy to detect residual functioning thyroid tissue and may be normal or low in cases where scintigraphy shows no uptake. When uptake shows a normal or enlarged gland, testing should be directed towards detecting an inborn error of T4 synthesis (~ 10% of cases) or a transient cause such as trans-placentally acquired TRAb. A perchlorate discharge test > 15% indicates an inborn error. Further tests may include urinary iodine measurement and tests for the specific gene mutation, such as the sodium/iodide symporter, TPO or thyroglobulin in specialist centers (145). More commonly, defects in the oxidation and organification of iodide to iodine and coupling defects result from mutation in TPO. Mutations in the thyroglobulin gene give rise to abnormal thyroglobulin which may result in defective proteolysis and secretion of T4. Deiodinase gene mutations give rise to deiodinase defects.
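
The perchlorate discharge criterion mentioned above can be expressed as a short calculation. The sketch below assumes the conventional definition of discharge as the percentage fall in thyroidal radioiodine uptake after perchlorate administration; this definition is not given in the text, and the uptake values are invented. Only the 15% threshold comes from this section.

```python
def perchlorate_discharge_percent(uptake_before, uptake_after):
    """Percentage fall in thyroidal radioiodine uptake after perchlorate."""
    return (uptake_before - uptake_after) / uptake_before * 100.0

def suggests_inborn_error(discharge_percent, threshold=15.0):
    """True if discharge exceeds the >15% threshold cited in the text."""
    return discharge_percent > threshold

# Invented example: uptake falls from 30% to 21% of the administered dose.
discharge = perchlorate_discharge_percent(30.0, 21.0)
print(f"Discharge = {discharge:.0f}%, "
      f"inborn error suggested = {suggests_inborn_error(discharge)}")
```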

Table 3. Diagnostic Tests in the Evaluation of Congenital Hypothyroidism (CH)
1. To Establish the Diagnosis:

3. To Establish Etiology:

-- Determine size and position of thyroid:
   -- Radiology: scintigraphy -- either 99mTc or 123I

-- Functional studies:
   -- 123I uptake
   -- Serum thyroglobulin (Tg)

-- If inborn error of T4 production is suspected:
   -- 123I uptake and perchlorate discharge test

-- If iodine exposure or deficiency is suspected:
   -- Urinary iodine determination

-- If autoimmune disease present:
   -- TSH receptor antibody (TRAb) (and also in infant, if present in mother)

(h) Missed Cases

No biochemical test is 100% diagnostically and technically accurate. One study in which screening checks were made after two weeks of age revealed that 7% of cases of CH were missed using the TT4-first approach, and 3% using the TSH-first approach. Recommendations are needed to address the clinical, financial and legal ramifications of false-negative screening tests, and to establish whether mandated retesting at 2 weeks, as is practiced in some programs, is desirable.
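To show what these miss rates mean in practice, the following minimal sketch converts the 7% and 3% figures quoted above into a screening sensitivity and an expected number of missed cases. The cohort of 100 true CH cases is a hypothetical figure chosen only to make the arithmetic easy to follow, and the function names are illustrative.

```python
# Miss rates (percent of true CH cases not detected) quoted in the text.
MISS_PERCENT = {"TT4-first": 7, "TSH-first": 3}


def sensitivity_percent(miss_percent):
    """Sensitivity is the complement of the miss (false-negative) rate."""
    return 100 - miss_percent


def expected_missed(true_cases, miss_percent):
    """Expected number of missed cases among a given number of true cases."""
    return true_cases * miss_percent / 100


if __name__ == "__main__":
    hypothetical_true_cases = 100  # illustrative cohort, not study data
    for approach, miss in MISS_PERCENT.items():
        print(
            approach,
            "sensitivity:", sensitivity_percent(miss), "%;",
            "expected missed:", expected_missed(hypothetical_true_cases, miss),
        )
```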

(i) Quality Assurance

All screening programs should have a continuous system for audit and should publish an annual report of the outcome of the audit. By this means, each aspect of the screening procedure can be appraised against nationally agreed quality standards. Although laboratories generally comply with quality standards in that they participate in quality assurance schemes, the pre-analytical and post-analytical phases of screening typically receive less attention. Quality assurance schemes should address the following:




(j) Annual Report

This should include items identified under audit and be a comprehensive report of screening for CH over the previous twelve months. It should monitor the distribution of increased blood spot TSH concentrations, and a system should be set up to report all cases of true CH and to record cases of transient hyperthyrotrophinaemia. This system could also capture reports from pediatric colleagues on any missed cases. Close collaboration between the screening laboratory, pediatricians, midwives and all others concerned in the screening process needs to be established to maintain an efficient screening program.


For Physicians

Potential pitfalls in screening for congenital hypothyroidism are ubiquitous and no laboratory is immune. Despite all safeguards and automated systems, infants with congenital hypothyroidism will occasionally be missed by screening programs. Therefore, it is incumbent on physicians to maintain a high degree of vigilance and not be lulled into a false sense of security by a laboratory report bearing normal thyroid function values. Never hesitate to request repeat tests when the clinical picture conflicts with the laboratory results.

