Orphan drugs for the treatment of rare diseases, defined in the United States as diseases that affect fewer than 200,000 Americans, represent a significant revenue opportunity for the pharmaceutical and biotechnology industries. High price tags and incentives such as tax credits and extended exclusivity have encouraged investment in orphan drugs over the past several decades. Nevertheless, unique challenges, particularly with respect to the design and conduct of clinical trials, make investment in orphan drugs a somewhat risky proposition.
In light of these uncertainties, accurate market sizing becomes even more important. Typical market sizing methods, including patient-based models (which begin with the number of potential patients) and prescription-based models (which begin with the number of patients currently receiving therapy), are more limited for rare diseases. Of approximately 7,000 known rare diseases, fewer than 5% have effective therapies. If no therapies are currently available, a patient-based analysis is often the only reasonable option.
Estimating Rare Disease Occurrence
One of the first steps in a patient-based analysis is estimating the prevalence or incidence of the disease. Prevalence is defined as the number of existing cases in a specific population at a set point in time. Incidence is defined as the number of new cases that occur in a specific population during a set period of time. For both these measures of disease occurrence, some mechanism must be used for counting the number of cases within a population of known size. Example data sources that can be used to accomplish this include insurance claims databases, population-based disease registries and large clinics with well-defined catchment areas.
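The two measures defined above can be illustrated with a minimal Python sketch. The function names, the per-100,000 scaling and the argument layout are illustrative assumptions rather than part of any published method.

```python
def prevalence(existing_cases: int, population: int, per: int = 100_000) -> float:
    """Existing cases in a population at one point in time, per `per` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases * per / population


def incidence(new_cases: int, population: int, years: float = 1.0,
              per: int = 100_000) -> float:
    """New cases arising in a population during a period, per `per` persons per year."""
    if population <= 0 or years <= 0:
        raise ValueError("population and years must be positive")
    return new_cases * per / (population * years)
```

In both functions the numerator is a case count and the denominator is a population of known size, which is why a reliable counting mechanism is needed for either measure.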
Conducting a worthwhile study designed to estimate disease occurrence can be particularly difficult for rare diseases, since factors such as lack of a unique ICD code, delays in accurate diagnosis and overall rarity of the condition can often rule out certain options, such as analyses of insurance claims databases. Population-based registries, widespread screening, or other methods of identifying all cases in a defined population can be very useful for estimating the occurrence of rare diseases, but conducting such studies strictly for the purposes of informing drug development is often not feasible given the costs associated with these systems and the limited resources available for forecasting. As a result, existing studies in the scientific literature are often a key source of information on the occurrence of rare diseases for market sizing.
View Published Data With Caution
Although existing studies in the published literature may be the only source of information for estimating the occurrence of a rare disease, cited estimates should not be taken at face value. For most rare diseases, robust information on disease occurrence is lacking, and developing the most appropriate estimate based on the available literature necessitates methodological understanding and analytical effort. Failure to consider the methodology of the available estimates may result in the inappropriate application of literature-based estimates to the market-sizing model.
The terms “incidence” and “prevalence,” for example, are frequently confused in the medical literature, or are applied loosely to calculations that actually represent a variety of different statistics, such as population prevalence, point prevalence, period prevalence and birth prevalence. Sometimes a statistic is cited so frequently within a specific rare disease literature that it becomes convention, yet when the source is traced, the statistic is found to rest on expert conjecture rather than on any specific data.
Adjust When Data Are Sparse
Often, studies are outdated or are not available for the geographic area of interest. In this case, decisions regarding the most appropriate surrogate data become important. Factors such as variable disease expression and age of onset, gene-environment interaction and the increasing availability of genetic counseling and pre-implantation genetic testing are likely to have an impact on the reported and/or true disease occurrence rates over time. Rates may vary geographically due to founder effects, differences in exposure to disease risk factors and differences in racial/ethnic composition. When a high-quality, recent estimate for the country of interest does not exist, the most applicable existing estimate(s) should be carefully selected and adjusted, if necessary, based on available information.
The methods for calculating disease occurrence must be compatible with the target population for the market-sizing model as well. For example, it is important to determine whether a study is calculating birth prevalence or population prevalence, a distinction that is particularly important for diseases with reduced life expectancy. The case ascertainment methods of each study should be carefully assessed, paying special attention to the case definition used, including whether certain disease subtypes were included or excluded, the means of identifying cases (e.g., genetic test, clinical impression, patient self-report), the age range of patients captured by the method and other important disease-specific factors.
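One way to see why the birth versus population distinction matters is a rough steady-state approximation, sketched below. It rests on strong simplifying assumptions that are not taken from the text above: a stationary population, a constant birth prevalence, and a condition that is countable from birth (which does not hold for later-onset diseases). The function name and parameters are hypothetical.

```python
def population_prevalence_from_birth(birth_prevalence: float,
                                     life_expectancy_affected: float,
                                     life_expectancy_general: float) -> float:
    """Approximate population prevalence from birth prevalence.

    Assumption: in a stationary population, the share of affected people
    alive at any time scales with how long they survive relative to the
    general population.
    """
    if life_expectancy_affected <= 0 or life_expectancy_general <= 0:
        raise ValueError("life expectancies must be positive")
    return birth_prevalence * life_expectancy_affected / life_expectancy_general
```

Under these assumptions, a disease that halves life expectancy has a population prevalence of roughly half its birth prevalence, so applying a birth prevalence figure directly to the whole population would overstate the number of living patients.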
Other issues to consider include likely sources of under-ascertainment. Even high-quality studies that make exhaustive attempts at case finding will often miss some cases. For example, significant biases are associated with studies of genetic disorders in which cases are mostly identified through families, and with studies in which only certain aspects of the phenotypic spectrum are likely to result in detection and diagnosis.
Other potential sources of under-ascertainment include lack of access to healthcare and incomplete participation in surveys or studies. It is not standard practice for study authors to correct for these sources of under-ascertainment when reporting prevalence estimates. Particularly for rare disorders, this can have a large impact on estimated prevalence rates. Using a reported estimate without adjustment implicitly assumes that the number of missed cases is zero. Thus, where possible, the extent of under-ascertainment should be estimated and applied to the reported estimate as a correction factor.
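The correction described above can be sketched in a few lines of Python. The sketch assumes the extent of under-ascertainment is expressed as a completeness fraction (the share of true cases a study actually captured); that parameterization and the function name are illustrative assumptions.

```python
def correct_for_under_ascertainment(reported_prevalence: float,
                                    completeness: float) -> float:
    """Scale a reported prevalence up to account for missed cases.

    `completeness` is the assumed fraction of true cases that the study
    captured; 1.0 means no cases were missed.
    """
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in the interval (0, 1]")
    return reported_prevalence / completeness
```

Leaving a reported estimate unadjusted is equivalent to calling this function with a completeness of 1.0.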
CASE STUDY: Huntington’s Disease in the U.S.
Huntington’s disease (HD) is an inherited, progressive neurodegenerative disorder that poses a considerable burden on patients and caregivers. While HD has been documented as far back as the Middle Ages, no clear consensus has been reached in the epidemiologic and medical community about the prevalence of HD in the U.S. Both medical and patient-based organizations frequently cite an estimated 30,000 cases of HD in the U.S., equivalent to a prevalence rate of about 1 in 10,000 persons. However, diligent tracing of the potential sources of this estimate found it to be unsubstantiated. For example, many papers cited the National Institute of Neurological Disorders and Stroke (NINDS) as the source, but the origins of the data underlying this rate are unclear.
Therefore, an in-depth analysis of HD prevalence estimates in the scientific literature was undertaken in order to provide a more reliable, data-based estimate appropriate for use in a pharmaceutical forecasting model. A literature search focused on the prevalence of HD in the U.S. revealed mostly outdated and conflicting estimates, as well as questions about the completeness of case ascertainment.
Meeting the Challenge of Market Sizing
Complete ascertainment of all HD cases in a population presents a number of challenges. Not all affected individuals require hospitalization, so the use of hospital records or insurance claims data is not sufficient for enumerating all cases. Genetic testing records are also insufficient, since it is estimated that fewer than 10% of the at-risk population pursues testing, and there is no mechanism for mandatory or uniform reporting of newly diagnosed cases in the U.S.
Fortunately, it was possible to correct for under-ascertainment bias in published reports by using a correction factor derived from capture-recapture methods. This correction factor was applied to the selected studies that were deemed the most methodologically valid and appropriate for the model, and a meta-analysis was performed to estimate an overall prevalence rate as a weighted average of the corrected prevalence rates of the individual studies. Parameter estimates were weighted using standard techniques as well as by a factor based on the quality and relevance of each individual study.
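A minimal sketch of how these two steps might look in Python follows. The specific estimator and weighting scheme used in the analysis are not stated here, so the sketch substitutes common choices as assumptions: Chapman's two-source capture-recapture estimator for the correction factor, and inverse-variance weights multiplied by a quality score for the pooled estimate. All function names and the input layout are hypothetical.

```python
from typing import Iterable, Tuple


def chapman_estimate(n1: int, n2: int, both: int) -> float:
    """Chapman's two-source capture-recapture estimate of the true case count.

    n1, n2: cases found by each source; both: cases found by both sources.
    """
    if both < 0 or both > min(n1, n2):
        raise ValueError("overlap must be between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (both + 1) - 1


def completeness(n1: int, n2: int, both: int) -> float:
    """Fraction of the estimated true cases found by at least one source."""
    observed = n1 + n2 - both
    if observed <= 0:
        raise ValueError("at least one case must be observed")
    return observed / chapman_estimate(n1, n2, both)


def pooled_prevalence(studies: Iterable[Tuple[float, float, float]]) -> float:
    """Weighted mean of (corrected prevalence, standard error, quality) triples.

    Assumed weighting: quality score divided by the squared standard error.
    """
    numerator = 0.0
    denominator = 0.0
    for prev, std_err, quality in studies:
        if std_err <= 0 or quality < 0:
            raise ValueError("std_err must be positive and quality non-negative")
        weight = quality / std_err ** 2
        numerator += weight * prev
        denominator += weight
    if denominator == 0:
        raise ValueError("total weight is zero")
    return numerator / denominator
```

The completeness fraction can serve as the divisor when correcting each study's reported prevalence, and the corrected values are then passed to `pooled_prevalence`. Multiplying the inverse-variance weight by a quality score is only one way to let study quality and relevance influence the pooled result.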
The resulting prevalence estimate was applied to current and projected future population data to provide the necessary model inputs, and yielded a case count considerably lower than the 30,000 cases cited in the literature. Historical trends in prevalence and mortality rates, including the influence of diagnostic changes, genetic shifts and changes in reproductive behaviors of affected individuals and gene carriers, were considered. Data were obtained on patient counts in existing HD registries, and various additional sensitivity analyses were performed. All sources supported the validity of this lower estimate for the current HD prevalence in the U.S.
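Applying a prevalence rate to population projections is simple arithmetic, sketched below in Python. The function name and the per-100,000 convention are assumptions, and the sketch holds prevalence constant across years, whereas the analysis described above also considered historical trends.

```python
from typing import Dict


def project_cases(prevalence_per_100k: float,
                  population_by_year: Dict[int, int]) -> Dict[int, int]:
    """Convert a prevalence rate into expected case counts for each year.

    Assumes the same prevalence applies in every year supplied.
    """
    if prevalence_per_100k < 0:
        raise ValueError("prevalence must be non-negative")
    return {
        year: round(prevalence_per_100k * population / 100_000)
        for year, population in population_by_year.items()
    }
```

For a sensitivity analysis, the same function can be called with the lower and upper bounds of the prevalence estimate to bracket the patient counts fed into the forecasting model.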
Reporting of this analysis prompted NINDS to change its cited estimate for the prevalence of HD in the U.S. Careful consideration of methodological and contextual factors relevant to the estimation of HD prevalence resulted in a model that better reflected current and future trends in the burden of disease.
As this case study illustrates, it is important to conduct a rigorous methodological analysis before accepting published disease rates for rare disorders. Simply selecting the most frequently cited estimate, or averaging multiple published estimates, can lead to significant errors. Eliminating unsubstantiated and redundant estimates, selecting only the highest quality data-based estimates, addressing sources of under-ascertainment, and employing appropriate statistical techniques can yield estimates that are more credible for market-sizing purposes.