Privacy and Health Research

From:
William W. Lowrance, Ph.D.

3. Data from Research, Research on Data

Contemporary health research is generating a multitude of benefits for humankind, and the future benefits look at least as promising. The following sketches can hardly do justice to the myriad complex activities. But they indicate some of the research purposes and approaches, the character of the data, and the privacy-protection problems involved.

As the above title indicates, health research generates new data by observation and experiment, but also—in part because its questions are of such an "applied," practical nature—it often proceeds by analyzing data that were originally collected for another purpose. The two approaches can have different implications for privacy.

The purposes of research are many, and they overlap. Research is conducted:

RESEARCH TO ADVANCE BASIC BIOMEDICAL SCIENCE

Basic research develops the fundamental science that underpins all applied research. It uses every experimental approach possible, every kind of instrumental observation, every epidemiological and other analytic technique. It uses social-scientific methods where these can illuminate basics. It studies simplified "model" systems, in search of insights and techniques that will help study (messier) natural systems. It develops methods.

Much of the task of basic research is to study baseline functioning and health: metabolic mechanisms, hormonal controls, immune responses, and the phenomena of conception, inheritance, development, cognition, memory, and aging. It studies the materials of the body, flows of energy, and how the body interacts with various environments.

Basic research also studies abnormal functioning, and disease states and processes. And it studies bacteria, viruses, fungi, worms, mites, radiation, noise, toxins, dietary factors, stress factors, dusts, allergens—all the agents and risk factors that can affect health.

Much basic biomedical research does not need to use personally identifiable data, but some of course does.

RESEARCH TO KNOW PATTERNS OF HEALTH, DISEASE, AND DISABILITY

All over the world, health and disease are monitored. Starting with prenatal observations and birth data, throughout life health-related measurements and observations accumulate. Analyses are made to portray the "natural history" of diseases and disabilities—how they start, progress in a person or spread to others, and run their courses. Also analyzed are risk factors, and the effects of preventions and interventions. Now genetic patterns in populations are being analyzed much more actively, and in far greater detail, than ever before.

Public-health surveillance, recording the occurrence of events in populations, is one of the longest-established of public-health functions. A representative definition is this one by Stephen Thacker: (39)

Public health surveillance is the ongoing systematic collection, analysis, and interpretation of outcome-specific data for use in the planning, implementation, and evaluation of public health practice. A surveillance system includes the functional capacity for data collection and analysis as well as the timely dissemination of these data to persons who can undertake effective prevention and control activities.

The tasks of surveillance include assembling vital statistics (on births and deaths, and sometimes other events); profiling health status within populations; analyzing patterns of illness and disability, and health risks and risk factors; and studying how people interact with healthcare systems.

Practitioners of surveillance sometimes protest that what they do is not "research." Dr. Thacker insists: "The boundary of surveillance practice excludes actual research and implementation of delivery programs. Because of this separation, epidemiologic cannot accurately be used to modify surveillance."

This is controversial. Perhaps part of the problem is that much surveillance is of necessity based on less-than-fully-standardized, non-validated reports from local physicians and laboratories. Too, in emergencies, such as when surveillance is quickly mounted to trace a contagious disease outbreak, the observations may lack scientific elegance. And generally, surveillance does not itself test a hypothesis (about cause, for instance), but rather passively collects data (although the data generated by surveillance may be used to test a hypothesis). Surveillance may indicate that something is happening, but not necessarily why, or what the factors are. But it does perform highly structured searches for data that, among several purposes, become input for research. This deserves continued discussion. The reason it may matter for privacy is that it can have implications for how the activity is treated under human-subjects protection regulations. (40)

Notifiable disease reporting is a standard public-health activity everywhere. Under the National Notifiable Diseases Reporting System in the U.S., local and State health departments routinely forward case reports, including data on age, gender, and race, on around 50 diseases (measles, mumps, tuberculosis, hepatitis A and B, syphilis...) to the Centers for Disease Control and Prevention (CDC). The CDC then quickly publishes analyses, which help public-health experts and authorities discern patterns of occurrence, and intervene. (41) The CDC receives the case reports with the identifiers removed. Many other such surveillance programs are in operation all over the world. The World Health Organization publishes summaries.

A spirit of openness and reassurance can encourage a community to cooperate. Robert Hahn has proposed this "Ethical checklist for public health surveillance": (42)

  1. Justify the surveillance system in terms of maximizing potential public health benefits and minimizing public and individual harm.
  2. Justify use of identifiers and the maintenance of records with identifiers.
  3. Have surveillance protocols and analytic research reviewed by colleagues, and share data and findings with colleagues and the public health community at large.
  4. Elicit informed consent from potential surveillance subjects.
  5. Assure the protection of the confidentiality of subjects.
  6. Inform health-care providers of conditions germane to their patients.
  7. Inform the public, the public health community, and clinicians of findings of surveillance.

Health statistics programs collect a very large variety and volume of facts, to provide the descriptive backdrop against which society can decide how to optimize interventions, use resources most effectively, and cope with change.

The U.S. National Center for Health Statistics (NCHS), a component of the Centers for Disease Control and Prevention, analyzes data from existing records, and it gathers data itself via interviews and examinations. The NCHS's National Health Interview Survey periodically collects data on a very wide range of health-status measures and illnesses, and on hospital use, dental care, hearing impairment, nursing home experience, and many other matters; the most recent survey interviewed 120,000 people. To provide data on infant-death risks, NCHS maintains linked files of live births and infant deaths. It also gathers data such as birthweight, which is a reliable index to both maternal and infant health, on a sampled national basis. To help epidemiologists identify subjects for in-depth causal analyses, NCHS assembles selected mortality data from the States into the National Death Index.
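The linking of birth and infant-death files can be illustrated with a minimal sketch. It is not a description of the NCHS's actual procedure: the record layout, the shared `birth_id` link key, the sample values, and the 2,500-gram cutoff used to group birthweights are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records; a shared certificate number is assumed as the link key.
births = [
    {"birth_id": "B1", "birthweight_g": 3400},
    {"birth_id": "B2", "birthweight_g": 1400},
    {"birth_id": "B3", "birthweight_g": 2900},
    {"birth_id": "B4", "birthweight_g": 1200},
]
infant_deaths = [
    {"birth_id": "B2", "age_at_death_days": 12},
]


def link_births_to_deaths(births, deaths):
    """Attach the matching death record (or None) to each birth record."""
    deaths_by_id = {d["birth_id"]: d for d in deaths}
    return [{**b, "death": deaths_by_id.get(b["birth_id"])} for b in births]


def mortality_by_weight_group(linked, threshold_g=2500):
    """Infant deaths per 1,000 live births, split at an assumed weight cutoff."""
    counts = defaultdict(lambda: [0, 0])  # group -> [births, deaths]
    for rec in linked:
        group = "low" if rec["birthweight_g"] < threshold_g else "normal"
        counts[group][0] += 1
        counts[group][1] += rec["death"] is not None
    return {g: 1000 * d / n for g, (n, d) in counts.items()}


linked = link_births_to_deaths(births, infant_deaths)
rates = mortality_by_weight_group(linked)
```

Once the two files are joined, the link key itself is no longer needed for the rate calculation, which is one reason linked analytic files can often be stripped of identifiers before wider use.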

In the next round (IV) of its famous National Health and Nutrition Examination Survey (NHANES), the NCHS will examine some 30,000 carefully sampled people to determine health trends. Special coverage will be given to such subgroups as Blacks, Mexican-Americans, low-income persons, preschool children, and the elderly. Like earlier rounds, the Survey will be based on extensive confidential interviews, physical examinations, and laboratory tests. It will amass about 8,000 pieces of data on each subject. These NHANES surveys are data quarries, from which insights derived from data on relatively few people help improve health for countless others in the larger society, including people outside the U.S.

Many of the data collected by NCHS are personally identifiable data. The Center's statute stipulates that personally identifiable data must be carefully protected, and that they may not be used for any purpose other than that for which they were collected unless the data-subject gives new informed consent to the new use. (43) It shares identifiable data with researchers in other U.S. government agencies only if the data-subjects have been informed of and consented to such sharing, and then only under highly restrictive interagency agreements. NCHS never releases identifiable data to anyone else. It does release data for public use, but only after all personal identifiers, and all information that might allow deductive identification of the subjects, have been removed.
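What "deductive identification" can mean in practice may be sketched as follows. This is only an illustration of one common kind of check, not the NCHS's actual disclosure-review method: the choice of quasi-identifying fields, the sample records, and the threshold `k` are hypothetical.

```python
from collections import Counter

# Hypothetical quasi-identifiers; a real release would choose these case by case.
QUASI_IDENTIFIERS = ("age_group", "sex", "county")


def risky_combinations(records, k=5):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Records with names already removed; one person is still unique in the file.
records = (
    [{"age_group": "30-39", "sex": "F", "county": "A"}] * 6
    + [{"age_group": "80+", "sex": "M", "county": "B"}]
)
flagged = risky_combinations(records, k=5)
```

A combination that is flagged here could be coarsened (for example, by widening the age group or dropping the county) or suppressed before the file is released for public use.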

Registries. Public-health agencies and other organizations maintain many registries in addition to those for notifiable diseases. Registries usually collect data on individuals' or populations' experience over time, perhaps linking data from several sources (occupational hazard exposure + disease incidence...), and may be cumulated so that the progression of events can be studied. Registries may cover locally important diseases (Lyme disease...), for instance, or occupational illnesses (carpal tunnel syndrome...), or consequences of disasters (Chernobyl...).

A crucial function of registries and other databases can be the identifying and monitoring of health problems of minority and underserved groups, and the effectiveness of interventions. (44) Special precautions may need to be taken in order to protect the identities and rights of minority data-subjects. (45)

RESEARCH TO REDUCE PUBLIC-HEALTH THREATS

Whether or not they are to be considered "research," a classic category of investigations has to do with coping with disease outbreaks and epidemics, and with other emerging or emergency threats.

In this regard it is hard not to think again of the renowned work of the U.S. Centers for Disease Control and Prevention (CDC). The CDC regards itself mainly as a "public-health practice" agency, as differentiated from a "research" agency. The CDC is depended upon, both by the U.S. and by other countries, for quick response to disease outbreaks, whether classical "food poisoning" or rabies, or known but rare diseases (bubonic plague, yellow fever...), or new or exotic ones (Ebola...). It also charts the waves of shiftily-changing influenza viruses that drift around the globe seasonally, and many other threats. It performs much work outside the U.S., and of course it cooperates with host governments and exchanges data.

A recent report on emerging infectious diseases warns of staggering problems ahead: (46)

Despite historical predictions to the contrary, we remain vulnerable to a wide array of new and resurgent infectious diseases. ... Our vulnerability to emerging infections was dramatically demonstrated in 1993. A once obscure intestinal parasite, Cryptosporidium, caused the largest waterborne disease outbreak ever recognized in this country; an emerging bacterial pathogen, Escherichia coli O157:H7, caused a multi-state foodborne outbreak of severe bloody diarrhea and kidney failure; and a previously unknown hantavirus, producing an often lethal lung infection, was linked to exposure to infected rodents. ...

Methicillin-resistant Staphylococcus aureus, a common cause of hospital infections, may be developing resistance to vancomycin; penicillin resistance is spreading in Streptococcus pneumoniae; cholera will likely be introduced into the Caribbean islands from the current pandemic in Latin America, and the new strain, Vibrio cholerae O139, is spreading throughout southern Asia.

To combat these as well as the many more classically known infectious diseases, a great many personally identifiable data will have to be studied, by the CDC and others. And the research will have to be truly international, as much of HIV–AIDS research is. (47)

Survey studies. Some important research on public-health threats involves social-scientific methods. Attitudes are surveyed, to inform public-health promotion and disease prevention campaigns. For example, the U.S. National Institute of Child Health and Human Development has conducted large, highly confidential interview surveys of adolescents' sexual attitudes and practices; so has the Centers for Disease Control and Prevention. The consent process is conducted carefully, and the promised confidentiality is guarded closely.

Efforts can be made to respect privacy during data-gathering itself. In a large adolescent health ("Add Health") study under the U.S. National Institute of Child Health and Human Development and other agencies, privacy in interviews involving potentially sensitive questions was afforded by having the adolescents self-administer survey questions via dedicated computer terminals. (48)

RESEARCH TO UNDERSTAND UTILIZATION OF HEALTH CARE

Many approaches are taken for studying how people avail themselves of health care, why they do or don't take various actions, and what factors relate to the behavior. In efforts to enhance women's health, for instance, records are accumulated on Pap smears, breast exams, mammographic screening, obstetric examinations during pregnancy, and countless other interactions with healthcare systems. These data can then be linked with other health data to ask evaluative questions about how well preventive actions or other interventions "work." (What was the women's subsequent health history? How predictive did the screening turn out to be? Could the techniques or frequency of the screening have been improved? What best-practice does it imply for other women?) (49)

Survey interviews are conducted to try to understand the attitudes that shape behavior (for example, to learn why people don't take prescribed medications faithfully). Health services research is performed on the influence of facility location, costs and cost-sharing, and countless other factors that influence use.

RESEARCH TO EVALUATE AND IMPROVE PRACTICES

Much research is performed to evaluate public-health programs, clinical practices, and the effects of innovations. (50)

Outcomes research is performed to analyze "what works best," so to speak, and for what subset of persons and situations, and under what conditions, and perhaps at what costs. (51) Some of this research focuses on individuals, and some on populations. It has to do with whether various preventive or other healthcare services are available, and with how effectively they are used. It examines large samples of real cases, analyzes them statistically, and structures the data in analytic frameworks. Thus outcomes research informs the planning of public-health services delivery (such as smoking-cessation programs for pregnant women), for instance, and the optimal path through branching clinical judgments (what to do for patients with gallstones...).

Clinical practice guidelines have been prepared by many governmental, professional, patient, and managed-care organizations, based on outcomes studies and other evaluative research.

Leading roles have been played by the U.S. Agency for Health Care Policy and Research, which has prepared guidelines on problems ranging from sickle cell disease, to middle ear infections in children, to prostate enlargement. The Agency continues to work to advance the methods of practice evaluation and improvement. (52)

Drug utilization review, the analysis and critiquing of the use of pharmaceuticals in particular clinical settings (such as the use of antibiotics before and after surgery), provides facts that, along with cost-effectiveness and other considerations, help optimize use of drugs. Such review, along with other factors, supports the development of formularies, lists of pharmaceuticals that are approved for dispensing in the institution, and recommendations for use. Such review may determine whether the drugs qualify for cost reimbursement. Analogous studies are performed on surgical techniques, anesthetics, and use of diagnostic and other services.

Follow-up tracking follows the consequences of interventions. The U.S. Food and Drug Administration requires tracking of people in whom medical devices (such as artificial joints, cardiac pacemakers, heart valves, breast implants, and testicular prostheses) have been implanted. The patients are urged to keep themselves registered. The tracking enhances the patients' care, in that it allows communication with them to advise of relevant new knowledge or warn them to see their doctor or take other protective steps. Of course, it also provides data for research on patterns of use, outcomes, costs, patient attitudes about the implants, and other factors, which helps improve the devices and their use.

Quality-of-life studies and patient-attitude surveys use interviews, focus-group discussions, and other social-scientific methods to enquire into the valuations people make of various health states and treatments.

RESEARCH TO MAKE EFFECTIVE INNOVATIONS

A prime example of innovation is the elaborate work of developing and improving the use of pharmaceuticals, medical devices, diagnostic instruments and tests, vaccines, and other "tools" of health care. (53) After much preliminary screening, an experimental entity or procedure is subjected to a long series of clinical trials, perhaps on tens of thousands of volunteers in many countries, to evaluate its efficacy, risks, and other attributes. Refinements are made, and many evaluations are conducted. Eventually the sponsoring company or agency submits the data to government regulatory authorities (in the U.S., the Food and Drug Administration, or FDA) to be reviewed for licensing.

In this process huge quantities of personally identifiable data are amassed. The raw data are collected in clinical settings. Then the physician–investigators transfer the data to the sponsor, usually after first assigning a key-code pseudonym to the data, which allows tracing-back to the subject via the physician. The sponsor analyzes and prepares the key-coded data for regulatory submission, and transfers the data, still (or re-) key-coded, to the regulatory agency for review. All of this is conducted under international guidelines of good clinical practice, and under the various national human-subjects regulations and regulatory statutes. In the U.S. the confidentiality of these data is covered by regulatory controls and protected by the Federal Privacy Act.
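The key-coding step can be pictured with a short sketch. It is offered only as an illustration under stated assumptions, not as the procedure any particular sponsor or site uses: the class `PhysicianKeyFile`, the function `key_code_record`, the field names, and the use of random hexadecimal codes are all hypothetical.

```python
import secrets

# Hypothetical field names; the text does not specify a record layout.
IDENTIFYING_FIELDS = {"subject_id", "name", "date_of_birth"}


class PhysicianKeyFile:
    """Lookup table kept at the clinical site, never sent to the sponsor."""

    def __init__(self):
        self._code_by_subject = {}
        self._subject_by_code = {}

    def key_code_for(self, subject_id):
        """Return the subject's key-code, creating a random one on first use."""
        if subject_id not in self._code_by_subject:
            code = secrets.token_hex(4)
            while code in self._subject_by_code:  # regenerate on a collision
                code = secrets.token_hex(4)
            self._code_by_subject[subject_id] = code
            self._subject_by_code[code] = subject_id
        return self._code_by_subject[subject_id]

    def trace_back(self, code):
        """Recover the subject from a key-code; possible only with this file."""
        return self._subject_by_code[code]


def key_code_record(record, key_file):
    """Drop identifying fields and attach the key-code in their place."""
    coded = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    coded["key_code"] = key_file.key_code_for(record["subject_id"])
    return coded


key_file = PhysicianKeyFile()
raw = {"subject_id": "S-001", "name": "Jane Doe",
       "date_of_birth": "1950-02-03", "systolic_bp": 128}
coded = key_code_record(raw, key_file)  # what the sponsor receives
```

Because the code is random rather than derived from the subject's details, the sponsor and the regulator cannot reverse it on their own; tracing back requires going through whoever holds the key file, here the physician.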

The FDA audits selected trials, usually ones that are considered pivotal to the regulatory decisions. In 1995, for instance, specially credentialed inspectors from its Center for Drug Evaluation and Research conducted over 400 inspections, in each audit reviewing all of the subjects' records including the consent forms and IRB records. If they must take photocopies of personal data away from the site, they first remove the identifiers. They conduct inspections of sites in other countries, through arrangements made by the product sponsors.

After an innovation becomes licensed for general use, research continues. In pharmacovigilance, the company and regulators watch for previously unknown effects of drugs, and respond quickly to spontaneous adverse-event reports communicated by doctors, patients, or others. These data usually are identified by the patient's initials and physician's name. To allow tracing-back to the patient, identifiability (at least key-coded, at least back to the physician) must be preserved, in case it becomes scientifically necessary to review the full medical record and circumstances. The FDA "MedWatch" program, which collects the adverse-event reports for drugs and medical devices, received around 170,000 reports last year. MedWatch shares the identity of the reporting physicians with the manufacturers unless a physician requests it not to. After extensive evaluation these data inform decisions about keeping the product on the market, or revising uses, formulations, dosing, route of administration, labels, or packaging.

Similarly, a "Vaccine Adverse Event Reporting System" (VAERS), administered jointly by the FDA and the Centers for Disease Control and Prevention, collects and analyzes reports on vaccines, again keeping the patients' identity confidential. A prime focus is vaccination of children—to provide feedback that helps improve the vaccines as preventive tools, and to guide public-health missions in ensuring high rates of vaccination. The data are made available for research, after all identifying information is removed. (54), (55)

Postmarketing surveillance may be carried out for a variety of purposes, to learn more about the innovation's medical effects, both beneficial and harmful, as it is tested by wider natural experience. Pharmacoepidemiology, "the study of the use of and the effects of drugs in large numbers of people," is a prime tool in postmarketing research on medicines. (56) Similar techniques apply to surgery and other interventions.

RESEARCH TO ANALYZE ECONOMIC FACTORS

As every newspaper reader is aware, every aspect of health care now is being subjected to economic analysis—to size up the costs of illness and costs in specific episodes of care, evaluate cost-effectiveness of different interventions, see what effects various cost-related incentives have, and understand the component costs in healthcare systems. Virtually every healthcare institution and payor is performing such analyses.

Much of this economic research can be performed on anonymized/aggregated or key-coded data, but detailed analyses of individual patient experiences and the costs incurred may require the examination of personally identifiable data. Naturally much of such research draws at least partly on data in the large databases of healthcare payors. (57), (58)

RESEARCH TO APPRAISE MARKETS

Research on healthcare markets has to be noted here because often such market research now is being performed by, or for, units of organizations that have access to personal data collected for clinical research or disease management.

Obvious examples are the pharmaceutical enterprises, which analyze patterns and future projections of disease, the prescribing patterns of physicians and healthcare organizations, and the economic market for their current products and those under development. Much of this is no different from the market research performed by all businesses. Some is conducted by service companies that gather data, such as dispensing data from pharmacies, and convey them, in nonidentified form, to the drug companies. But what is special is that parts of the large firms may potentially have access, through the information amassed in their main innovative R&D research, to personally identifiable health data.

Several business developments of the past few years have raised this issue. Large research-based pharmaceutical firms have merged with large, highly computerized pharmacy supply companies to form pharmacy-benefit management businesses. Pharmaceutical companies have formed disease-management businesses (that is, supplying services, under contract, directly to patients such as diabetics). And pharmaceutical firms have acquired or formed networked physician informatics businesses. All of these activities are bringing these large companies much closer to patient care, and to large volumes of patient data.

May the commercial divisions of these companies access the personal healthcare data (such as the diabetics' data) for market research, or even carry out direct marketing to patients? Inversely, may the traditional R&D units (those that develop drugs, devices, and diagnostics) access the personal data collected by the affiliated pharmacy-benefit or disease-management businesses, to profile users of products or perform outcome studies or other analyses?

The temptations are obvious. A recent report from a panel of the National Research Council observed: "In many of these cases, specific agreements have been established to limit data sharing among affiliated companies, but the complex overlaps make security more difficult to ensure." (59) Are effective barriers in place among these activities to protect the data-subjects? The companies should be urged to attend carefully to these matters.

(Also, commercial units of companies conduct product-acceptance research on health-related products, to gauge potential customers' opinions of product design, convenience, cost, packaging, and information. But this is little different from product research in other industries, except if the survey participants are identified as having particular diseases or disabilities. Informed consent and promises of nondisclosure can be incorporated.)



Comments/suggestions about the HHS Data Council web pages should be directed to the Data Council Web Master.

Last updated 7/23/97.