What is an Inference Space?
Inference space can be defined in many ways, but it can generally be described as the limits on how broadly a particular result applies (Lorenzen and Anderson 1993, Wills et al. in prep.). Inference space is analogous to the sampling universe or the population: all of these terms refer to the largest entity to be described. The term inference space, however, better captures the notion that many different factors help define the extent to which generalizations can be made from data (Lorenzen and Anderson 1993).
An example can help illustrate the factors that go into defining the inference space for a particular rangeland monitoring effort. Consider the following scenario: a BLM field office is interested in maintaining healthy sage grouse populations, and biologists there have concluded that lekking areas are limiting populations and are at risk from disturbance. All known past and current lekking areas have been identified, and a random sample of these areas is drawn for monitoring. In this example, the inference space is defined by the following factors:
- The boundary of the field office – This limits the inference space because no leks outside the field office boundary were considered for sampling. Statistical inferences can be made only about sage grouse lekking areas within the field office.
- The choice to focus on lekking habitat – The results will be applicable only to lekking habitat, and won’t say anything about the condition or trend of other types of sage grouse habitat.
- The choice to limit monitoring to past and present leks – This choice will likely give the best ability to detect changes in lek conditions, but it limits the inference space because nothing can be said about areas in the field office that are not past or present leks.
One important concept for defining the inference space of a study is that within the inference space, every sampling unit (i.e., every location) has a non-zero probability of being sampled. In other words, every location has some chance of being selected for sampling. These selection probabilities can be either equal or unequal, but the inference space is defined by the sampling units that can be selected for sampling.
The concept of inference space is also closely tied to variability and sample size estimation (Wills et al. in prep.). As the inference space increases, the variability within it generally increases too. Thus, you will usually need more samples to detect the same degree of change in a large inference space than in a smaller one.
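The link between variability and sample size can be sketched with a textbook normal-approximation formula for detecting a difference in means between two independent samples (for example, two sampling periods). This is an illustration under stated assumptions, not a method taken from the sources cited here: the function name, the default significance level and power, and the example standard deviations are all hypothetical, and designs with permanent (paired) plots would call for a different formula.

```python
import math
from statistics import NormalDist


def samples_per_period(sd, min_change, alpha=0.05, power=0.8):
    """Approximate number of samples needed in each sampling period to
    detect a difference in means of `min_change` (two-sided test),
    assuming independent samples with a common standard deviation `sd`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / min_change) ** 2)


# Hypothetical values: a small, uniform inference space versus a large,
# heterogeneous one, both aiming to detect a 5-point change in percent cover.
n_small = samples_per_period(sd=10, min_change=5)
n_large = samples_per_period(sd=20, min_change=5)
print(f"smaller inference space (sd=10): {n_small} samples per period")
print(f"larger inference space (sd=20): {n_large} samples per period")
```

Under this approximation the required sample size grows with the square of the standard deviation, so doubling the variability roughly quadruples the number of samples needed to detect the same change.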
Limiting/Restricting Your Inference Space
Care must be taken when defining sampling schemes for rangeland assessment and monitoring to ensure that you do not unintentionally restrict your inference space. Because the inference space is defined by the collection of sampling units that have a non-zero probability of being selected, any decision that excludes areas from consideration for sampling restricts the inference space. For instance, consider an allotment that is to be monitored to detect impacts from grazing. A common approach is to place sampling locations within a specified range of distances from high-impact features like water sources, because impact very near the feature is extremely high, while far from the feature there is little to no livestock use. A corresponding buffer zone could be defined around these features and sample points drawn randomly within it. In this case, however, the inference space is no longer the allotment; it is the buffer zone. By excluding the too-close and too-far areas, you have ensured that you have no information about them. Therefore, you cannot extend inferences from your data to the entire allotment, only to the part of it that was eligible for sampling. This may seem acceptable because such a buffer zone is “representative” of grazing use. However, such an approach relies on assumptions about what is representative, cannot provide appropriate inferences if representative conditions change (e.g., installation of a new water source), and cannot be used to provide inferences for other objectives (e.g., trend of cover of non-native annual grasses in the allotment).
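The effect of a buffer zone on the inference space can be sketched in a few lines. The example below is a simplified illustration, not a prescribed workflow: the allotment is reduced to a single transect of sampling units, and the function names, the 100 m spacing, and the 200 to 1500 m buffer limits are hypothetical values chosen only to make the point.

```python
def buffer_selection_probs(distances_m, near_m, far_m):
    """Equal selection probability inside the buffer, zero outside it."""
    in_buffer = [near_m <= d <= far_m for d in distances_m]
    n_in = sum(in_buffer)
    if n_in == 0:
        raise ValueError("no sampling units fall inside the buffer")
    return [1.0 / n_in if inside else 0.0 for inside in in_buffer]


def inference_space(probs):
    """Indices of the units that have a non-zero chance of being selected."""
    return [i for i, p in enumerate(probs) if p > 0]


# Hypothetical allotment: one sampling unit every 100 m, 0 to 3000 m from water.
distances_m = list(range(0, 3001, 100))

# Hypothetical buffer: only units 200 to 1500 m from water may be sampled.
probs = buffer_selection_probs(distances_m, near_m=200, far_m=1500)
covered = inference_space(probs)
share = len(covered) / len(distances_m)
print(f"{len(covered)} of {len(distances_m)} units ({share:.0%}) "
      "are in the inference space")
```

Every unit outside the buffer has a selection probability of zero, so no sample drawn under this design can say anything about it, no matter how many points are collected.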
In the grazing allotment example, the use of unequal selection probabilities (i.e., importance sampling) can help focus sampling efforts on the areas most informative to grazing management while still maintaining the desired inference space (i.e., the allotment). For instance, a selection probability layer could be created in which the probability of being selected for sampling varies with distance from the impact areas but never drops to zero. While this approach to sampling technically introduces bias into the samples, the bias can be corrected using the selection probability of each location, for example by weighting each observation by the inverse of its selection probability. For more information on sampling with unequal selection probabilities, see the sample_design page.
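A minimal sketch of this correction is shown below, assuming sample points are drawn with replacement so that the Hansen-Hurwitz estimator applies (designs that draw without replacement would instead weight by inclusion probabilities, as in the Horvitz-Thompson estimator). The allotment, the distance-decay rule, the bare ground values, and all function names are hypothetical and serve only to show the mechanics.

```python
import random


def selection_probs(distances_m, scale_m=500.0):
    """Hypothetical rule: selection weight declines with distance from water
    but never reaches zero, so every unit stays in the inference space."""
    weights = [1.0 / (1.0 + d / scale_m) for d in distances_m]
    total = sum(weights)
    return [w / total for w in weights]


def naive_mean(sample, values):
    """Unweighted sample mean; biased when selection probabilities differ."""
    return sum(values[i] for i in sample) / len(sample)


def weighted_mean(sample, values, probs):
    """Hansen-Hurwitz estimate of the population mean: each draw is
    weighted by the inverse of its selection probability."""
    n_units = len(values)
    return sum(values[i] / (n_units * probs[i]) for i in sample) / len(sample)


# Hypothetical allotment: 500 units spaced 10 m apart along a gradient from water.
distances_m = [10.0 * i for i in range(500)]
# Hypothetical response: percent bare ground declines with distance from water.
bare_ground = [80.0 - 0.01 * d for d in distances_m]

probs = selection_probs(distances_m)
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.choices(range(len(distances_m)), weights=probs, k=1000)

true_mean = sum(bare_ground) / len(bare_ground)
print(f"true allotment mean: {true_mean:.2f}")
print(f"naive sample mean:   {naive_mean(sample, bare_ground):.2f}")
print(f"weighted estimate:   {weighted_mean(sample, bare_ground, probs):.2f}")
```

Because units near water are sampled more often, the naive mean overstates bare ground for the allotment as a whole, while the weighted estimate should land close to the true allotment mean. The key design choice is that no unit has a selection probability of zero, which is what keeps the whole allotment inside the inference space.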
References
- Elzinga, C. L., D. W. Salzer, and J. W. Willoughby. 1998. Measuring and monitoring plant populations. U.S. Department of the Interior, Bureau of Land Management. National Applied Resource Sciences Center, Denver, Colorado.
- Lorenzen, T. J., and V. L. Anderson. 1993. Design of experiments: a no-name approach. Marcel Dekker, Inc., New York.
- Wills, S. et al. In prep. Designing plot to landscape scale studies of environmental change: key concepts, new tools and application to soil carbon sampling.