Sampling Design in Information Gathering

A systems analyst must follow four steps to design a good sample:

1. Determine the data to be collected or described.
2. Determine the population to be sampled.
3. Choose the type of sample.
4. Decide on the sample size.

These steps are described in detail in the following subsections.

Determine the data to be collected or described

The systems analyst needs a realistic plan about what will be done with the data once they are collected. If irrelevant data are gathered, then time and money are wasted in the collection, storage, and analysis of useless data.

At this point the systems analyst must identify the variables, attributes, and associated data items to be gathered in the sample. Both the objectives of the study and the data-gathering method to be used (investigation, interviews, questionnaires, or observation) must be considered. The kinds of information sought with each of these methods are discussed in more detail in this and subsequent chapters.

Determine the population to be sampled

Next, the systems analyst must determine what the population is. In the case of hard data, the analyst needs to decide, for example, whether the last two months are sufficient or whether an entire year’s worth of reports is needed for analysis.

Similarly, when deciding whom to interview, the systems analyst has to determine whether the population should include only one level of the organization or all levels. The analyst may even need to go outside the system to include the reactions of customers, vendors, suppliers, or competitors. These decisions are explored further in the chapters on interviewing, questionnaires, and observation.

Choosing the type of sample

The systems analyst can use one of four main types of samples, as illustrated in the figure below: convenience, purposive, simple random, and complex random.

Figure: Four main types of samples the analyst has available

Convenience samples are unrestricted, nonprobability samples. A sample could be called a convenience sample if, for example, the systems analyst posts a notice on the company’s intranet asking everyone interested in working with the new sales performance reports to come to a meeting at 1 P.M. on Tuesday the 12th. This sample is the easiest to arrange, but it is also the least reliable.

A purposive sample is based on judgment. A systems analyst can choose a group of individuals who appear knowledgeable and who are interested in the new information system. Here the systems analyst bases the sample on criteria (knowledge about and interest in the new system), but it is still a nonprobability sample. Thus, purposive sampling is only moderately reliable.

If you choose to perform a simple random sample, you need to obtain a numbered list of the population to ensure that each document or person in the population has an equal chance of being selected. This step often is not practical, especially when sampling involves documents and reports.

The complex random samples that are most appropriate for the systems analyst are (1) systematic sampling, (2) stratified sampling, and (3) cluster sampling.
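The simple random sample described above can be sketched in a few lines of Python. This is only an illustration, not a method from the text: the employee IDs, the list size of 500, and the function name `simple_random_sample` are all hypothetical. It relies on the standard library's `random.sample`, which draws without replacement so that every item on the numbered list has the same chance of selection.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; each item is equally likely to be picked."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for auditing
    return rng.sample(population, n)

# Hypothetical numbered list of 500 employees (assumed data, for illustration only).
employees = [f"EMP-{i:04d}" for i in range(1, 501)]

chosen = simple_random_sample(employees, 25, seed=42)
print(len(chosen), "employees selected")
```

The numbered list is the hard part in practice, which is why the approach is often impractical for documents and reports.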

In the simplest method of probability sampling, systematic sampling, the systems analyst would, for example, choose to interview every kth person on a list of company employees. This method has certain disadvantages, however. You would not want to use it to select every kth day for a sample because of the potential periodicity problem: if k were 7, for instance, every sampled day would fall on the same day of the week. Furthermore, a systems analyst would not use this approach if the list were ordered (for example, a list of banks from the smallest to the largest), because bias would be introduced.
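As a rough sketch of systematic sampling, the following Python function takes every kth item from a list. The random starting position within the first k items is an assumption added here (a common convention, not something the text specifies), and the list of 100 numbered employees is hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """Return every kth item, starting from a random position in the first k items."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed convention: random start in 0..k-1
    return items[start::k]

# Hypothetical list of 100 employees, numbered 1 to 100.
employee_numbers = list(range(1, 101))

interviewees = systematic_sample(employee_numbers, k=10, seed=1)
print(interviewees)
```

Because the gap between selections is fixed, any pattern in the list that repeats with the same period ends up fully inside or fully outside the sample, which is the periodicity problem noted above.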

Stratified samples are perhaps the most important to the systems analyst. Stratification is the process of identifying subpopulations, or strata, and then selecting objects or people for sampling in these subpopulations. Stratification is often essential if the systems analyst is to gather data efficiently. For example, if you want to seek opinions from a wide range of employees on different levels of the organization, systematic sampling would select a disproportionate number of employees from the operational control level. A stratified sample would compensate for this. Stratification is also called for when the systems analyst wants to use different methods to collect data from different subgroups. For example, you may want to use a survey to gather data from middle managers, but you might prefer to use personal interviews to gather similar data from executives.
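Stratification can be illustrated with a short Python sketch that groups records into strata and then draws a random sample inside each one. The staff counts, the level names, and the choice of an equal number per stratum are assumptions made for the example; the text does not prescribe how many to draw from each stratum.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, per_stratum, seed=None):
    """Group records into strata, then draw a random sample within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = {}
    for name, members in strata.items():
        size = min(per_stratum, len(members))  # a small stratum is taken whole
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical staff list as (level, employee number) pairs.
staff = ([("operations", i) for i in range(400)]
         + [("middle management", i) for i in range(80)]
         + [("executive", i) for i in range(10)])

picked = stratified_sample(staff, lambda r: r[0], per_stratum=8, seed=7)
for level, members in picked.items():
    print(level, len(members))
```

With these assumed numbers, an unstratified draw of 24 people would be dominated by the 400 operations employees, whereas the stratified draw gives each level the same representation.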

Sometimes the systems analyst must select a group of people or documents to study. This process is referred to as cluster sampling. Suppose an organization had 20 help desks scattered across the country. You may want to select one or two of these help desks under the assumption that they are typical of the remaining ones.
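The help desk example might be sketched as follows. Choosing the clusters at random is an assumption of this sketch; the text only says that one or two help desks are selected on the assumption that they are typical. The desk names and ticket records are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every record inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sorted for a repeatable draw
    return {name: clusters[name] for name in chosen}

# Hypothetical data: 20 help desks, each with 50 ticket records.
help_desks = {
    f"desk-{i:02d}": [f"ticket-{i:02d}-{j:02d}" for j in range(50)]
    for i in range(1, 21)
}

selected = cluster_sample(help_desks, n_clusters=2, seed=3)
for desk, tickets in selected.items():
    print(desk, len(tickets), "tickets to study")
```

Unlike stratified sampling, which draws a few items from every group, cluster sampling studies a few groups in full, so the result is only as good as the assumption that the chosen groups resemble the rest.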

Deciding on the sample size

Obviously, if everyone in the population viewed the world the same way or if each of the documents in a population contained exactly the same information as every other document, a sample size of one would be sufficient. Because that is not the case, it is necessary to set a sample size greater than one but less than the size of the population itself.

Remember that the absolute number of items sampled matters more than the percentage of the population it represents. Satisfactory results can be obtained by sampling 20 people out of 200 or 20 people out of 2,000,000.
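One way to see why the absolute number dominates is to compare the margin of error for the same sample size drawn from populations of very different sizes. The text gives no formula, so the sketch below is an assumption: it uses the standard error of a proportion with the finite population correction, a worst-case proportion of 0.5, and a z value of 1.96 for roughly 95 percent confidence.

```python
import math

def margin_of_error(n, population_size, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n of N items."""
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite population correction: shrinks the error when n is a large share of N.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error * fpc

small = margin_of_error(20, 200)
large = margin_of_error(20, 2_000_000)
print(f"20 of 200:       {small:.3f}")
print(f"20 of 2,000,000: {large:.3f}")
```

Under these assumptions the two margins come out close to each other, differing by about one percentage point, even though the sample is 10 percent of one population and a tiny fraction of the other.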
