Use and Abuse of Sampling in Sales and Use Tax Audits

By Will Yancey, PhD, CPA and Roger Pfaffenberger, PhD



Originally completed in September 1997.  This version published on www.willyancey.com/sampling-cost97.htm with minor revisions in April 2001. Published or reprinted in:

Introduction

I.  Current Extent of Sampling by State Auditors

II.  Admissibility of Statistical Evidence in Litigation

III.  Statistical Sampling Issues

IV.  Opportunities for Applied Research


Introduction

This paper identifies some important legal and audit issues in the sales and use tax audits of large firms. The US Census Bureau reports that in 1996 state and local governments collected 206 billion dollars from all types of sales, use, and gross receipts taxes which represented 30 percent of all the tax revenues received by those governments. These transaction taxes dwarf state and local corporate income taxes which amounted to only 32 billion dollars, or less than five percent of total state and local tax revenues. Given the political resistance to increasing income taxes, governments are seeking to increase collections from the sales and use tax base. Despite the importance of sales and use taxes, they have received less attention from scholars than income and property taxes.

The central problem with the sales and use tax audits of large firms is that there are so many transactions that detailed examination of every single transaction is not feasible.

The auditors attempt to perform their work more efficiently by testing a small sample of sales or purchase invoices in detail and projecting the sample results over the entire audit period. Frequently the state auditors do not use modern statistical methods for selecting the sample and projecting the results. Although thousands of articles and books have discussed the application of statistics to numerous other areas, few have specifically addressed the problem of sales and use tax audits.

The intent of this paper is to identify significant legal and statistical issues in the audits of large firms that are pervasive across the states. The body of the paper is organized as follows. The first section describes the current sampling practices of state sales and use tax auditors. The second section summarizes how courts have determined the admissibility of statistical evidence, and includes examples from several specific states. The third section discusses sampling and estimation issues from a statistician’s perspective. The final section summarizes the opportunities for research by scholars, practitioners, and policy makers from many disciplines.

I. Current Extent of Sampling by State Auditors

Wisconsin’s 1996 Survey

In 1996 the Wisconsin Department of Revenue conducted a comprehensive survey on sampling in field audits by the 45 state governments with a state sales tax. George Larscheid, Field Audit Section Chief for the Department, issued a detailed report on the survey responses. Most of the states use block sampling, in which a few days, weeks, or months are selected to represent all transactions over a multi-year examination period.

The nineteen states reporting routine use of statistical sampling were Arizona, California, Colorado, Connecticut, Illinois, Iowa, Kansas, Maryland, Minnesota, Mississippi, Missouri, New York, Ohio, Pennsylvania, South Carolina, Tennessee, Texas, Wisconsin, and Wyoming. The use of the term “statistical sampling” in the Wisconsin survey is not consistent with the statistician’s definition of that term. As discussed in the third section of this paper, statisticians require that valid statistical sampling procedures include an estimate of the precision or confidence interval in the sample estimate. In the Wisconsin survey, five states (Maryland, Mississippi, Ohio, Texas, and Wyoming) indicated they did not compute the confidence interval or another measure of precision. Thus, these five states are using methods that a statistician would define as nonstatistical sampling rather than statistical sampling.

California, Illinois, and New York have the most experience with statistical sampling. Each of these states reported performing more than 100 statistical sampling audits per year, and has procedures for determining sample size and for producing confidence interval estimates.

More details are available in the Survey on Sampling in Field Audits by George Larscheid. Publication of an updated version of this survey every few years would be very beneficial to state governments, taxpayers, and consulting firms.  [Update:  The Federation of Tax Administrators is conducting a survey of tax audit sampling practices by the states.  The survey results should be publicly available by the end of 2001.]

Implementation Problems

Even if a state decides to use statistical sampling methods, formidable implementation problems remain. Although most entry-level staff complete some college statistics courses, they are frequently unable to apply the statistical methods they learn in school to sales and use tax audits. The states often lack the financial resources to provide enough powerful personal computers for field staff who are auditing the large taxpayers. Many state examiners who learned sampling techniques from several years of experience leave state employment for higher-paying positions in consulting firms or industry.

Most state tax agencies have a designated sampling specialist in their headquarters staff who is expected to provide training and consultation in particular cases. These sampling specialists often lack the time to update software and training programs because of hiring freezes. As the headquarters sampling specialists are requested to assist the field staff and appeals officers with large and complex cases, little time is left to improve training materials. Some states are operating with sales and use tax audit training manuals that have not been revised for more than a decade.

II. Admissibility of Statistical Evidence in Litigation

Non-tax Litigation

Statistical evidence is admitted in a wide variety of litigation, including toxic drug and chemical torts, employment discrimination, drug testing, and financial securities. Extensive case law relying on statistical evidence exists in both state and federal courts. The Federal Judicial Center commissioned peer-reviewed articles from experienced experts and published them in the Reference Manual on Scientific Evidence (1994), and on the World Wide Web at http://www.fjc.gov/EVIDENCE/science/sc_ev_sec.html. [Update:  The second edition was published in 2000 and is online at http://air.fjc.gov/public/fjcweb.nsf/pages/16.] Some courts appoint independent statistical experts to help the courts rule on the admissibility of evidence submitted by the experts of competing parties. A bibliography of print articles and treatises on statistical evidence in litigation is on the World Wide Web at http://willyancey.com/statistical_evidence.htm.

Toxic drug tort litigation has resulted in widely-cited rulings on the use of scientific evidence in federal and state courts. In Daubert v. Merrell Dow, 509 US 579 (1993), the US Supreme Court held that the trial judge must consider many factors in deciding whether to admit expert scientific testimony. These factors include whether the theory or technique has been subject to peer review and publication and whether it has attracted widespread acceptance within a relevant scientific community. The Texas Supreme Court reached a similar decision for the admissibility of expert testimony in state courts in DuPont v. Robinson, 923 SW 2d 549 (1995, rehearing overruled, 1996).

As the case law applying statistical evidence in tort claims becomes more developed, it will be extended to other areas. Some state courts may hold that the standards of evidence in common law tort claims are different than the standards of evidence for state tax auditors operating under their state’s statutory authority. The legislatures can amend their statutes specifying the methodology for determining state tax deficiencies or can grant rule-making authority to the state tax agency.

Sales and Use Tax Audit Litigation

Few court decisions on sampling methodology in sales and use tax cases are ever reported. State tax agencies are reluctant to allow contested sales and use tax cases to leave the administrative appeals process and possibly result in an appellate court opinion directly attacking the state’s methodology. Anecdotes from experts around the country suggest states will settle tax audit cases involving sampling methodology disputes rather than allow them to be heard in a district court. The remainder of this section provides examples from sales and use tax audit cases in three states: California, Ohio, and Texas.

California example

In 1957, R. Clay Sprowls, a statistics professor at the University of California, Los Angeles, published “The Admissibility of Sample Data Into a Court of Law: A Case History,” 4 UCLA Law Review 222 (1957). Although there have been changes in law and statistics since this article was written, it remains an excellent case study for illustrating the fundamental issues in the application of statistical sampling to sales and use tax cases.

Professor Sprowls was hired by Sears in the case of Sears, Roebuck and Co. versus the City of Inglewood, tried in Los Angeles Superior Court in 1955. The City of Inglewood imposed a half-percent sales tax on sales made by stores to residents living within the city limits. Sears’ internal auditors discovered the Sears store in Inglewood had incorrectly estimated the amount of tax-exempt sales made to out-of-city residents. A sample of a few days was performed and an estimate was made that Sears had overpaid the sales tax in the amount of $27,000 for the eleven calendar quarters beginning January 1, 1949. After the City of Inglewood refused Sears’ refund claim for $27,000, Sears sued the city.

Professor Sprowls conducted a statistical sample in support of Sears’ refund claim. He randomly selected 33 days out of 826 working days and had each selected day’s sales slips examined for in-city or out-of-city addresses. On the basis of this sample he estimated that the refund for the entire 826 days was $28,250, with a 95 percent confidence interval between $24,000 and $32,400. Professor Sprowls testified before the judge, but the judge held Sears must prove its refund claim on each individual transaction rather than on the sample information. Subsequently, a complete audit of all transactions over the entire 826 days was performed and the actual refund amount was determined to be $26,750.22.

In his article, Professor Sprowls pointed out that the actual number determined by a complete audit was well within his confidence interval ($24,000 to $32,400), and was quite close to the original $27,000 claim submitted by Sears. The city, taxpayer, and court system could have avoided the substantial time and expense of the litigation by accepting the sample estimate. Professor Sprowls urged attorneys, judges, and statisticians to work together to bring about the acceptance of sample data as evidence in courts of law.

The Sprowls article illustrates how little progress has been made over the past 40 years. Courts and tax collectors still have trouble understanding the concept of confidence interval estimation. States are willing to use sampling for determining deficiencies, but not for estimating refund claims. Of course, the amounts at stake are much larger today. Sample stratification techniques are available to apply to the audits of large complex multi-location taxpayers.

Ohio example

In Lubrizol Corporation v. Tracy, Ohio App. LEXIS 3092, 1995 WL 453125 (Ohio Appeals, 11th District, 1995), the taxpayer protested a use tax deficiency assessment covering 1984 through 1986. The Ohio Tax Commission relied on Ohio Revised Code Sec. 5739.13(A), paragraph 5, which states,

When information in the possession of the commissioner indicates that the amount required to be collected or paid under this chapter is greater than the amount remitted by the vendor or paid by the consumer, the commissioner may audit a sample of the vendor's sales or the consumer's purchases for a representative period, to ascertain the per cent of exempt or taxable transactions or the effective tax rate and may issue an assessment based on the audit. The commissioner shall make a good faith effort to reach agreement with the vendor or consumer in selecting a representative sample period.

Observe that the Ohio statute specifically authorizes the use of a “representative sample period”, which could be interpreted as one or two months of transactions selected from an examination period covering several years. Furthermore, the Ohio statute makes no mention of statistical sampling.

Lubrizol contended that the Ohio Tax Commission should have reviewed all transactions in the audited period for both tax underpayments and tax overpayments. If overpayments had been included in the sample, they would have offset some of the underpayments and reduced the total estimated tax, penalty, and interest. However, the state court held against the taxpayer, finding that the statute required the auditor to search only for underpayments.

Texas example

In Bullock v. Foley Brothers, 802 SW2d 835, (Texas Court of Appeals-Austin, 1990, rehearing overruled 1991), the taxpayer failed to show that the Texas Comptroller relied on invalid sampling methods for a use tax audit covering the period 1979 to 1984. The Comptroller’s defense relied on Vernon’s Texas Statutes Annotated, Sec. 111.0042 (e), effective May 25, 1983, which states:

If the taxpayer demonstrates that any sampling method used by the comptroller was not in accordance with generally recognized sampling techniques, the audit will be dismissed as to that portion of the audit established by projection based upon the sampling method, and a new audit may be performed [emphasis added].

The term “generally recognized sampling techniques” has not been defined in a professional standard issued by any organization of expert statisticians or auditors. Generally accepted accounting principles (GAAP) and generally accepted auditing standards (GAAS) are promulgated by various accounting standard-setting organizations. However, the leading statistics organization, the American Statistical Association, has not issued professional standards on concepts such as “generally recognized sampling techniques.” The Texas statute does not specify how a taxpayer can demonstrate the sampling method was inappropriate.

Anecdotal comments from various people indicate that the Texas Comptroller’s Office is interested in improving its statistical procedures and training without prolonged litigation. The Comptroller nearly always settles cases involving sampling methodology before a trial occurs in state district court.

III. Statistical Sampling Issues

Statistical Sampling Concepts

The goal of sales and use tax audits is to estimate the total amount of underpaid tax during a specified audit period, such as 3 to 5 years. The population of interest is all transactions to which sales and use taxes apply during the audit period. For large business taxpayers with millions of sales or purchases per year, it is too costly to conduct a complete census examining all transactions in the population. Thus, a sample of transactions from the population is selected for detailed testing. Sampling risk is the risk that the auditors’ estimate based on a sample might be different than the population parameter determined by examining every single item in the population.

In nonstatistical sampling, the auditors estimate sampling risk by relying on professional judgment. The severe limitation of nonstatistical sampling is that it does not allow the auditor to make a quantitative estimate of sampling risk. An example of nonstatistical sampling is block sampling, in which the auditors select a few days or weeks from the population which the auditor or taxpayer deems to be representative of the entire population. By not taking sample transactions over the entire audit period, block samples run the risk of producing sample information that is relevant only to the period for which the sample was taken. If the tax deficiency rate in the sample differs significantly from the rate in the population, the block sampling method will produce results that are not valid.

Statistical sampling methods provide a quantitative estimate of the sampling risk. Statistical sampling requires that the person selecting the sample relies on a random sample selection process rather than his or her judgment about the extent to which the sample represents the population. The statistical sample might not be a good representation of the population in some instances, but this sampling risk can be quantified using statistical formulas derived from the theory of probability.

The Sprowls example discussed above is a good example of statistical sampling. Sprowls randomly selected 33 days out of a population of 826 working days rather than relying on a person’s judgment about representative days. (A modern computerized sales ledger system could have enabled Sprowls to randomly sample from a population of thousands of individual transactions rather than the 826 days.) He made a point estimate that the refund for the population was $28,250, and produced a 95 percent confidence interval estimate of between $24,000 and $32,400. Thus, he was 95 percent confident that the true amount of the refund in the population of all sales was between $24,000 and $32,400.
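The arithmetic behind an estimate of this kind can be sketched as follows. This is a minimal illustration of the standard expansion estimator for a simple random sample, not a reconstruction of Sprowls’ own calculation. The daily refund figures are hypothetical (his data are not reproduced in this paper), and the function name `estimate_total` and the normal multiplier of 1.96 are assumptions made for the example.

```python
import math
import random
import statistics

def estimate_total(sample, population_size, z=1.96):
    """Expansion estimate of a population total from a simple random sample.

    Returns (point_estimate, lower_limit, upper_limit), using the normal
    approximation and a finite population correction.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)            # sample standard deviation
    fpc = 1 - n / population_size            # finite population correction
    se_total = population_size * sd * math.sqrt(fpc / n)
    point = population_size * mean
    return point, point - z * se_total, point + z * se_total

# Hypothetical daily refund amounts (dollars) for 33 randomly selected days.
rng = random.Random(1955)
daily_refunds = [rng.gauss(34.0, 14.0) for _ in range(33)]

point, low, high = estimate_total(daily_refunds, population_size=826)
print(f"estimate ${point:,.0f}, 95% interval ${low:,.0f} to ${high:,.0f}")
```

With only 33 sampled days, a multiplier from the t distribution would give a slightly wider interval than the 1.96 assumed here.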

Fundamental to statistical sampling is the concept that the probability of selecting a particular item from the population is known before the sample is drawn. In simple random sampling, each transaction in the population has the same chance as any other transaction of being selected in the sample. If the number of transactions in the population under audit is large, stratified random sampling can be used, in which the population is divided into subgroups according to specified attributes. For example, a population of one million transactions with a maximum transaction amount of $100,000 may be subdivided into five mutually-exclusive strata, such as $0 to $19,999, $20,000 to $39,999, and so forth. The complete sample consists of the samples randomly drawn from each stratum. The probability of selection could differ in each stratum. For example, the probability of selection could range from 0.01 percent in the first stratum to 10 percent in the fourth stratum. A complete census, which is a 100 percent sample, could be conducted for all transactions in a particular stratum where individual errors could be a significant part of the estimate for the entire population, such as all transactions over $90,000 in value.
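A stratified selection along these lines can be sketched in a few lines of code. The sketch below is illustrative only. The function name `stratified_sample`, the simulated transaction amounts, and the selection rates for the second and third strata are assumptions, and the population is scaled down to 100,000 transactions so the example runs quickly.

```python
import bisect
import random

def stratified_sample(amounts, cut_points, rates, seed=0):
    """Draw a stratified random sample by dollar amount.

    cut_points: ascending lower limits of strata 2, 3, ... (stratum 1 starts at 0).
    rates:      selection probability for each stratum; 1.0 means a full census.
    Returns one list of sampled amounts per stratum.
    """
    rng = random.Random(seed)
    strata = [[] for _ in range(len(cut_points) + 1)]
    for amount in amounts:
        # bisect_right places an amount equal to a cut point in the higher stratum
        strata[bisect.bisect_right(cut_points, amount)].append(amount)
    return [rng.sample(items, round(len(items) * rate))
            for items, rate in zip(strata, rates)]

# Hypothetical population: many small transactions, few large ones.
rng = random.Random(42)
population = [min(99_999.0, rng.expovariate(1 / 8_000)) for _ in range(100_000)]

cuts = [20_000, 40_000, 60_000, 80_000]      # five strata by dollar amount
rates = [0.0001, 0.001, 0.01, 0.10, 1.0]     # middle rates are assumed values
for i, drawn in enumerate(stratified_sample(population, cuts, rates)):
    print(f"stratum {i + 1}: {len(drawn)} transactions sampled")
```

Because the selection probability in each stratum is fixed before the draw, each sampled transaction can later be weighted by the inverse of its stratum’s rate when the results are projected.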

Sampling in Financial Accounting Statement Audits

Sales and use tax audits are inherently different from financial accounting statement audits. The objective of a sales and use tax audit is to accurately and precisely estimate the dollar amount difference between the sales and use tax liability reported by the taxpayer and the amount estimated by the state. The state auditors attempt to plan their activities so that the amount of the tax, penalty, and interest deficiency assessment exceeds the cost of the audit process. The objective of a financial accounting statement audit is to determine whether or not the financial statements are materially correct. In audits of the financial statements of large corporations, the materiality threshold may be several million dollars of revenue, expenses, or net assets.

Statement on Auditing Standards Number 39 (SAS 39), issued by the Auditing Standards Board of the American Institute of Certified Public Accountants, permits financial auditors to use either statistical or nonstatistical sampling. However, SAS 39 specifically requires the auditor to consider materiality in determining sampling risk. SAS 39 cannot be simply translated from financial audits to tax audits without considering the change in materiality and other goals.

To reduce audit costs, financial statement auditors are relying largely on compliance tests of internal control rather than on substantive testing of transactions. In sales and use tax audits, the auditor must be skilled both in substantively testing the transactions and in applying statistical sampling procedures. Anecdotal evidence from sources in academe and practice indicates financial statement auditors in the 1990s appear to have less experience and training with statistical sampling than auditors in earlier periods.

Population and purpose

As stated previously, the purpose of most sales and use tax audits based on samples is to estimate the total amount of underpaid taxes, while ignoring tax overpayments. The focus on estimating the total amount of underpaid taxes has been supported by some state courts, such as in the Lubrizol case discussed previously. From a statistical point of view, the appropriate objective is to estimate the difference between the tax liability reported by the taxpayer and the state auditor’s estimate of the liability for the population. Positive estimates indicate tax underpayment; negative estimates indicate overpayment; and estimates that are not significantly different from zero indicate no change. For example, the Internal Revenue Service uses audits to determine if an income tax return is materially correct and considers both underreported and overreported income in the audit.

The treatment of any tax overpayments observed in the sample is important due to the sample leverage in sales and use tax audits. Suppose, for example, that 100 transactions are sampled from a population of 100,000 transactions and that the sample contains 98 transactions for which taxes were properly paid, one transaction for which a $50 tax underpayment occurred, and one transaction for which a $30 tax overpayment was made. If the overpayment is treated as “no underpayment,” then the average underpayment in the sample is $50/100 = $0.50 per transaction. Multiplying the sample average transaction underpayment by the population size produces the estimated population tax underpayment: (100,000)($0.50) = $50,000. If the credit for the overpayment is applied, then the average underpayment in the sample is ($50 - $30)/100 = $0.20 per transaction and the estimated population tax underpayment is $20,000. The $30,000 difference in the assessment is considerable because any tax underpayment or overpayment has considerable leverage on the estimate due to the high percentage of sampled transactions for which there was neither a tax underpayment nor overpayment.
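The leverage calculation above can be reproduced directly. In this short sketch the function name `project_underpayment` and its `credit_overpayments` flag are illustrative choices, and the figures are those of the example in the text.

```python
def project_underpayment(sample_errors, population_size, credit_overpayments):
    """Project signed sample errors (+ underpayment, - overpayment) to the population."""
    if not credit_overpayments:
        # Treat each overpayment as "no underpayment"
        sample_errors = [max(e, 0.0) for e in sample_errors]
    return population_size * sum(sample_errors) / len(sample_errors)

# 98 correct transactions, one $50 underpayment, one $30 overpayment
errors = [0.0] * 98 + [50.0, -30.0]

print(project_underpayment(errors, 100_000, credit_overpayments=False))  # 50000.0
print(project_underpayment(errors, 100_000, credit_overpayments=True))   # 20000.0
```

Each sampled transaction stands for 1,000 transactions in the population, so a single $30 credit moves the projection by $30,000.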

Stratification

Stratified random sampling is commonly used in sales and use tax audits due to its greater efficiency when compared to simple random samples of equal size. Stratified random sampling is most beneficial when the strata are constructed so that the transactions within a stratum are more homogeneous than the population as a whole. Ideally, the subgroup of transactions with the highest probability of error would be concentrated within one stratum, and the subgroup with the least probability of error would be in a different stratum. The important characteristics to be studied, error rate and magnitude, should be relatively similar within a stratum and relatively different between strata. When prior estimates of the error rates in each stratum are available, the statistician can design a sampling plan with different sampling rates for each stratum. For a given level of precision, the total number of transactions to be sampled will generally be smaller under a well-designed stratified random sample than with simple random sampling of the unstratified population.

Most often, strata are determined by using the dollar amount of the transactions as the stratification variable. For example, if transaction amounts have a maximum amount of $1,000,000, then strata are constructed over the interval from $0 to $1,000,000 (e.g., $0 to $49,999; $50,000 to $99,999; $100,000 to $199,999; and so forth). Designing a stratified sampling plan requires answers to three questions:

  • How many strata should there be?
  • What should be the strata class limits?
  • How large a sample should be taken from each stratum?

    The answers to these questions require considerable competence and experience in statistical sampling. For example, there are several ways to determine the strata sample sizes. They can be chosen to be proportional to the numbers of transactions in the strata, or they can be chosen to minimize the variation of the estimate of tax underpayment in each stratum.
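The two allocation rules just mentioned can be sketched as follows. The variance-minimizing rule is interpreted here as Neyman allocation, which minimizes the variance of the overall estimate for a fixed total sample size; that interpretation, the function names, and all of the stratum figures are assumptions made for illustration.

```python
def proportional_allocation(total_n, stratum_sizes):
    """Sample size per stratum in proportion to the number of transactions."""
    total = sum(stratum_sizes)
    return [total_n * size / total for size in stratum_sizes]

def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Sample size per stratum in proportion to (stratum size x stratum std. dev.)."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [total_n * w / total for w in weights]

# Hypothetical strata: transaction counts and prior estimates of the
# standard deviation of the tax error (dollars) within each stratum.
sizes = [90_000, 8_000, 2_000]
sds = [2.0, 40.0, 300.0]

print(proportional_allocation(500, sizes))
print(neyman_allocation(500, sizes, sds))
```

The results are left unrounded. In practice each value would be rounded and capped at the stratum size, and a stratum whose allocation reaches its size becomes a census stratum.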

    In addition, variables other than the dollar size of the transaction can be used as stratification variables. For example, vendor groupings or receiving locations could be used in use tax audits. In sales tax audits, stratification variables such as the destination of goods and services, time period of transactions, or state of origin could be used.

    Sales and use tax audit managers prefer to use stratified random sampling due to its statistical advantages over simple random sampling. Unfortunately, stratified sampling methods often stretch the resources and statistical competency levels of state auditors with the consequence that mistakes are made in its implementation and the interpretation of results.

    Sample Size

    The determination of the size of the sample to take in a sales and use tax audit is an important consideration as the sample size affects the precision of the estimator of the total tax underpayment in the population. Sample size determination is inevitably a trade-off between the cost of sampling and the precision of the estimator. Increasing sample size results in both a more costly audit and a more precise estimate.

    Given cost and estimator precision requirements, required sample sizes can be estimated for most sampling designs. In stratified random sampling, for example, it is possible to determine the overall sample size and to allocate the sample size to the strata based on audit cost considerations (e.g., audit cost per transaction and total audit budget).
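For simple random sampling, the relationship between the precision requirement and the sample size can be sketched with the usual textbook formula. The function name `required_sample_size`, the assumed standard deviation of $5 per transaction, and the $25,000 margin are hypothetical. A real plan would rest on a pilot sample or prior audit for the standard deviation.

```python
import math

def required_sample_size(population_size, sd_per_item, margin_on_total, z=1.96):
    """Sample size so the interval half-width on the total is about margin_on_total.

    Uses the normal approximation for a simple random sample, followed by a
    finite population correction.
    """
    margin_per_item = margin_on_total / population_size
    n0 = (z * sd_per_item / margin_per_item) ** 2     # infinite-population size
    n = n0 / (1 + n0 / population_size)               # finite population correction
    return math.ceil(n)

# 100,000 transactions, assumed error std. dev. of $5, target margin of $25,000
print(required_sample_size(100_000, 5.0, 25_000))
```

Halving the margin roughly quadruples the required sample, which is the cost and precision trade-off described above.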

    Most sampling designs used in sales and use tax audits use predetermined sample sizes. That is, the sample size is determined before the sample is drawn. Recently, there has been some interest in applying adaptive sampling plans for tax audits. In adaptive sampling designs, the sampling process is continued until a specified number of sampled units possessing some attribute is observed. For example, it may be known before the sample is taken that the incidence of transactions with tax underpayments is small in a certain population. As a result, a predetermined sample size of 40 transactions may result in finding no transactions for which tax is underpaid. In adaptive sampling, the process of sampling continues from the population until, say, two transactions are found that contain tax underpayments. A consequence of adaptive sampling is that the sample size is a random variable (i.e., cannot be determined a priori). Further, adaptive sampling plans require a considerable amount of statistical expertise to be used competently.
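The stopping rule described above, often called inverse sampling in the statistics literature, can be illustrated with a short simulation. The function name `sample_until_errors` and the population with a two percent error rate are assumptions for the example.

```python
import random

def sample_until_errors(population, errors_needed, seed=0):
    """Examine items in random order until errors_needed nonzero errors are found.

    Returns the examined items; the length of the list is the (random) sample size.
    """
    rng = random.Random(seed)
    order = list(population)
    rng.shuffle(order)                       # random order without replacement
    examined, found = [], 0
    for error in order:
        examined.append(error)
        if error != 0:
            found += 1
            if found == errors_needed:
                break
    return examined

# Hypothetical population: 2 percent of transactions carry a $50 underpayment.
population = [0.0] * 980 + [50.0] * 20
for seed in range(3):
    n = len(sample_until_errors(population, errors_needed=2, seed=seed))
    print(f"seed {seed}: stopped after {n} transactions")
```

The sample size differs from run to run, which is the point made in the text. Note also that the ordinary sample mean is generally biased under this stopping rule, so a specialized estimator is needed.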

    Data Collection

    In sales and use tax audits, three outcomes are possible for each sampled transaction. First, the transaction has no error; all appropriate taxes were properly collected. Second, the transaction contains a deficiency. Typical causes of deficiencies are treating a transaction as if it were tax exempt when it is not, and collecting taxes at a rate that is too low. Third, the transaction requires a credit. Typical causes of credits are treating a tax exempt transaction as if it were taxable, and collecting taxes at a rate that is too high. The correct determination of tax deficiency can be particularly difficult in sales tax transactions that are subject to state and several types of local taxes (e.g., county, city, transit authority).

    Missing transactions create a troublesome concern in collecting sample data. How should a missing transaction be treated? The most frequent approach is to replace the missing observation with the observation from an additionally sampled unit. Alternatively, statistical techniques for estimating missing observations can be used. A sample that contains several missing transactions is certain to raise a red flag in the eyes of the auditor.

    Estimation

    Point and interval estimates result from probability samples. A point estimate is a single number that is chosen to best estimate an unknown population parameter. In sales and use tax audits, the targeted population parameter is typically the total amount of underpaid taxes. For simple random sampling, for example, the sample mean can be used to estimate the population parameter provided that the total number of transactions in the population is known.

    A point estimate of a population parameter does not provide information about the reliability of the estimator. To do that, it is necessary to provide a confidence interval estimate of the population parameter. We might say, for example, that our point estimate of the total amount of tax underpayment by a corporation in a three year period is $5.0 million, and that we are 95% confident that the total amount of tax underpayment is between $4.5 and $5.5 million. The width of the confidence interval provides a measure of the reliability of the point estimator. As the width of the confidence interval increases, the reliability we place in the point estimator value decreases.

    In the popular press, the confidence interval is generally reported in terms of the margin of error of the estimator (e.g., “It is predicted that a political candidate will receive 60% of the vote with a margin of error of 4%”). The margin of error is approximately the amount to be added and subtracted from the point estimate to produce a 95% confidence interval estimate (e.g., for the example above, the 95% confidence interval is from 56% to 64% of the vote). The margin of error or the confidence interval half-width is determined based on the sample size, the degree of confidence required, and on the variation among the sampled observations. A statistician views a point estimate without a confidence interval estimate as deficient, for there is no indication provided about the reliability of the estimate. Yet, at the present time, only 14 states compute confidence interval estimates in sales and use tax audits based on statistical samples.
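The dependence of the margin of error on sample size and confidence level can be seen in the usual formula for a sample proportion. In this sketch the poll size of 576 respondents is a hypothetical figure chosen because it yields a margin close to the 4% in the example. It is not taken from any actual poll.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.60
moe = margin_of_error(p_hat, n=576)          # hypothetical poll of 576 respondents
print(f"margin of error {moe:.1%}; interval {p_hat - moe:.1%} to {p_hat + moe:.1%}")
```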

    Estimating the confidence interval requires an assumption about the probability distribution function of errors in the population. The typical assumption underlying the most commonly used statistical formulas is that the errors are distributed according to the normal distribution. The normal distribution is the “bell curve” illustrated in statistics textbooks.

    Presuming a normal distribution for sales and use tax error estimation is not reasonable for most sales and use tax audits. Normality assumes the errors are distributed continuously in small increments around the mean, such as one cent, ranging from minus infinity to positive infinity. However, in use tax samples, the errors are typically bunched in discrete groups. For example, in a use tax purchase invoice population, 95 percent of the population and sample have zero errors; 3 percent incorrectly paid no tax (deficiency); 2 percent overpaid tax (refund); and virtually none partially paid tax. In this use tax example, the distribution is best modeled by a discrete three-state model (no error, deficiency, and refund), instead of a continuous distribution assumed by the normal distribution.
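One way to examine how well a normal-theory interval performs on such a population is to simulate repeated audits and count how often the interval contains the true mean error. The sketch below is a rough illustration under stated assumptions. The $60 error amounts, the sample size of 100, and the function name `coverage_of_normal_interval` are all hypothetical, and the finite population correction is ignored because the sampling fraction is small.

```python
import math
import random
import statistics

def coverage_of_normal_interval(population, n, reps=1000, z=1.96, seed=0):
    """Share of simulated samples whose normal-theory interval covers the true mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    hits = 0
    for _ in range(reps):
        sample = rng.sample(population, n)
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if mean - z * se <= true_mean <= mean + z * se:
            hits += 1
    return hits / reps

# Three-state population: 95% no error, 3% deficiency, 2% refund
population = [0.0] * 9_500 + [60.0] * 300 + [-60.0] * 200
print(coverage_of_normal_interval(population, n=100))
```

If the printed share falls noticeably below the nominal 95 percent, the normal approximation is not delivering the confidence level it claims for that sample size.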

    Determination of deficiency

    Statistical inference procedures can be useful in sales and use tax audits for the determination of deficiencies provided that valid and reliable procedures are used. For example, confidence interval and hypothesis testing methods can be used to infer that the state’s estimate of the total amount of tax that should have been paid is different from the amount reported by the taxpayer. As an illustration, suppose the taxpayer reported that $2 million of use taxes was paid on a population of 100,000 purchases. Based on a sample audit of these purchases, the state estimates that $2.2 million should have been paid and reports a 95% confidence interval estimate of $1.8 million to $2.6 million. Since the amount the taxpayer paid ($2 million) falls within the confidence interval estimate, there is insufficient sample evidence to conclude that the true amount of taxes to be paid is different from $2 million, the amount paid by the taxpayer.

    The confidence interval can be used in another way to assess tax underpayment or overpayment. Instead of using the point estimate, the lower 95% confidence interval limit could be used for assessment. In the above illustration, the lower 95% confidence limit is $1.8 million. Since the taxpayer has already paid $2 million, no deficiency assessment is made since the taxpayer has already paid more than this amount. New York has used the lower 95% confidence limit in its statistical sampling audits as a method of reducing some disputes with taxpayers.
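The two uses of the confidence interval in this illustration can be set side by side in code. The function name `assess` and the layout of its result are illustrative choices, and the figures are those of the example above.

```python
def assess(point_estimate, half_width, amount_paid):
    """Compare the amount paid with the state's interval estimate of tax due."""
    lower = point_estimate - half_width
    upper = point_estimate + half_width
    return {
        # True means the sample does not show the amount paid to be wrong
        "paid_within_interval": lower <= amount_paid <= upper,
        "deficiency_at_point_estimate": max(point_estimate - amount_paid, 0),
        "deficiency_at_lower_limit": max(lower - amount_paid, 0),
    }

# State estimate $2.2 million, interval $1.8 to $2.6 million, taxpayer paid $2 million
result = assess(2_200_000, 400_000, 2_000_000)
print(result)
```

An assessment at the point estimate would be $200,000, while an assessment at the lower limit would be zero.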

    The use of confidence interval estimates for either hypothesis testing or deficiency estimation is controversial. Texas and some other states argue strongly against using the confidence interval estimate as a basis of assessment. They argue that it benefits the taxpayer at the expense of the state and adds cost to the audit by increasing sample sizes to ensure that the confidence interval estimator is valid and reliable.

    IV. Opportunities for Applied Research

    This article has described some important statistical and legal issues that exist for sales and use tax audits of large taxpayers. The Wisconsin survey found that only 14 of the 45 states surveyed are currently using statistical sampling as statisticians define it. Although statistical evidence is used extensively in other areas, few court cases are reported about sales and use tax audit sampling procedures in large firms. Although nonstatistical sampling procedures are used extensively, they are unable to quantify the sampling risk that the sample does not represent the population. Many states sample only to estimate tax underpayments (deficiencies) and do not allow sampling for tax overpayments (refunds) that would offset the deficiency assessments.

    Many opportunities exist for applied and interdisciplinary research in sales and use tax audit methodology. Ideally, these issues should be studied and discussed by scholars and practitioners from many disciplines. Legal scholars could survey existing statutes, regulations, and court cases. Statisticians could identify scientifically sound procedures. Public policy analysts could assess the impact of methodology changes on revenue collections and relations between taxpayers and government agencies. Educators could design and implement sampling training programs for staff auditors in government, corporations, and consulting firms. Professional associations could fund research and facilitate the exchange of ideas.