    Wilcoxon signed-rank test - Wikipedia

    The Wilcoxon signed-rank test is a non-parametric rank test for statistical hypothesis testing used either to test the location of a population based on a sample of data, or to compare the locations of two populations using two matched samples. The one-sample version serves a purpose similar to that of the one-sample Student's t-test. For two matched samples, it is a paired difference test like the paired Student's t-test (also known as the "t-test for matched pairs" or "t-test for dependent samples"). The Wilcoxon test is a good alternative to the t-test when the normal distribution of the differences between paired individuals cannot be assumed. Instead, it assumes a weaker hypothesis that the distribution of this difference is symmetric around a central value and it aims to test whether this center value differs significantly from zero. The Wilcoxon test is a more powerful alternative to the sign test because it considers the magnitude of the differences, but it requires this moderately strong assumption of symmetry.

    The test is named after Frank Wilcoxon (1892–1965) who, in a single paper, proposed both it and the rank-sum test for two independent samples. The test was popularized by Sidney Siegel (1956) in his influential textbook on non-parametric statistics. Siegel used the symbol T for the test statistic, and consequently, the test is sometimes referred to as the Wilcoxon T-test.

    There are two variants of the signed-rank test. From a theoretical point of view, the one-sample test is more fundamental because the paired sample test is performed by converting the data to the situation of the one-sample test. However, most practical applications of the signed-rank test arise from paired data.

    For a paired sample test, the data consists of a sample $(X_1, Y_1), \dots, (X_n, Y_n)$. Each data point in the sample is a pair of measurements. In the simplest case, the measurements are on an interval scale. Then they may be converted to real numbers, and the paired sample test is converted to a one-sample test by replacing each pair of numbers $(X_i, Y_i)$ by its difference $X_i - Y_i$. In general, it must be possible to rank the differences between the pairs. This requires that the data be on an ordered metric scale, a type of scale that carries more information than an ordinal scale but may have less than an interval scale.
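
The conversion from pairs to differences can be sketched in a few lines of Python. The measurements below are made-up integers chosen only for illustration, and the variable names are hypothetical.

```python
# Hypothetical before/after measurements for five subjects (made-up numbers).
pairs = [(142, 138), (130, 135), (155, 146), (128, 127), (160, 148)]

# Replace each pair (x, y) by its difference x - y.
differences = [x - y for x, y in pairs]

# Rank the differences by absolute value (rank 1 = smallest magnitude).
# This assumes no zero differences and no tied magnitudes.
order = sorted(range(len(differences)), key=lambda i: abs(differences[i]))
ranks = [0] * len(differences)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank

print(differences)
print(ranks)
```

From here on, the differences are treated exactly like a one-sample data set.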

    The data for a one-sample test is a sample in which each observation is a real number: $X_1, \dots, X_n$. Assume for simplicity that the observations in the sample have distinct absolute values and that no observation equals zero. (Zeros and ties introduce several complications; see below.) The test is performed as follows:
    1. Compute $|X_1|, \dots, |X_n|$.
    2. Sort $|X_1|, \dots, |X_n|$, and use this sorted list to assign ranks $R_1, \dots, R_n$: The rank of the smallest observation is one, the rank of the next smallest is two, and so on.
    3. Let $\operatorname{sgn}$ denote the sign function: $\operatorname{sgn}(x) = 1$ if $x > 0$ and $\operatorname{sgn}(x) = -1$ if $x < 0$. The test statistic is the signed-rank sum $T$: $T = \sum_{i=1}^{n} \operatorname{sgn}(X_i) R_i$.
    4. Produce a $p$-value by comparing $T$ to its distribution under the null hypothesis.
    The ranks are defined so that $R_i$ is the number of $j$ for which $|X_j| \le |X_i|$. Additionally, if $\sigma$ is a permutation of $\{1, \dots, n\}$ such that $|X_{\sigma(1)}| < \dots < |X_{\sigma(n)}|$, then $R_{\sigma(i)} = i$ for all $i$.
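
The four steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and the sample are made up, the data is assumed to contain no zeros and no tied absolute values, and the exact null distribution is obtained by brute-force enumeration of all $2^n$ sign patterns, which is only practical for small $n$.

```python
from itertools import product


def signed_rank_sum(xs):
    """Return T = sum of sgn(x_i) * R_i, assuming no zeros and no tied |x_i|."""
    # Steps 1-2: rank the observations by absolute value.
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    ranks = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Step 3: attach the sign of each observation to its rank and sum.
    return sum(r if x > 0 else -r for x, r in zip(xs, ranks))


def exact_two_sided_p(xs):
    """Step 4: under the null hypothesis every sign pattern is equally likely."""
    n = len(xs)
    t_obs = abs(signed_rank_sum(xs))
    hits = 0
    for signs in product((1, -1), repeat=n):
        t = sum(s * r for s, r in zip(signs, range(1, n + 1)))
        if abs(t) >= t_obs:
            hits += 1
    return hits / 2 ** n


sample = [1.5, -0.4, 2.1, 0.9, -0.2, 3.0]  # made-up data
print(signed_rank_sum(sample))    # 15
print(exact_two_sided_p(sample))  # 0.15625
```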

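    The signed-rank sum $T$ is closely related to two other test statistics. The positive-rank sum $T^+$ and the negative-rank sum $T^-$ are defined by $T^+ = \sum_{1 \le i \le n,\ X_i > 0} R_i$ and $T^- = \sum_{1 \le i \le n,\ X_i < 0} R_i$. Because $T^+ + T^-$ equals the sum of all the ranks, which is $1 + 2 + \dots + n = n(n+1)/2$, these three statistics are related by: $T^+ = \frac{n(n+1)}{2} - T^- = \frac{n(n+1)}{4} + \frac{T}{2}$, $T^- = \frac{n(n+1)}{2} - T^+ = \frac{n(n+1)}{4} - \frac{T}{2}$, and $T = T^+ - T^- = 2T^+ - \frac{n(n+1)}{2} = \frac{n(n+1)}{2} - 2T^-$. Because $T$, $T^+$, and $T^-$ carry the same information, any of them may be used as the test statistic.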
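
A quick numerical check of these relations, as a sketch on made-up data (the helper name `rank_sums` is hypothetical, and the sample is assumed to have no zeros or tied absolute values):

```python
def rank_sums(xs):
    """Return (T, T_plus, T_minus), assuming no zeros and no tied |x_i|."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    t_plus = sum(rank for rank, i in enumerate(order, start=1) if xs[i] > 0)
    t_minus = sum(rank for rank, i in enumerate(order, start=1) if xs[i] < 0)
    return t_plus - t_minus, t_plus, t_minus


xs = [1.5, -0.4, 2.1, 0.9, -0.2, 3.0]  # made-up data
n = len(xs)
t, t_plus, t_minus = rank_sums(xs)
total = n * (n + 1) // 2  # sum of all ranks

# The three statistics determine one another.
print(t_plus + t_minus == total)  # T+ + T- = n(n+1)/2
print(2 * t_plus == total + t)    # T+ = n(n+1)/4 + T/2
print(2 * t_minus == total - t)   # T- = n(n+1)/4 - T/2
```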

    The positive-rank sum and negative-rank sum have alternative interpretations that are useful for the theory behind the test. Define the Walsh average $W_{ij}$ to be $\tfrac{1}{2}(X_i + X_j)$. Then:
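
One well-known interpretation, stated here as background rather than quoted from the text, is that $T^+$ counts the positive Walsh averages $W_{ij}$ with $i \le j$, and $T^-$ counts the negative ones. The sketch below checks this on made-up data with no zeros and no tied absolute values.

```python
from itertools import combinations_with_replacement

xs = [1.5, -0.4, 2.1, 0.9, -0.2, 3.0]  # made-up data

# Walsh averages W_ij = (x_i + x_j) / 2 over all index pairs with i <= j.
walsh = [(a + b) / 2 for a, b in combinations_with_replacement(xs, 2)]
positive_walsh = sum(1 for w in walsh if w > 0)
negative_walsh = sum(1 for w in walsh if w < 0)

# Rank sums computed directly from the ranks of |x_i|.
order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
t_plus = sum(rank for rank, i in enumerate(order, start=1) if xs[i] > 0)
t_minus = sum(rank for rank, i in enumerate(order, start=1) if xs[i] < 0)

print(positive_walsh, t_plus)
print(negative_walsh, t_minus)
```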

    The one-sample Wilcoxon signed-rank test can be used to test whether data comes from a symmetric population with a specified center (which corresponds to median, mean and pseudomedian). If the population center is known, then it can be used to test whether data is symmetric about its center.

    To explain the null and alternative hypotheses formally, assume that the data consists of independent and identically distributed samples from a distribution $F$. If $F$ can be assumed symmetric, then the null and alternative hypotheses are the following:

    Null hypothesis H0: $F$ is symmetric about $\mu = 0$.
    One-sided alternative hypothesis H1: $F$ is symmetric about $\mu < 0$.
    One-sided alternative hypothesis H2: $F$ is symmetric about $\mu > 0$.
    Two-sided alternative hypothesis H3: $F$ is symmetric about $\mu \neq 0$.
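
Each alternative corresponds to a different tail of the null distribution of the statistic. The sketch below is illustrative only: the function names are hypothetical and the data is assumed to have no zeros or ties. It counts, by dynamic programming, how many sign patterns give each value of the positive-rank sum, then reports exact p-values against a negative centre, a positive centre, and a two-sided alternative. The two-sided value doubles the smaller tail, which is one common convention.

```python
def t_plus_null_counts(n):
    """counts[s] = number of equally likely sign patterns with positive-rank sum s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def exact_p_values(t_plus, n):
    """Return (p_less, p_greater, p_two_sided) for an observed positive-rank sum."""
    counts = t_plus_null_counts(n)
    total = 2 ** n
    p_less = sum(counts[: t_plus + 1]) / total  # small T+ supports a centre below 0
    p_greater = sum(counts[t_plus:]) / total    # large T+ supports a centre above 0
    p_two_sided = min(1.0, 2 * min(p_less, p_greater))
    return p_less, p_greater, p_two_sided


# Example: n = 6 observations with positive-rank sum 18 (made-up values).
p_less, p_greater, p_two_sided = exact_p_values(18, 6)
print(p_less, p_greater, p_two_sided)
```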

    If in addition $\Pr(X = \mu) = 0$, then $\mu$ is a median of $F$. If this median is unique, then the Wilcoxon signed-rank sum test becomes a test for the location of the median. When the mean of $F$ is defined, then the mean is $\mu$, and the test is also a test for the location of the mean.

    The requirement that the alternative distribution be symmetric is highly restrictive, but for one-sided tests it can be weakened. Say that $F$ is stochastically smaller than a distribution symmetric about zero if an $F$-distributed random variable $X$ satisfies $\Pr(X < -x) \ge \Pr(X > x)$ for all $x \ge 0$. Similarly, $F$ is stochastically larger than a distribution symmetric about zero if $\Pr(X < -x) \le \Pr(X > x)$ for all $x \ge 0$. Then the Wilcoxon signed-rank sum test can also be used for the following null and alternative hypotheses:

    Null hypothesis H0: $F$ is symmetric about $\mu = 0$.
    One-sided alternative hypothesis H1: $F$ is stochastically smaller than a distribution symmetric about zero.
    One-sided alternative hypothesis H2: $F$ is stochastically larger than a distribution symmetric about zero.

    The hypothesis that the data are IID can be weakened. Each data point may be taken from a different distribution, as long as all the distributions are assumed to be continuous and symmetric about a common point $\mu_0$. The data points are not required to be independent as long as the conditional distribution of each observation given the others is symmetric about $\mu_0$.

    Because the paired data test arises from taking paired differences, its null and alternative hypotheses can be derived from those of the one-sample test. In each case, they become assertions about the behavior of the differences $X_i - Y_i$.

    Let $F(x, y)$ be the joint cumulative distribution of the pairs $(X_i, Y_i)$. In this case, the null and alternative hypotheses are:

    Null hypothesis H0: The observations $X_i - Y_i$ are symmetric about $\mu = 0$.
    One-sided alternative hypothesis H1: The observations $X_i - Y_i$ are symmetric about $\mu < 0$.
    One-sided alternative hypothesis H2: The observations $X_i - Y_i$ are symmetric about $\mu > 0$.
    Two-sided alternative hypothesis H3: The observations $X_i - Y_i$ are symmetric about $\mu \neq 0$.

    These can also be expressed more directly in terms of the original pairs:

    Null hypothesis H0 The observations are …

    In real data, it sometimes happens that there is an observation $X_i$ in the sample which equals zero or a pair $(X_i, Y_i)$ with $X_i = Y_i$. It can also happen that there are tied observations. This means that for some $i \neq j$, we have $|X_i| = |X_j|$ (in the one-sample case) or $|X_i - Y_i| = |X_j - Y_j|$ (in the paired sample case). This is particularly common for discrete data. When this happens, the test procedure defined above is usually undefined because there is no way to uniquely rank the data. (The sole exception is if there is a single observation $X_i$ which is zero and no other zeros or ties.) Because of this, the test statistic needs to be modified.
    Wilcoxon's original paper did not address the question of observations (or, in the paired sample case, differences) that equal zero. However, in later surveys, he recommended removing zeros from the sample. Then the standard signed-rank test could be applied to the resulting data, as long as there were no ties. This is now called the reduced sample procedure.

    Pratt observed that the reduced sample procedure can lead to paradoxical behavior. He gives the following example. Suppose that we are in the one-sample situation and have the following thirteen observations:

    0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, −18.

    The reduced sample procedure removes the zero. To the remaining data, it assigns the signed ranks:

    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, −12.

    This has a one-sided p-value of $70/2^{12} \approx 0.0171$, and therefore the sample is not significantly positive at any significance level $\alpha < 70/2^{12}$. Pratt argues that one would expect that decreasing the observations should certainly not make the data appear more positive. However, if the zero observation is decreased by an amount less than 2, or if all observations are decreased by an amount less than 1, then the signed ranks become:

    −1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, −13.

    This has a one-sided p-value of $109/2^{13} \approx 0.0133$. Therefore the sample would be judged significantly positive at any significance level $\alpha > 109/2^{13}$. The paradox is that, if $\alpha$ is between $109/2^{13}$ and $70/2^{12}$, then decreasing an insignificant sample causes it to appear significantly positive.
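
The two exact p-values in this example can be recomputed directly. The sketch below is illustrative (the function names are hypothetical). It takes the one-sided p-value to be the null probability that the negative-rank sum is at most the observed value, and it shifts every observation down by 0.5 as one instance of "an amount less than 1".

```python
from fractions import Fraction


def p_negative_rank_sum_at_most(t_minus, n):
    """P(T- <= t_minus) under the null: count subsets of {1..n} with sum <= t_minus."""
    counts = [0] * (n * (n + 1) // 2 + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(len(counts) - 1, rank - 1, -1):
            counts[s] += counts[s - rank]
    return Fraction(sum(counts[: t_minus + 1]), 2 ** n)


def negative_rank_sum(xs):
    """Sum of the ranks of the negative observations (no zeros or ties assumed)."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    return sum(rank for rank, i in enumerate(order, start=1) if xs[i] < 0)


data = [0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, -18]

# Reduced sample procedure: drop the zero, then test the remaining 12 values.
reduced = [x for x in data if x != 0]
p_reduced = p_negative_rank_sum_at_most(negative_rank_sum(reduced), len(reduced))

# Decrease every observation by 0.5; the former zero becomes a small negative value.
shifted = [x - 0.5 for x in data]
p_shifted = p_negative_rank_sum_at_most(negative_rank_sum(shifted), len(shifted))

print(p_reduced, float(p_reduced))
print(p_shifted, float(p_shifted))
```

The shifted sample has the smaller p-value even though every observation was decreased, which is the behavior Pratt objects to.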

    Pratt therefore proposed the signed-rank zero procedure. This procedure includes the zeros when ranking the observations in the sample. However, it excludes them from the test statistic, or equivalently it defines $\operatorname{sgn}(0) = 0$. Pratt proved that the signed-rank zero procedure has several desirable behaviors not shared by the reduced sample procedure:
    1. Increasing the observed values does not make a significantly positive sample insignificant, and it does not make an insignificant sample significantly negative.
    2. If the distribution of the observations is symmetric, then the values of $\mu$ which the test does not reject form an interval.
    3. A sample is significantly positive, not significant, or significantly negative, if and only if it is so when the zeros are assigned arbitrary non-zero signs, if and only if it is so when the zer…
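
The ranking step of the signed-rank zero procedure can be sketched as follows. This is a minimal illustration with a hypothetical function name, and it assumes there are no ties among the non-zero absolute values. Zeros take part in the ranking, so they occupy the lowest ranks, but they contribute nothing to either rank sum.

```python
def signed_rank_zero_sums(xs):
    """Return (T_plus, T_minus) with zeros ranked but given sign 0."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    t_plus = 0
    t_minus = 0
    for rank, i in enumerate(order, start=1):
        if xs[i] > 0:
            t_plus += rank
        elif xs[i] < 0:
            t_minus += rank
        # xs[i] == 0: the rank is used up, but sgn(0) = 0 adds nothing.
    return t_plus, t_minus


data = [0, 2, 3, 4, 6, 7, 8, 9, 11, 14, 15, 17, -18]
t_plus, t_minus = signed_rank_zero_sums(data)
print(t_plus, t_minus)  # 77 13
```

Here the zero takes rank 1, so the remaining observations receive ranks 2 through 13. Any p-value would then have to come from the null distribution over sign patterns of those non-zero ranks only, which this sketch does not compute.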
