AUTHORS: Ahmed Alsayed, Giancarlo Manzi
ABSTRACT: This paper examines the performance of a recently proposed measure of dependence, the Monotonic Dependence Coefficient (MDC), relative to classical correlation measures such as Pearson’s r, Spearman’s ρ, and Kendall’s τ, using simulated outlier-contaminated and non-contaminated datasets as well as a contaminated real dataset analysed under three different cases. The comparison checks how and when these coefficients detect dependence between two variables in the presence of outliers. Several scenarios are created by varying the value of the dependence measure, the outlier contamination fraction, and the data pattern. The basic simulated dataset is generated from a bivariate standard normal distribution. The contaminated data are obtained by adding values generated from exponential, power-transformed, lognormal, and Weibull distributions to the basic dataset, which allows for multiple data patterns. The main findings tend to favour Spearman’s ρ in most of the simulated scenarios, especially when outlier contamination is taken into account, whereas MDC performs better than ρ on non-contaminated data. In the real data scenario, Spearman’s ρ outperforms the other measures in two out of three cases, whereas MDC performs better in the remaining case.
KEYWORDS: Outliers; Correlation Coefficient; Monotonic Dependence; Monte Carlo Simulation; Environmental Quality; Economic Growth.
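
The simulation design summarised in the abstract can be illustrated with a minimal Python sketch. It draws a bivariate standard normal sample, contaminates a fraction of the observations with additive lognormal values, and compares Pearson’s r, Spearman’s ρ, and Kendall’s τ before and after contamination. The sample size, the target correlation of 0.6, the 10% contamination fraction, the lognormal parameters, and the choice to contaminate only the second variable are all illustrative assumptions, not the settings used in the paper. The MDC is not computed here because its formula is not given in this text.

```python
import math
import random


def pearson(x, y):
    """Pearson's r: cross-product of deviations scaled by both spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n. Assumes no ties, which holds almost surely for continuous draws."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs, O(n^2)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return 2 * score / (n * (n - 1))


def simulate(n=300, rho=0.6, fraction=0.1, seed=1):
    """Return x, clean y, and contaminated y (all settings are assumptions)."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x.append(z1)
        # Linear combination giving correlation rho with unit variances.
        y.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    y_cont = list(y)
    # Add lognormal values to a random subset of observations.
    for i in rng.sample(range(n), int(fraction * n)):
        y_cont[i] += rng.lognormvariate(0, 1.5)
    return x, y, y_cont


x, y, y_cont = simulate()
for label, target in (("clean", y), ("contaminated", y_cont)):
    print(label, round(pearson(x, target), 3),
          round(spearman(x, target), 3), round(kendall(x, target), 3))
```

Repeating the call to `simulate` over many seeds and contamination fractions would give a rough Monte Carlo picture of how each coefficient reacts to outliers under these assumed settings.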