GSoC 2017
qqplotr is a package that extends some of ggplot2's functionality for drawing Quantile-Quantile (Q-Q) and Probability-Probability (P-P) plots. The package was developed within the scope of the following Google Summer of Code (GSoC) 2017 project:
Project Title: Distributional Assessments with Q-Q Plots
Organization: R Project for Statistical Computing
Authors:
- Adam Loy (Mentor)
- Heike Hofmann (Mentor)
- Alexandre Almeida (Developer)
The complete list of my (Alexandre's) commits to the qqplotr package during the project can be found at: https://github.com/aloy/qqplotr/commits/master?author=almeidaxan
In the initial project plan we proposed to follow a milestone-driven workflow:
- Implement geom_qq_line and geom_qq_band (due before 1st Evaluation).
- Implement geom_qq_rotate, and adjust both the previous Geoms to handle the plot rotation/detrend (due before 2nd Evaluation).
- Write the documentation for all implemented functions, create a vignette for the package, develop an interactive Shiny app showcasing the package functions and their parameters, and submit the package to CRAN (due before Final Evaluation).
In addition, we proposed two stretch goals in case I got well ahead of the proposed schedule:
- Parametrize the type of the quantile algorithm (used by stats::quantile) to be used by the Geoms.
- Extend all the previous Geom functionalities to P-P plots, that is, create geom_pp_point, geom_pp_line, and geom_pp_band.
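As a rough illustration of the first stretch goal, the base R sketch below shows how the quantile algorithm type can change a Q-Q reference line. It assumes the line is drawn through the first and third quartiles against a standard normal distribution; the helper name qq_line_coef and its arguments are hypothetical and are not part of qqplotr.

```r
# Hypothetical helper (not part of qqplotr): intercept and slope of a Q-Q
# reference line through two chosen quantiles, for a given quantile type.
qq_line_coef <- function(x, qtype = 7, probs = c(0.25, 0.75)) {
  # Sample quantiles, computed with the requested algorithm (types 1 to 9).
  y <- stats::quantile(x, probs = probs, type = qtype, names = FALSE)
  # Matching theoretical quantiles of the standard normal distribution.
  q <- stats::qnorm(probs)
  slope <- diff(y) / diff(q)
  intercept <- y[1] - slope * q[1]
  c(intercept = intercept, slope = slope)
}

set.seed(1)
x <- rnorm(20, mean = 10, sd = 2)

# On small samples, different algorithms can give slightly different lines.
coef_type7 <- qq_line_coef(x, qtype = 7)
coef_type1 <- qq_line_coef(x, qtype = 1)
print(rbind(type7 = coef_type7, type1 = coef_type1))
```

The type only matters through the two sample quantiles that anchor the line, which is why exposing it as a single parameter is enough.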
At the start of the coding period, we realized that developing Geoms was not necessary, as we could follow a simpler path and develop Stats by making use of the already implemented Geoms, such as geom_point, geom_line, and geom_ribbon. Hence, instead of geom_qq_* functions, we aimed to develop stat_qq_* functions.
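To illustrate why a Stat is enough, here is a minimal sketch of a custom ggplot2 Stat that computes Q-Q coordinates and leaves the drawing to the existing point Geom. It is a simplified stand-in modelled on ggplot2's own extension mechanism, not the qqplotr source; the names StatQqSketch and stat_qq_sketch are hypothetical, and it assumes a normal reference distribution and ggplot2 3.3.0 or later (for after_stat).

```r
library(ggplot2)

# Hypothetical Stat (not the qqplotr source): turns the `sample` aesthetic
# into ordered sample values and matching theoretical normal quantiles.
StatQqSketch <- ggproto(
  "StatQqSketch", Stat,
  required_aes = "sample",
  # Map the computed columns to the position aesthetics the Geom needs.
  default_aes = aes(x = after_stat(theoretical), y = after_stat(sample)),
  compute_group = function(data, scales) {
    smp <- sort(data$sample)
    data.frame(
      sample = smp,
      theoretical = stats::qnorm(stats::ppoints(length(smp)))
    )
  }
)

# Layer constructor: the Stat does the computation and an existing Geom
# (geom_point by default) does the drawing.
stat_qq_sketch <- function(mapping = NULL, data = NULL, geom = "point",
                           position = "identity", ..., na.rm = FALSE,
                           show.legend = NA, inherit.aes = TRUE) {
  layer(
    stat = StatQqSketch, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

set.seed(1)
df <- data.frame(value = rnorm(50))
p <- ggplot(df, aes(sample = value)) + stat_qq_sketch()
```

Printing p draws the Q-Q points. No new Geom is defined anywhere, which is the simplification described above.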
In the second week of the package development, we also concluded that a separate stat_qq_rotate function was not really necessary to perform the plot rotation/detrend, as that feature could be coded as an additional parameter inside the previous Stats.
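The idea of folding the detrend step into a parameter can be sketched in base R as follows. This is an assumption about what detrending means here (subtracting the reference line so that it becomes horizontal), not the qqplotr implementation; the helper qq_coords and its detrend argument are hypothetical.

```r
# Hypothetical helper (not the qqplotr implementation): Q-Q coordinates
# against a standard normal distribution, optionally detrended.
qq_coords <- function(x, detrend = FALSE) {
  smp <- sort(x)
  theo <- stats::qnorm(stats::ppoints(length(smp)))

  # Reference line through the first and third quartiles.
  probs <- c(0.25, 0.75)
  y_q <- stats::quantile(smp, probs, names = FALSE)
  x_q <- stats::qnorm(probs)
  slope <- diff(y_q) / diff(x_q)
  intercept <- y_q[1] - slope * x_q[1]

  if (detrend) {
    # Subtract the reference line so it becomes the horizontal line y = 0.
    smp <- smp - (intercept + slope * theo)
  }
  data.frame(theoretical = theo, sample = smp)
}

set.seed(1)
x <- rnorm(30)
regular <- qq_coords(x)
detrended <- qq_coords(x, detrend = TRUE)
```

Because the same function produces both sets of coordinates, a single logical argument replaces what would otherwise be a separate rotation function.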
We also decided to document the functions as we were writing them, as opposed to leaving that as a separate task for the end of the project. In addition, from the start of the package development, we shaped the repository with an R package structure. In my opinion, both of those changes helped a lot in terms of code organization.
By following the modified work plan, we were able to achieve all the proposed goals, including the extra stretch goals.
In summary, the qqplotr package is composed of six main, fully documented functions:
- stat_qq_point, stat_qq_line, and stat_qq_band (Q-Q plot)
- stat_pp_point, stat_pp_line, and stat_pp_band (P-P plot)
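A minimal usage sketch of the six functions is shown below, assuming an installed qqplotr version that exports all of them and using their default arguments. The simulated data and the axis labels are illustrative choices only.

```r
library(ggplot2)
library(qqplotr)

set.seed(1)
df <- data.frame(value = rnorm(100))

# Q-Q plot: confidence band, reference line, and points as three layers.
qq <- ggplot(df, aes(sample = value)) +
  stat_qq_band() +
  stat_qq_line() +
  stat_qq_point() +
  labs(x = "Theoretical Quantiles", y = "Sample Quantiles")

# P-P plot: the same three-layer structure with the stat_pp_* functions.
pp <- ggplot(df, aes(sample = value)) +
  stat_pp_band() +
  stat_pp_line() +
  stat_pp_point() +
  labs(x = "Probability Points", y = "Cumulative Probability")
```

Printing qq or pp renders the corresponding plot. The band is added first so that the line and points are drawn on top of it.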
The vignette documentation was also produced and attached to the package. That documentation gives a more in-depth overview of the package functionalities by providing several usage examples. It is presented as an HTML file, and its contents are very similar to this repository's README file.
Contrary to what was proposed, we could not integrate the Shiny app into the vignette due to some coding limitations. Thus, we implemented the app as a stand-alone version, which can be run with qqplotr::runShinyExample().
As of July 27th, 2017, version 0.0.1 of the qqplotr package is available on CRAN. The CRAN version still lacks some important changes that were later made to the GitHub development version. Hence, in the coming weeks, we intend to upload an updated version to CRAN that includes all the recent modifications.
This package is also listed among the official ggplot2 extensions, a webpage showcasing ggplot2 extensions created by the community.
Finally, we are also preparing a paper to be submitted to The R Journal. This paper will give a comprehensive walkthrough of the package functions and will illustrate their potential use on real-world data.