
GSoC 2017

Alexandre Almeida edited this page Aug 24, 2017 · 2 revisions

Overview

qqplotr is an R package that extends ggplot2's functionality for drawing Quantile-Quantile (Q-Q) and Probability-Probability (P-P) plots. The package was developed as part of the following Google Summer of Code (GSoC) 2017 project:

Project Title: Distributional Assessments with Q-Q Plots

Organization: R Project for Statistical Computing

Authors:

The complete list of my (Alexandre) commits to the qqplotr package during the project duration can be found at: https://github.com/aloy/qqplotr/commits/master?author=almeidaxan

Project Plan

Initial Version

In the initial project plan we proposed to follow a milestone-driven workflow:

  • Implement geom_qq_line and geom_qq_band (due before 1st Evaluation).
  • Implement geom_qq_rotate and adjust both previous Geoms to handle plot rotation/detrending (due before 2nd Evaluation).
  • Write the documentation for all implemented functions, create a vignette for the package, develop an interactive Shiny app showcasing the package functions and its parameters, and submit the package to CRAN (due before Final Evaluation).

In addition, we proposed two stretch goals in case I got well ahead of the proposed schedule:

  • Parametrize the type of the quantile algorithm (used by stats::quantile) to be used by the Geoms.
  • Extend all the previous Geom functionalities to P-P plots, that is, create geom_pp_point, geom_pp_line, and geom_pp_band.
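To make the first stretch goal concrete, the following minimal sketch uses only base R to show how the `type` argument of stats::quantile changes the estimated quantiles of the same sample. The data and the object names (`x`, `probs`, `quantile_by_type`) are illustrative assumptions, not taken from the project, and the sketch does not show how qqplotr exposes this choice.

```r
# Illustrative data only; not taken from the project.
set.seed(2017)
x <- rnorm(25)

# Probabilities at which the nine algorithms are compared.
probs <- c(0.25, 0.50, 0.75)

# stats::quantile() accepts type = 1, ..., 9; type = 7 is its default.
quantile_by_type <- sapply(1:9, function(type) {
  quantile(x, probs = probs, type = type, names = FALSE)
})
dimnames(quantile_by_type) <- list(
  paste0("p", probs * 100),
  paste0("type", 1:9)
)

# One row per probability, one column per algorithm.
print(round(quantile_by_type, 3))
```

With a small sample such as this one, the columns typically differ slightly, which is why letting the user choose the algorithm can matter for a Q-Q plot.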

Work plan modifications

At the start of the coding period, we realized that developing new Geoms was not necessary: we could follow a simpler path and develop Stats that reuse already implemented Geoms, such as geom_point, geom_line, and geom_ribbon. Hence, instead of geom_qq_* functions, we aimed to develop stat_qq_* functions.

In the second week of the package development, we also concluded that a separate stat_qq_rotate function was not needed to perform the plot rotation/detrending, as such a feature could be coded as an additional parameter inside the previous Stats.
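As a rough illustration of what detrending a Q-Q plot means, the base R sketch below subtracts a reference line from the sample quantiles so that the line becomes horizontal. It assumes a normal reference distribution and a line through the first and third quartiles (the same construction stats::qqline uses by default); the data and variable names are hypothetical, and the package's internal implementation may differ.

```r
# Illustrative data only: a skewed sample compared against a normal.
set.seed(2017)
x <- rexp(50)
n <- length(x)

# Coordinates of an ordinary normal Q-Q plot.
theoretical <- qnorm(ppoints(n))
sample_q <- sort(x)

# Reference line through the first and third quartiles.
x_q <- qnorm(c(0.25, 0.75))
y_q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
slope <- diff(y_q) / diff(x_q)
intercept <- y_q[1] - slope * x_q[1]

# Detrending: plot the vertical distance to the reference line instead.
detrended <- sample_q - (intercept + slope * theoretical)

plot(theoretical, detrended,
     xlab = "Theoretical quantiles",
     ylab = "Sample quantile minus reference line")
abline(h = 0, lty = 2)  # the reference line after detrending
```

Because the transformation only changes the y coordinate, it fits naturally as a switch inside a Stat that already computes the Q-Q coordinates, rather than as a function of its own.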

We also decided to document the functions as we wrote them, rather than leaving documentation as a separate task for the end of the project. In addition, from the start of the package development, we shaped the repository as an R package. In my opinion, both changes helped a lot in terms of code organization.

Our contributions

By following the modified work plan, we were able to achieve all the proposed goals, including the extra stretch goals.

In summary, the qqplotr package is composed of six main, fully documented functions:

  • stat_qq_point, stat_qq_line, and stat_qq_band (Q-Q plot)
  • stat_pp_point, stat_pp_line, and stat_pp_band (P-P plot)
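A minimal usage sketch of these functions follows. It assumes that ggplot2 and qqplotr are installed, relies on each Stat's default arguments, and uses a simulated data frame (`df`) that is purely illustrative; the axis labels are likewise my own choice.

```r
library(ggplot2)
library(qqplotr)

# Illustrative data only; not taken from the project.
set.seed(2017)
df <- data.frame(x = rnorm(100))

# Q-Q plot: confidence band first, then the reference line, then the points.
qq <- ggplot(df, aes(sample = x)) +
  stat_qq_band() +
  stat_qq_line() +
  stat_qq_point() +
  labs(x = "Theoretical quantiles", y = "Sample quantiles")

# P-P plot built from the analogous stat_pp_* functions.
pp <- ggplot(df, aes(sample = x)) +
  stat_pp_band() +
  stat_pp_line() +
  stat_pp_point() +
  labs(x = "Probability points", y = "Cumulative probability")

print(qq)
print(pp)
```

Adding the band layer before the line and points keeps the shaded region underneath them when the plot is drawn.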

A vignette was also written and bundled with the package. It gives a more in-depth overview of the package functionality through several usage examples. It is presented as an HTML file, and its contents are very similar to this repository's README file.

Contrary to the original proposal, we could not integrate the Shiny app into the vignette due to some coding limitations. Instead, we implemented the app as a stand-alone version, which can be run with qqplotr::runShinyExample().

As of July 27th, 2017, the qqplotr package version 0.0.1 is available on CRAN. The CRAN version still lacks some important changes that were made to the GitHub development version afterwards. Hence, in the coming weeks, we intend to upload an updated version of the package to CRAN with all the recent modifications.

This package is also listed among the official ggplot2 extensions, a webpage showcasing ggplot2 extensions created by the community.

Finally, we are also preparing a paper for submission to The R Journal. This paper will give a comprehensive walkthrough of the package functions and will also illustrate their potential use on real-world data.
