Runtime benchmarking & prediction
Benchmarks measuring plot / table runtime across dataset sizes and predictor mixes, plus a packaged regression model and lightweight wrapper predict_process_time() that returns an estimated runtime (ms) with uncertainty bounds. Benchmark outputs and the fitted model are stored as .rds assets for package use (#94)
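As a rough illustration of the general idea (not the packaged model), the sketch below fits a regression to benchmark timings, stores it as an .rds file, reloads it and predicts a runtime with bounds. The timings are simulated, and the names `bench`, `fit` and `predict_ms()` are hypothetical; the real predict_process_time() may use different predictors and a different model form.

```r
# Simulated benchmark results (illustrative only, not real package timings)
set.seed(1)
bench <- data.frame(n_rows = rep(c(1e3, 1e4, 1e5, 1e6), each = 5))
bench$time_ms <- 50 + 0.002 * bench$n_rows + rnorm(nrow(bench), sd = 20)

# Fit a simple linear model and store it as an .rds asset
fit <- lm(time_ms ~ n_rows, data = bench)
path <- tempfile(fileext = ".rds")
saveRDS(fit, path)

# Hypothetical wrapper: reload the model and return an estimate with bounds (ms)
predict_ms <- function(n_rows, model_path = path, level = 0.95) {
  model <- readRDS(model_path)
  stats::predict(
    model,
    newdata = data.frame(n_rows = n_rows),
    interval = "prediction",
    level = level
  )
}

# Returns a one-row matrix with columns fit, lwr and upr
est <- predict_ms(5e5)
est
```

Shipping the fitted model rather than the raw benchmarks means no benchmarking has to happen at prediction time.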
Confirm before long runs
double_check_confint_fast_estimate() prompts users before running slower confidence-interval computations when the predicted runtime exceeds the default threshold (60s). This reduces unexpected long operations in interactive sessions (#94)
Background processing with spinner
Long-running work now runs in background R processes (callr::r_bg()), while the main process shows a {cli} spinner. Helpers include get_summary_table_with_spinner(), check_assumptions_with_spinner(), and user_spinner(). Spinners are suppressed in tests / CI (#96) (#97)
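The pattern can be sketched as follows, assuming {callr} and {cli} are installed. The background task here is a stand-in (a short sleep and a sum), not one of the package's helpers, and the polling interval is an arbitrary choice.

```r
library(callr)
library(cli)

# Start the heavy work in a separate R process (stand-in task: sleep, then sum)
bg <- callr::r_bg(
  function(x) {
    Sys.sleep(1)
    sum(x)
  },
  args = list(x = 1:10)
)

# Animate a spinner in the main process, but only in interactive sessions
spinner <- cli::make_spinner()
while (bg$is_alive()) {
  if (interactive()) spinner$spin()
  Sys.sleep(0.1)
}
if (interactive()) spinner$finish()

# Collect the result; an error raised in the background process is re-thrown here
result <- bg$get_result()
result
```

Guarding the spinner with interactive() is one simple way to keep test and CI logs clean.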
Influential-observation diagnostics
New assumption_no_extreme_values() detects influential observations using Cook's distance, leverage and standardised residuals with adaptive thresholds and conservative flagging. Test-data generators get_df_influential() and get_lr_influential() added (#98)
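The three diagnostics are available in base R, as the sketch below shows on simulated data. The cut-offs (4/n, 2p/n and |r| > 2) are conventional rules of thumb, and requiring at least two criteria is an assumption made here to illustrate conservative flagging; neither is necessarily what assumption_no_extreme_values() uses.

```r
set.seed(42)
df <- data.frame(x = rnorm(100))
df$y <- rbinom(100, 1, plogis(0.5 * df$x))

# Inject one extreme predictor value with a surprising outcome
df$x[1] <- 8
df$y[1] <- 0

model <- glm(y ~ x, data = df, family = binomial)
n <- nrow(df)
p <- length(coef(model))

infl <- data.frame(
  cooks     = cooks.distance(model),
  leverage  = hatvalues(model),
  std_resid = rstandard(model)
)

# Rule-of-thumb cut-offs (assumed for illustration only)
n_criteria <- (infl$cooks > 4 / n) +
  (infl$leverage > 2 * p / n) +
  (abs(infl$std_resid) > 2)

# Conservative flagging: require at least two of the three criteria
infl$flag <- n_criteria >= 2
which(infl$flag)
```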
Reproducible README figures
Added the script tools/generate_readme_figures.R, refreshed the figure assets in man/figures and added Suggests dependencies for their reproduction (here, webshot2) so README builds are reproducible (#100)
Runtime prediction integrated into UI
plot_or() and table_or() call predict_process_time() and prompt via the double-check helper when long runs are predicted. Benchmark code moved to data-raw/; serialised assets are shipped for app use (no re-benchmarking at runtime) (#94)
Safe background wrappers
Wrapper functions run heavy work in background processes and return identical results while surfacing errors / warnings. Spinner display is controlled by use_spinner() (#96)
Namespace and tidy-eval hardening
Explicit qualifications (utils::stack(), stats::predict()), use of .data$ pronoun and improved Roxygen for use_spinner() reduce R CMD check notes and improve compatibility with dplyr v1.2+. callr added to Imports (#97)
Stable transformations
Replaced log1p() with asinh() for continuous-predictor transforms to handle negative values and improve numerical stability (#98)
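A short base R comparison shows why the swap helps: log1p() is undefined below -1, whereas asinh() is defined for every real number and behaves like a logarithm for large magnitudes. The input vector is illustrative.

```r
x <- c(-100, -1, 0, 1, 100)

# log1p(x) = log(1 + x): NaN below -1 and -Inf at exactly -1
log1p_values <- suppressWarnings(log1p(x))
log1p_values

# asinh() is finite for all real inputs and symmetric around zero
asinh_values <- asinh(x)
asinh_values

# asinh(x) = log(x + sqrt(x^2 + 1)), so it grows like log(2 * x) for large x
all.equal(asinh(100), log(100 + sqrt(100^2 + 1)))
```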
README rewrite
README content simplified and focused; figures and generation workflow updated for reproducibility (#100)
Separation detection
Fixed tidy-eval bugs, zero-count detection, empty-summary guards and NA-to-TRUE/FALSE coercion issues in assumption_no_separation_fast() so separation is reliably detected (#96)
Spinner / test stability
Spinners are disabled in non-interactive / test environments; minor test-data tweaks improve CI reliability (#97)
Assumption checks robustness
Improved numerical stability, error handling and reporting across assumption checks to avoid spurious results (#96)
R CMD check issues
Resolved undefined global variables and unqualified generic calls to clean up check output (#97)
callr is now required (imported) for background processing (#97)
Interactive users will see a spinner during long operations; spinners are suppressed in CI / tests (#96)
No breaking API changes expected; function signatures remain compatible
Support for ordered-factor predictors
Formula parsing and model-summary utilities now recognise and correctly handle ordered factors (#91)
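The sketch below, on simulated data, shows why ordered factors need special handling: by default R fits them with polynomial contrasts, so the model terms are named after contrasts rather than factor levels. The variable names are illustrative.

```r
set.seed(1)
df <- data.frame(
  dose = factor(
    sample(c("low", "medium", "high"), 200, replace = TRUE),
    levels = c("low", "medium", "high"),
    ordered = TRUE
  )
)
df$outcome <- rbinom(200, 1, plogis(-0.5 + 0.5 * as.integer(df$dose)))

model <- glm(outcome ~ dose, data = df, family = binomial)

# Polynomial contrasts give terms named by contrast
# (.L = linear, .Q = quadratic), not by level
term_names <- names(coef(model))
term_names
# "(Intercept)" "dose.L" "dose.Q"
```

Any utility that matches model terms back to factor levels has to account for this naming.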
New helper
make_ordered_factors_compatible_with_broom() aligns term names and labels with broom::tidy() (#91)
Tests and examples
Tests, example data and models covering ordered-factor predictors, including a truncated example that verifies the sample-size assumption warning (#91)
Revert to gtExtras for CI mini-plots
Replaced hard-coded copies with gtExtras code for the confidence-interval plots produced by table_or(). The package now requires gtExtras >= 0.6.1, which reduces duplication and picks up upstream fixes (#90)
Bump ggplot2 requirement
Update dependency to ggplot2 >= 4.0.0 (#90)
Refined predictor-type detection
Refactored to return more detailed classes (e.g. "ordered factor", "factor", "numeric") for improved downstream handling and displays (#91)
Version bump
Bumped package version to 0.9.0 to support a GitHub release (#93)
Dependency updates required
Users must have gtExtras >= 0.6.1 and ggplot2 >= 4.0.0 installed; older versions may cause plot failures (#90)
No external API changes
Plotting and table functions should behave as before while preserving ordered-factor level ordering. (#90)(#91)
Univariable Odds Ratios
Added functionality to produce univariable odds ratios alongside the odds ratios from the multivariable model in plot_or() and table_or(). This allows direct comparison of each predictor in isolation and as part of the full model.
(#59)
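The comparison can be reproduced by hand in base R, as sketched below on simulated data: one model is fitted per predictor and its odds ratio is set beside the one from the full model. The data and column names are illustrative, and this is not how plot_or() or table_or() are implemented.

```r
set.seed(123)
n <- 500
df <- data.frame(age = rnorm(n, mean = 50, sd = 10))
df$smoker <- rbinom(n, 1, plogis((df$age - 50) / 10))
df$outcome <- rbinom(
  n, 1, plogis(-1 + 0.03 * (df$age - 50) + 0.8 * df$smoker)
)

# Odds ratios from the full model (intercept dropped)
full <- glm(outcome ~ age + smoker, data = df, family = binomial)
full_or <- exp(coef(full))[-1]

# Odds ratios from one single-predictor model per term
predictors <- c("age", "smoker")
uni_or <- vapply(predictors, function(p) {
  fit <- glm(reformulate(p, response = "outcome"), data = df, family = binomial)
  unname(exp(coef(fit))[2])
}, numeric(1))

cbind(univariable = uni_or, full_model = full_or)
```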
Assumption Checks Control
Both plot_or() and table_or() now include a parameter to enable or disable assumption checks, giving experienced analysts more control over feedback from these functions.
(#75)
Check for Linearity
Continuous predictors are now checked for linearity in relation to the log-odds of the outcome, improving model diagnostics.
(#21)
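One common way to test this assumption is a Box-Tidwell style check, sketched below on simulated data: an x * log(x) term is added to the model, and a small p-value for that term suggests the relationship is not linear on the log-odds scale. This is an assumed approach for illustration; the changelog does not say which method the package uses, and the check requires strictly positive predictor values.

```r
set.seed(7)
n <- 1000
df <- data.frame(x = runif(n, min = 1, max = 10))

# Simulate an outcome that is linear in x on the log-odds scale
df$y <- rbinom(n, 1, plogis(-2 + 0.4 * df$x))

# Add the x * log(x) term and extract the p-value of its coefficient
bt_model <- glm(y ~ x + I(x * log(x)), data = df, family = binomial)
p_value <- coef(summary(bt_model))["I(x * log(x))", "Pr(>|z|)"]
p_value
```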
Privacy Options
A new option in table_or() allows suppression of low counts (below a user-defined threshold) and rounding of other counts, to help protect sensitive data.
(#58)
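The idea can be sketched with a small hypothetical helper. The function name `suppress_counts()`, its arguments and its defaults are invented for illustration and do not reflect the table_or() interface.

```r
# Hypothetical helper: hide counts below `threshold` and round the
# remaining counts to the nearest multiple of `round_to`
suppress_counts <- function(counts, threshold = 5, round_to = 5) {
  rounded <- round(counts / round_to) * round_to
  out <- as.character(rounded)
  out[counts < threshold] <- paste0("<", threshold)
  out
}

suppress_counts(c(2, 7, 13, 248))
# "<5" "5" "15" "250"
```

Rounding the counts that are shown makes it harder to recover a suppressed value by subtracting from a total.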
Contextual Guidance for Model Assumptions
Now, when assumption checks fail in plot_or() or table_or(), users are prompted to run check_or() for detailed diagnostics (unless it has just been used).
(#76)
Performance of Confidence Interval Calculation
Improved processing speed when confint_fast_estimate is used on large datasets by optimising the assumption check functions.
(#57)
Sample Size Check with Complete Separation
Resolved errors in assumption_sample_size() when the model exhibits complete separation by better handling of NA values in outcome counts.
(#73)
Documentation Correction
Documentation for anonymise_count_values() has been corrected: this internal function no longer appears in the help files.
(#84)
Test suite (#68)
Addressed issues where snapshot tests failed depending on the installed
version of ggplot2. Snapshots produced by different ggplot2 versions were
causing test failures, especially with vdiffr::expect_doppelganger().
The temporary solution was to suspend these visual comparison tests to avoid unnecessary failures for users not on the latest ggplot2.
Readiness for upcoming {ggplot2} release (#65)
Investigated and resolved failures with the upcoming major release of ggplot2, ensuring that the package’s examples, vignettes, and tests remain compatible.
New function check_or() (#62)
Added an exported function, check_or(), to provide users with detailed
feedback on whether their logistic regression models meet underlying
assumptions. Previously, detailed diagnostics were only accessible via
undocumented internal functions.
Assumptions: check for sample size (#41)
Introduced a check for sufficient sample size, further improving diagnostics for logistic regression models.
These improvements make the package more robust to upstream changes in dependencies and offer users more transparent and accessible model validation tools.
Summary OR tables (#28)
Introduced summary tables for odds ratios, making it easier to view and interpret results from your model.
Faster estimates of confidence intervals (#53)
Added an optional argument, confint_fast_estimate, to both plot_or() and
table_or() that allows faster approximation of confidence intervals
using stats::confint.default(). This can be helpful for large data sets,
where confidence intervals can take a long time to calculate.
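The sketch below shows what stats::confint.default() computes for a logistic regression fitted to simulated data: Wald intervals, taken as the estimate plus or minus a normal quantile times the standard error, which need no refitting of the model. The data are illustrative.

```r
set.seed(99)
df <- data.frame(x = rnorm(300))
df$y <- rbinom(300, 1, plogis(0.3 + 0.7 * df$x))
model <- glm(y ~ x, data = df, family = binomial)

# Wald intervals on the log-odds scale
wald_ci <- stats::confint.default(model, level = 0.95)

# The same limits computed by hand from the standard errors
se <- sqrt(diag(vcov(model)))
manual_ci <- cbind(
  coef(model) - qnorm(0.975) * se,
  coef(model) + qnorm(0.975) * se
)

# Exponentiate to express the estimates and limits as odds ratios
or_ci <- exp(cbind(OR = coef(model), wald_ci))
or_ci
```

Profile-likelihood intervals refit the model repeatedly, which is why they become slow on large data sets; the Wald approximation trades some accuracy for speed.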
Improved validation of the confidence level (#29)
Improved how the package checks user input for confidence levels, reducing the risk of invalid values being used.
This included stricter checks in the internal function
validate_conf_level_inputs(), with better error handling and user feedback
(#31).
Assumption checks
Started a suite of checks that assumptions for logistic regression are upheld. Implemented in this release:
Assumptions: check outcome is binary (#42)
Added logic to confirm the outcome variable is binary, as required for odds ratio calculations.
Assumptions: check for multicollinearity (#43)
Implemented checks to detect multicollinearity among predictors, helping users identify and address issues that could affect model validity.
Assumptions: check for separation (#47)
Added checks for separation in the data, which can cause estimation problems in logistic regression.
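The three checks can each be approximated in a few lines of base R, as sketched below on simulated data built to trigger the multicollinearity and separation problems. These are simplified stand-ins for illustration, not the package's internal implementations.

```r
set.seed(2024)
n <- 200
df <- data.frame(x1 = rnorm(n))
df$x2 <- df$x1 + rnorm(n, sd = 0.1)   # nearly collinear with x1
df$group <- sample(c("a", "b", "c"), n, replace = TRUE)
df$y <- rbinom(n, 1, plogis(df$x1))
df$y[df$group == "c"] <- 1            # group "c" never has y == 0

# 1. Binary outcome: exactly two distinct non-missing values
is_binary <- length(unique(stats::na.omit(df$y))) == 2

# 2. Multicollinearity: variance inflation factor, VIF = 1 / (1 - R^2),
#    where R^2 comes from regressing x1 on the other predictor
r_squared <- summary(lm(x1 ~ x2, data = df))$r.squared
vif_x1 <- 1 / (1 - r_squared)

# 3. Separation: a zero cell in the predictor-by-outcome table means
#    one level of the predictor perfectly predicts the outcome
counts <- table(df$group, df$y)
has_zero_cell <- any(counts == 0)

list(is_binary = is_binary, vif_x1 = vif_x1, has_zero_cell = has_zero_cell)
```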
Updated README
Improved the README documentation, making it easier for users to get started and understand the package.
Test suite (Developer focus) (#33, #37)
Added and developed a suite of tests for ensuring code reliability and maintaining quality as the package evolves.
Bug fixes
Addressed and resolved warnings related to the {tidyselect} package, leading to cleaner output and better compatibility with the tidyverse ecosystem. (#34)
Updated the way class descriptions are handled, consolidating them into single strings for consistency and clarity. (#50)
Fixed ordering of terms and levels in table_or(), so results are presented
in a logical and expected sequence. (#54, #56)
For the full details, see the changelog: https://github.com/craig-parylo/plotor/compare/v0.5.2...v0.6.0
plot_or() now respects the order of covariates in the formula when plotting
(#15).
plot_or() handles missing information to avoid {ggplot2} related warning
messages (#11).
plot_or() accepts a customised confidence level, e.g. 99%, used when
calculating the confidence intervals (#19).
plot_or() conducts checks on inputs, ensuring the glm() model is
a logistic regression (family = 'binomial' and link = 'logit') and validating that
the confidence level is within the range 0.001 to 0.999 (#22, #19).
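The input checks described above can be sketched as follows. `validate_inputs()` and its error messages are hypothetical and written for illustration; they are not the package's own validation code.

```r
# Hypothetical validator mirroring the checks described above
validate_inputs <- function(model, conf_level = 0.95) {
  if (!inherits(model, "glm")) {
    stop("`model` must be a glm object.")
  }
  fam <- stats::family(model)
  if (fam$family != "binomial" || fam$link != "logit") {
    stop("`model` must use family = 'binomial' with link = 'logit'.")
  }
  if (!is.numeric(conf_level) || length(conf_level) != 1 ||
      conf_level < 0.001 || conf_level > 0.999) {
    stop("`conf_level` must be a single number between 0.001 and 0.999.")
  }
  invisible(TRUE)
}

logit_model <- glm(am ~ wt, data = mtcars, family = binomial)
probit_model <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

validate_inputs(logit_model, conf_level = 0.99)            # passes
try(validate_inputs(probit_model), silent = TRUE)          # error: wrong link
try(validate_inputs(logit_model, conf_level = 99), silent = TRUE)  # error: out of range
```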