NEWS
inspectdf 0.0.13
- Fixed compatibility with dplyr >= 1.1.0 by replacing deprecated
functions: select_if() replaced with select(where()), and
mutate_if() replaced with mutate(across(where())).
- Fixed critical bug in plot_cat() where bind_rows(.id = ) with
unnamed lists caused failures in newer dplyr versions. The function now
properly assigns column names to list elements.
- Fixed issue in plot_cat() where filtering by a non-existent jsd
column removed all rows when plotting single dataframe summaries.
- Fixed #48, inspect_num(df1, df2) with different ranges and different
sets of columns. Thanks to cregouby for the #51 fix.
- Fixed #45, partial argument matching warning in format_size().
Changed unit = "auto" to units = "auto" in call to format(). Thanks to
@salim-b for the report.
- Updated CRAN checks badge URL from deprecated
cranchecks.info/badges/ to new badges.cranchecks.info/ service.
- Fixed ggplot2 deprecation warning by replacing size parameter
with linewidth in geom_bar().
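For readers migrating their own code alongside this release, the dplyr replacements named above can be sketched as follows. This is a minimal illustration, not the package's internal code: the data frame `df`, the list `parts` and its element names are hypothetical and exist only for this example.

```r
library(dplyr)

# Hypothetical example data, not taken from inspectdf
df <- data.frame(
  id    = 1:3,
  score = c(0.5, 1.5, 2.5),
  grp   = factor(c("a", "b", "a"))
)

# Older scoped form: select_if(df, is.numeric)
num_cols <- select(df, where(is.numeric))

# Older scoped form: mutate_if(df, is.factor, as.character)
chr_df <- mutate(df, across(where(is.factor), as.character))

# bind_rows(.id = ) labels rows using the list element names,
# so the list is named explicitly before binding
parts <- list(df[1:2, ], df[3, ])
names(parts) <- c("first", "second")
stacked <- bind_rows(parts, .id = "source")
```

Naming the list before calling bind_rows() keeps the .id column predictable regardless of how a given dplyr version treats unnamed input.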
inspectdf 0.0.11 (2021-04-02)
- Bug fixes to inspect_types() for pairwise comparison plots.
- Updated tests for inspect_types() for pairwise comparisons.
inspectdf 0.0.10 (2021-02-20)
- Added include_int option in inspect_cat() to allow treatment of
integer columns as categorical.
- Improved p-values associated with binned categorical and numeric
comparisons. These are now based on a modified chi-squared test and are
labelled as pval in the resulting output.
- Fixed #27, ensuring plots for inspect_cat() respect any filtering or
sorting of the summary output prior to show_plot(). Thanks to Roel
Verbelen for the report.
- Additional detail in inspect_types() comparison of two dataframes to
make it easier to see which columns and types differ.
inspectdf 0.0.9 (2020-09-07)
- Minor change, ensuring all functions use return properly.
inspectdf 0.0.8 (2020-06-25)
- Important change: the show_plot argument has been removed from
all inspect_*() functions. To generate visualisations of data frame
summaries, please use the more flexible show_plot(inspect_*()) or,
via the pipe, inspect_*() %>% show_plot().
- show_plot() improvement that nudges points that might otherwise have
coincided for dataframe comparisons of imbalance (for example, with
inspect_imb(df1, df2) %>% show_plot()).
- Plots for grouped summaries: inspect_cor(), inspect_na() and
inspect_num().
- inspect_cor() slight speed up for dataframes with large numbers of
columns.
- inspect_cor() can be filtered prior to plotting, for example
inspect_cor(starwars) %>% filter(abs(corr) > 0.2) %>% show_plot().
Thanks to Roel Verbelen for the suggestion.
- Fixed bug causing inspect_imb() to fail on certain types of factor
columns. Thanks to Roel Verbelen for the report.
- show_plot() has new arguments label_size, label_angle and
label_color. Each provides adjustments to text annotation where
applicable. Thanks to Bartosz Bursa for the suggestion.
- Changes to text annotation to improve how coord_flip() works on
resulting plots. Thanks to Roel Verbelen for the report.
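The calling pattern described in this release can be sketched as below, using the starwars data that ships with dplyr. This is an illustrative usage sketch based only on the calls quoted in the entries above; the object names `cor_summary` and `p` are hypothetical.

```r
library(dplyr)
library(inspectdf)

# Summarise first, then plot: the show_plot argument is no longer
# accepted by the inspect_*() functions themselves
cor_summary <- inspect_cor(starwars)

# The summary can be filtered before plotting
p <- cor_summary %>%
  filter(abs(corr) > 0.2) %>%
  show_plot()
```

Keeping the summary and the plot as separate steps is what allows the summary to be filtered or sorted before it is drawn.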
inspectdf 0.0.7 (2019-11-05)
- Added bytes column to inspect_mem() output, for downstream numeric
comparison and consistency with inspectpd.
- Added pcnt_nna column to inspect_cor() output containing the
percentage of pairwise complete observations used to calculate
correlations. Thanks to Theo Broekman for the suggestion.
- Fixed bug causing order of grouping variable in grouped inspect_
statements to be incorrect. Thanks to Theo Broekman for the report.
- Removed erroneous print statement from inspect_num().
inspectdf 0.0.6 (2019-09-29)
- Updates to documentation throughout.
- inspect_* functions now return results by group for grouped
dataframes.
- Added option for inspect_num() %>% show_plot() to show histograms
with color palettes specified by the col_palette argument.
- Fixed bug causing inspect_imb() to sometimes fail when factors are
present. Thanks to Doug Friedman for the report.
inspectdf 0.0.5 (2019-08-26)
- Fixed error causing inspect_num() to fail when columns contained all
NA values. Thanks to Ryan Tanner for the report.
- Speed-up of inspect_cor() for large data frames with many numeric
columns.
- Added approximate confidence intervals and tests for
method = 'kendall' and method = 'spearman' in inspect_cor().
inspectdf 0.0.4 (2019-07-27)
- Fix issue causing inspect_na() %>% show_plot() to fail when 0 NA
present. Thanks to Metin Yazici for the report.
- show_plot() now returns a ggplot2 object rather than printing the
plot. Thanks to Garrick Aden-Buie for the suggestion.
- Dramatic speed up of inspect_cat() plotting by avoiding text labels
for small regions.
- Added tech dataset.
- Fix for text annotation of inspect_cat() plots when labels are empty
strings. By default "" will be shown. Thanks to Michael Swenson for the
report.
- inspect_cor(method = ...) argument added, thanks to a suggestion from
George Dontas. Options for pearson, spearman and kendall. Note that
confidence intervals and tests are currently only supported for pearson.
- Fix error when duplicate factor labels present in inspect_cat() &
inspect_imb().
inspectdf 0.0.3 (2019-06-27)
- text_labels autoscale size using ggfittext::geom_fit_text(). For
an example see inspect_cat(). Thanks to David Wilkins for the PR.
- 6 different color palettes supported in show_plot() via the
col_palette argument. Colorblind friendly option specified via
show_plot(col_palette = 1). Thanks to Richard Careaga for the
suggestion.
inspect_imb().
- include_na option for categorical columns that are 100% missing,
or constant are underlined in plot for easier comprehension.
inspect_cor()
- Points and whiskers changed to coloured bands for single dataframe
summaries. These are easier to see when CIs are narrow.
- Points changed to bars for inspect_cor() comparison plots. This makes
it easier to see smaller differences in correlations.
- NA correlations omitted from inspect_cor() comparison when
plotted. Ordering of correlations reversed to be consistent with
returned tibble.
inspectdf 0.0.2 (2019-05-23)
- show_plot() function (show_plot argument in inspect_ functions
will be dropped in a future version).
- high_cardinality argument in show_plot() for combining unique or
near-unique categories for plotting inspect_cat().
- Progress bars shown when processing larger datasets.
- Improvements to plots throughout.
inspectdf 0.0.1 (2019-04-24)