plotsdatabase 1.5

Breaking Changes

  • query_plots() now returns a list by default instead of a flat data frame
    • Output is automatically structured based on inventory method using the new output styles system
    • Different styles organize data into separate tables: metadata, individuals, censuses, height-diameter, etc.
    • Action required: To preserve old behavior (flat data frame), use output_style = "full"
    • Rationale: Structured output makes it easier to work with complex plot data without overwhelming column counts
    • See ?query_plots for details on available output styles

New Features

  • Configurable output styles system for query_plots()
    • 6 predefined output styles: minimal, standard, permanent_plot, permanent_plot_multi_census, transect, full
    • Auto-detection of appropriate style based on method field (e.g., “1 ha plot” → permanent_plot)
    • Manual style selection via output_style parameter
    • Each style returns a structured list with relevant tables (e.g., $metadata, $individuals, $censuses)
    • Column renaming from database names to user-friendly names (e.g., ddlat → latitude, tax_sp_level → species)
    • New configuration files: R/output_styles_config.R, R/output_styles_helpers.R
  • Specialized output tables for permanent plots
    • $censuses table: plot_name, census_number, census_date, team_leader, principal_investigator
    • $height_diameter table: Paired height-diameter measurements (id_n, D, H, POM) with issue filtering
    • Handles multiple censuses with automatic pivoting from wide to long format
    • Census-specific column renaming (e.g., stem_diameter_census_1 → dbh_census_1)
  • Custom print method for query results
    • New S3 class plot_query_list with informative print method
    • Shows table dimensions, column names, and geometry type for sf objects
    • Makes it easy to understand query result structure
  • Preservation of spatial data
    • coordinates_sf table automatically included when show_all_coordinates = TRUE
    • Print method detects and displays sf geometry information
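
The custom print method described above can be pictured with a small base R sketch. This is not the package's implementation: the constructor name `new_plot_query_list()` and the exact print layout are assumptions made for illustration, and the sketch reports only the geometry column name of an sf table (read from its `sf_column` attribute) rather than the geometry type.

```r
# Hypothetical constructor: tag a named list of tables with the S3 class.
new_plot_query_list <- function(tables) {
  stopifnot(is.list(tables), !is.null(names(tables)))
  structure(tables, class = c("plot_query_list", "list"))
}

# Print method: one entry per table with its dimensions and column names.
print.plot_query_list <- function(x, ...) {
  cat("<plot_query_list> with", length(x), "table(s)\n")
  for (nm in names(x)) {
    tbl <- x[[nm]]
    cat(sprintf("  $%s: %d rows x %d cols\n", nm, nrow(tbl), ncol(tbl)))
    cat("    columns:", paste(names(tbl), collapse = ", "), "\n")
    # sf objects store the name of their geometry column in this attribute.
    if (inherits(tbl, "sf")) {
      cat("    geometry column:", attr(tbl, "sf_column"), "\n")
    }
  }
  invisible(x)
}

# Toy data standing in for a query result.
res <- new_plot_query_list(list(
  metadata    = data.frame(plot_name = c("P1", "P2"), latitude = c(0.5, 1.2)),
  individuals = data.frame(plot_name = c("P1", "P1", "P2"), dbh = c(12.1, 30.4, 18.0))
))
print(res)
```

Returning `invisible(x)` from the print method follows the usual S3 convention, so the object can still be piped or assigned after printing.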

Code Refactoring

  • Modular output style configuration
    • Centralized style definitions in .plot_output_styles list
    • Method-to-style mapping in .method_to_style_map
    • Style auto-detection function .detect_style_from_method()
    • Easy to add new output styles by extending configuration
  • Improved metadata extraction
    • Uses the res_meta_data table (created before individual extraction) as the metadata source
    • Ensures all plot-level columns are available even when extract_individuals = TRUE
    • Consistent variable naming and error handling
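
As a rough illustration of the method-to-style mapping and auto-detection listed above, the following sketch uses a named character vector as the lookup table. Only the "1 ha plot" → permanent_plot pair comes from this changelog. The "transect" label, the function name `detect_style()`, and the choice to fall back to "standard" for unknown or mixed methods are assumptions, and the real `.method_to_style_map` and `.detect_style_from_method()` may behave differently.

```r
# Hypothetical lookup table: method label -> output style.
method_to_style <- c(
  "1 ha plot" = "permanent_plot",  # pair taken from the changelog
  "transect"  = "transect"         # assumed label, for illustration only
)

# Pick a style when all methods agree; otherwise use the default.
detect_style <- function(methods, map = method_to_style, default = "standard") {
  methods <- unique(tolower(trimws(methods[!is.na(methods)])))
  styles <- unique(unname(map[methods]))
  styles <- styles[!is.na(styles)]  # drop methods missing from the map
  if (length(styles) == 1) styles else default
}

detect_style(c("1 ha plot", "1 ha plot"))
detect_style(c("1 ha plot", "transect"))  # mixed methods fall back to the default
detect_style("unknown method")
```

Keeping the mapping in a plain data structure means a new style only needs a new entry in the table, which matches the extensibility goal stated above.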

Bug Fixes

  • Fixed commented @export tag causing roxygen2 errors
    • Removed @export from commented-out subplot_list() function in R/subsplots_features_function.R
    • Prevents documentation build failures

plotsdatabase 1.4 (development version)

New Features

  • Traits enrichment module in taxonomic matching Shiny app
    • New tab “Enrich with Traits” allows enriching matched taxonomic names with trait data from the taxa database
    • Aggregates multiple input names that match to the same taxon into a single row
    • Concatenates all input names (e.g., “cola edulis | coula edrulis” → “Coula edulis”)
    • Configurable options for categorical trait aggregation (mode vs concatenation)
    • User can select which columns to include (original names, corrected names, IDs, metadata)
    • Downloads enriched data as Excel file
    • Filters out id_trait_measures columns for cleaner output
    • Module: mod_traits_enrichment_ui() and mod_traits_enrichment_server()
  • Enhanced file upload in taxonomic matching Shiny app
    • CSV file support added (in addition to Excel .xlsx and .xls)
    • Excel sheet selector allows choosing which sheet to import from multi-sheet workbooks
    • Sheet selector appears dynamically after Excel file upload
    • Default sheet selection is the first sheet
    • CSV files are loaded directly without sheet selection

Bug Fixes

  • Fixed NA input names appearing in trait enrichment
    • Enrichment module now filters out rows where the input taxonomic name is NA or empty
    • Prevents invalid NA entries from being matched to taxa or included in enriched output
    • Applied in both trait fetching and result aggregation steps
  • Fixed incorrect input names in enrichment output
    • Enrichment now correctly uses the user-selected taxonomic name column (not first column of dataset)
    • column_name parameter now passed from main app to enrichment module
    • Ensures input_names column shows actual taxonomic names from the selected column

Code Refactoring

  • Optimized taxonomic name cleaning for faster matching
    • Name cleaning (removing “sp.”, “cf.”, “aff.”, etc.) now happens before batch exact matching
    • Previously, cleaning only occurred during slow fuzzy matching phase
    • Names like “Coula edulis sp.” now match exactly to “Coula edulis” in fast batch step
    • Significantly reduces number of names sent to slower fuzzy matching
    • Cleaning happens once at beginning, benefiting all matching steps (species, genus, family)
    • Both original and cleaned names preserved in matching pipeline
    • Added underscore replacement in clean_taxonomic_name() (e.g., “Coula_edulis” → “Coula edulis”)
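
A minimal sketch of the clean-then-match order described above is given below. It is not the body of `clean_taxonomic_name()`: the function name `clean_name_sketch()`, the regular expression, and the exact set of qualifiers removed are assumptions. The point is only that cleaning happens once, before the exact lookup, and that the original names are kept alongside the cleaned ones.

```r
# Hypothetical cleaner: replace underscores, strip standalone qualifiers
# ("sp.", "spp.", "cf.", "aff."), then normalise whitespace.
clean_name_sketch <- function(x) {
  x <- gsub("_", " ", x, fixed = TRUE)
  x <- gsub("(^|\\s)(sp|spp|cf|aff)\\.?(?=\\s|$)", " ", x,
            perl = TRUE, ignore.case = TRUE)
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

reference <- c("Coula edulis", "Garcinia kola")
input <- c("Coula edulis sp.", "Coula_edulis", "Garcinia cola")

# Keep both original and cleaned names, then try the exact match.
pipeline <- data.frame(
  original = input,
  cleaned  = clean_name_sketch(input),
  stringsAsFactors = FALSE
)
pipeline$exact <- reference[match(pipeline$cleaned, reference)]

# Only rows with no exact match would go on to fuzzy matching.
needs_fuzzy <- pipeline[is.na(pipeline$exact), ]
print(pipeline)
```

In this toy input, two of the three names resolve in the exact step, so a single name would remain for the slower fuzzy phase.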

Breaking Changes

  • query_taxa() default behavior changed: exact_match parameter now defaults to TRUE (was FALSE)
    • Exact matching is now the default for family/genus/order queries to prevent unexpected fuzzy matching results
    • For species queries, if exact match fails, the function automatically falls back to intelligent fuzzy matching
    • Action required: Code relying on fuzzy matching by default should explicitly set exact_match = FALSE
    • Rationale: Higher taxonomic ranks are standardized names where fuzzy matching rarely helps and can introduce errors

New Features

  • Intelligent taxonomic name matching with genus-constrained fuzzy search
  • Auto fuzzy fallback for species queries
    • query_taxa() automatically retries with fuzzy matching when exact species match fails
    • Transparent user feedback shows match quality (similarity score)
    • Handles typos and spelling variations automatically
    • Only applies to species queries; family/genus/order use exact matching only
  • Database enhancement: tax_level field added to table_taxa
    • New column explicitly indicates taxonomic level: “species”, “genus”, “family”, “order”, “infraspecific”, “higher”
    • Indexed for query performance
    • Eliminates ambiguity between missing data and genus/family-level taxa
    • Script provided: add_tax_level_field.R for database migration
    • All query functions updated to use new field for cleaner, more reliable filtering
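
The exact-first, fuzzy-fallback behaviour for species queries can be approximated with the sketch below. It is an interpretation, not the package's code: it assumes "genus-constrained" means that fuzzy candidates are limited to reference names sharing the input's genus, it uses base R's `utils::adist()` (Levenshtein distance) rather than RecordLinkage, and the function name `match_species_sketch()` and the 0.8 similarity threshold are hypothetical.

```r
reference <- c("Coula edulis", "Garcinia kola", "Garcinia mannii")

# Try an exact match first; otherwise search only within the same genus.
match_species_sketch <- function(name, reference, min_similarity = 0.8) {
  result <- function(match, similarity, method) {
    data.frame(input = name, match = match, similarity = similarity,
               method = method, stringsAsFactors = FALSE)
  }
  if (name %in% reference) return(result(name, 1, "exact"))

  # Genus = first word of the name; candidates must share it exactly.
  genus <- sub(" .*$", "", name)
  candidates <- reference[sub(" .*$", "", reference) == genus]
  if (length(candidates) == 0) return(result(NA_character_, NA_real_, "none"))

  # Convert edit distance to a 0-1 similarity score.
  distances <- as.vector(utils::adist(name, candidates))
  similarity <- 1 - distances / pmax(nchar(name), nchar(candidates))
  best <- which.max(similarity)
  if (similarity[best] < min_similarity) {
    return(result(NA_character_, NA_real_, "none"))
  }
  result(candidates[best], similarity[best], "fuzzy")
}

match_species_sketch("Garcinia kola", reference)   # exact
match_species_sketch("Garcinia kolla", reference)  # typo, resolved by fallback
match_species_sketch("Unknownia foo", reference)   # no candidate genus
```

Returning the similarity score alongside the match mirrors the "transparent user feedback" item above: the caller can see whether a result came from the exact or the fuzzy step.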

Code Refactoring

  • Complete rewrite of query_taxa() to use new intelligent matching functions
    • Eliminated redundancy with helpers.R functions
    • 8 new modular helper functions replace complex inline logic
    • Cleaner separation of concerns: matching, filtering, synonym resolution, formatting, trait addition
    • ~160 lines of code removed through better abstraction
    • Better maintainability and extensibility
    • Deprecated query_fuzzy_match() and query_exact_match() in favor of match_taxonomic_names()
  • Simplified taxonomic level filtering using tax_level field
    • Replaced complex multi-column checks (e.g., is.na(tax_esp) & is.na(tax_gen)) with simple tax_level == "family"
    • Applied in query_taxa() for clearer intent and better performance via index usage
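
To make the filtering change concrete, the sketch below applies both forms to a toy stand-in for `table_taxa`. The `taxon` column and the data are hypothetical. The `tax_gen`, `tax_esp`, and `tax_level` names come from the notes above.

```r
# Toy stand-in for table_taxa; the `taxon` column is hypothetical.
taxa <- data.frame(
  taxon     = c("Olacaceae", "Coula", "Coula edulis"),
  tax_gen   = c(NA, "Coula", "Coula"),
  tax_esp   = c(NA, NA, "edulis"),
  tax_level = c("family", "genus", "species"),
  stringsAsFactors = FALSE
)

# Old style: infer "family level" from which rank columns are empty.
old_filter <- taxa[is.na(taxa$tax_esp) & is.na(taxa$tax_gen), ]

# New style: read the level directly from the dedicated column.
new_filter <- taxa[taxa$tax_level == "family", ]

identical(old_filter, new_filter)
```

On this toy data the two filters select the same row. The explicit column has the added benefit that a record with genuinely missing rank data is no longer mistaken for a higher-rank taxon.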

Bug Fixes

  • Fixed query_taxa() empty results with only_family = TRUE
    • Previously, fuzzy matching by default caused empty results when filtering for family-level taxa
    • Now uses exact matching by default for higher taxonomic ranks

Dependencies

  • Added new package dependencies to DESCRIPTION:
    • cli - User-friendly command line interfaces (moved from Suggests to Imports)
    • lifecycle - Manage function lifecycle (deprecation warnings)
    • data.table - High-performance data manipulation
    • glue - String interpolation for SQL queries
    • RecordLinkage - String similarity calculations

plotsdatabase 1.0

Breaking Changes

  • Database schema change: Renamed column ind_num_sous_plot to tag in data_individuals and followup_updates_individuals tables
    • All R package functions updated to use new column name
    • Action required: External scripts accessing ind_num_sous_plot must be updated to use tag
    • Updated files: R/functions_manip_db.R, R/individual_features_function.R, R/functions_divid_plot.R, R/generate_plot_summary.Rmd, structure.yml
    • Default parameter in approximate_isolated_xy() changed from tag = "ind_num_sous_plot" to tag = "tag"
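
For external scripts that still read data exported before the rename, one possible stopgap is a small helper that renames the old column when it is present. The helper name `rename_tag_column()` and the toy data are hypothetical. Only the column names `ind_num_sous_plot` and `tag` come from the change described above.

```r
# Hypothetical helper: rename the legacy column unless `tag` already exists.
rename_tag_column <- function(df) {
  if ("ind_num_sous_plot" %in% names(df) && !"tag" %in% names(df)) {
    names(df)[names(df) == "ind_num_sous_plot"] <- "tag"
  }
  df
}

# Toy export using the old column name.
legacy <- data.frame(ind_num_sous_plot = c(101, 102), dbh = c(12.5, 40.2))
migrated <- rename_tag_column(legacy)
names(migrated)
```

Calling the helper on data that already uses `tag` returns it unchanged, so it is safe to apply unconditionally during a transition period.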

New Features

  • Initial release of package structure with comprehensive database query functions
  • Enhanced update_ident_specimens(): Now shows summary of linked individuals before updating specimen identification
    • Displays which plots and how many individuals will inherit the new identification
    • Shows current taxonomic identification of linked individuals
    • Provides better context for informed decision-making before confirmation
    • New helper function .get_linked_individuals_summary() queries and summarizes impact

Bug Fixes

  • Connection error with complex home paths: Fixed create_db_config() function that failed when home directory path contained spaces or special characters (e.g., OneDrive paths like C:/Users/NOBUS CAPITAL/OneDrive/Documents/)
    • Added proper error handling with tryCatch() for file creation
    • Creates parent directories if they don’t exist
    • Falls back to in-memory configuration if file cannot be written
    • Users now get informative warnings instead of connection failures
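
The error-handling pattern listed above (create parent directories, wrap the write in `tryCatch()`, fall back to an in-memory configuration with a warning) might look roughly like this. It is a sketch under assumptions, not the body of `create_db_config()`: the function name `write_config_sketch()`, the RDS file format, and the shape of the returned list are all hypothetical.

```r
# Hypothetical writer: persist the config if possible, otherwise keep it in memory.
write_config_sketch <- function(config, path) {
  tryCatch({
    # Create missing parent directories; paths containing spaces are fine here.
    dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
    saveRDS(config, path)
    list(config = config, path = path, persisted = TRUE)
  }, error = function(e) {
    warning("Could not write config to '", path, "': ", conditionMessage(e),
            "; using in-memory configuration.", call. = FALSE)
    list(config = config, path = NULL, persisted = FALSE)
  })
}

# A path with spaces, similar in shape to the OneDrive example above.
cfg <- list(host = "localhost", dbname = "plots")
target <- file.path(tempdir(), "dir with spaces", "nested dir", "config.rds")
res <- write_config_sketch(cfg, target)
res$persisted
```

Because the fallback returns the same list structure in both cases, calling code can continue with `res$config` whether or not the file was written.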

Documentation

  • Added comprehensive README.md with package overview, quick start guide, and function reference
  • README includes prominent link to NEWS.md for tracking updates

Infrastructure

  • Added NEWS.md to track package changes and updates
  • Established git branching workflow for all code modifications

Code Refactoring

  • Major refactoring: Reorganized R/functions_manip_db.R (previously 10,528 lines) into modular, domain-specific files
    • Created R/growth_census_functions.R (556 lines) - Growth computation and census analysis functions
    • Created R/specimen_linking_functions.R (406 lines) - Herbarium specimen linking and querying functions
    • Created R/taxonomic_query_functions.R (944 lines) - Taxonomic query functions with synonym resolution
    • Created R/taxonomic_update_functions.R (838 lines) - Taxonomic data update and entry functions
    • Expanded R/connections_db.R with database query utilities (func_try_fetch, try_open_postgres_table)
    • Removed ~6,800 lines from R/functions_manip_db.R through extraction to specialized modules
    • All functions verified as moved (not duplicated) to new locations
    • Improved code maintainability and discoverability