# Visualizing Custom Column Distributions

Displays the **percentage distribution of any categorical attribute** added through the `add_custom_column()` method. This visualization allows quick inspection of how auxiliary attributes such as *department*, *clinic type*, or *care pathway* are distributed across the simulated patient registry.

---

## Function Overview

**Function:**
`medscheduler.utils.plotting.plot_custom_column_distribution(df, column, ...)`

**Inputs:**
- `df (pd.DataFrame)`: DataFrame containing the custom categorical column (usually `patients_df` or `appointments_df`).
- `column (str)`: Name of the custom column to visualize.
- `title_prefix (str, optional)`: Prefix for the chart title (default: `"Percentage of Patients by"`).
- `top_n (int, optional)`: Limits display to the top *N* categories (useful for long-tail distributions).
- `base_height_per_category (float, optional)`: Vertical space allocated per category (default: `0.7`).
- `min_height`, `max_height (float, optional)`: Bounds for automatic figure height adjustment.

**Returns:**
`matplotlib.axes.Axes`: Horizontal bar chart representing the percentage of records per category.

**Validation & error handling:**
- Missing column → `_empty_plot("Column '' not found in DataFrame.")`
- Empty or invalid column → `_empty_plot("No valid data found in column ''.")`

---

## Output Description

- **Y-axis:** Category labels displayed **top to bottom**, preserving their natural or defined order.
- **X-axis:** Percentage of total records (`%`).
- **Bars:** Represent each category’s share in the dataset.
- **Dynamic height:** The plot automatically scales to fit the number of categories.
- **Labels:** Each bar shows its percentage value (`xx.x%`).
- **Color scheme:** Primary Medscheduler blue (`#67A7D4`).
- **Style:** Horizontal bar layout with dashed X-grid, no top/right spines, and a left-aligned descriptive title.
- **Order preservation:** If the column is of type `Categorical(ordered=True)`, the bars appear in the exact order of category definition (e.g., `"Primary" → "Specialist" → "Diagnostics" → "Follow-up"`).

This plot is ideal for examining **auxiliary attributes** such as departmental allocation, referral source, or care type. It helps verify that custom sampling distributions (e.g., *uniform*, *pareto*, *normal*) behave as expected.

---

## Example

```python
from medscheduler import AppointmentScheduler
from medscheduler.utils.plotting import plot_custom_column_distribution

# Generate synthetic dataset
sched = AppointmentScheduler(seed=99)
sched.generate()

# Add a custom categorical column representing clinic departments
sched.add_custom_column(
    column_name="department",
    categories=["Primary Care", "Cardiology", "Orthopedics", "Radiology", "Rehabilitation"],
    distribution_type="uniform"
)

# Plot the percentage of patients by department
ax = plot_custom_column_distribution(sched.patients_df, column="department")
ax.figure.show()  # optional when running interactively
```

**Output preview:**
The following chart displays the relative frequency of patient records across clinical departments. Each bar corresponds to one department and keeps the configured order. With `distribution_type="uniform"`, the bars should be of roughly equal length.

![Custom Column Distribution](../_static/visuals/visualization/plot_custom_column_distribution.png)

---

## Next Steps

- Review how custom attributes integrate into {doc}`../api-reference/patients_table`.
- Learn about random sampling and distribution shapes: {doc}`../api-reference/randomness_and_noise`.
- Combine with demographic and flow parameters for advanced simulations: {doc}`../examples/custom_columns_examples`.
- Explore other visualization utilities in {doc}`../visualization/index`.
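
To cross-check the chart numerically, the percentages it reports can be approximated directly with pandas. The following is a minimal sketch and not Medscheduler's implementation: `percentage_by_category` is a hypothetical helper, and it assumes that percentages are computed over all non-null records, that unused categories appear with a share of zero, and that `top_n` keeps the largest categories without reordering them.

```python
import pandas as pd


def percentage_by_category(df, column, top_n=None):
    """Hypothetical helper: percentage of non-null records per category."""
    if column not in df.columns:
        raise KeyError(f"Column '{column}' not found in DataFrame.")

    values = df[column].dropna()
    if values.empty:
        raise ValueError(f"No valid data found in column '{column}'.")

    # Categorical columns keep their declared category order;
    # other columns fall back to order of first appearance.
    if isinstance(values.dtype, pd.CategoricalDtype):
        order = list(values.cat.categories)
    else:
        order = list(values.unique())

    # Count on plain objects so unused categories can be re-added as zeros.
    counts = values.astype(object).value_counts().reindex(order, fill_value=0)
    shares = counts / counts.sum() * 100

    if top_n is not None:
        # Assumption: keep the top_n largest shares without reordering them.
        keep = shares.nlargest(top_n).index
        shares = shares[shares.index.isin(keep)]

    return shares.round(1)


# Small demo with an ordered categorical column
care = pd.Categorical(
    ["Specialist", "Primary", "Primary", "Follow-up"],
    categories=["Primary", "Specialist", "Diagnostics", "Follow-up"],
    ordered=True,
)
demo = pd.DataFrame({"care_type": care})
print(percentage_by_category(demo, "care_type"))
```

Comparing these values with the bar labels is a quick way to confirm that a configured distribution produced the expected shares.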