Metrics¶
The Argenta metrics system provides tools for measuring the performance of key library components. It lets you track performance regressions and improvements between releases and optimize critical code sections.
Running Metrics¶
To work with metrics, you need to clone the repository and install dependencies:
git clone https://github.com/koloideal/Argenta.git
cd Argenta
uv sync --group metrics
Running the metrics system:
python -m metrics
After launch, an interactive session will open with available commands for working with benchmarks.
Available Commands¶
run-all¶
Runs all registered benchmarks and outputs results as tables.
Syntax:
run-all [--without-gc] [--without-system-info]
Flags:
--without-gc: disables the garbage collector during benchmark execution for more stable results
--without-system-info: hides system information in the output
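To illustrate why disabling the garbage collector can stabilize timings, here is a minimal sketch using only the standard library. It is not the code behind --without-gc; the function name time_once and its without_gc parameter are hypothetical and exist only for this example.

```python
import gc
import time
from typing import Callable


def time_once(func: Callable[[], None], without_gc: bool = False) -> float:
    """Return the wall-clock seconds taken by a single call to func.

    Hypothetical helper: it only mirrors the idea behind --without-gc.
    """
    gc_was_enabled = gc.isenabled()
    if without_gc:
        gc.disable()  # keep collector pauses out of the timed region
    try:
        start = time.perf_counter()
        func()
        return time.perf_counter() - start
    finally:
        # Re-enable the collector only if this function turned it off
        if without_gc and gc_was_enabled:
            gc.enable()


elapsed = time_once(lambda: sum(range(10_000)), without_gc=True)
print(f"{elapsed:.6f} s")
```

The try/finally block restores the collector to its previous state even if the measured function raises.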
list-types¶
Displays a list of all available benchmark types with the number of tests in each category.
Syntax:
list-types
Example output:
Available benchmark types:
• flag_validation (9 benchmarks)
• input_command_parse (7 benchmarks)
• finds_appropriate_handler (5 benchmarks)
run-type¶
Runs benchmarks of a specific type.
Syntax:
run-type --type <type_name> [--without-gc] [--without-system-info]
Flags:
--type: benchmark type to run (required)
--without-gc: disables the garbage collector
--without-system-info: hides system information
diagrams-generate¶
Generates visual performance comparison diagrams for all benchmarks.
Syntax:
diagrams-generate [--iterations <number>] [--without-gc]
Flags:
--iterations: number of iterations for each benchmark (default 100)
--without-gc: disables the garbage collector
Diagrams are saved to the metrics/reports/diagrams/<timestamp>/ directory.
release-generate¶
Generates a complete performance report for the current library version. Used when preparing releases.
Syntax:
release-generate
The command automatically:
- Determines the current library version
- Runs all benchmarks with 1000 iterations and the garbage collector disabled
- Generates JSON reports and comparison diagrams
- Saves results to the metrics/reports/releases/<version>/ directory
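The last step, saving a versioned JSON report, might look roughly like the sketch below. It is an illustration under assumptions, not the actual release-generate implementation: the function save_release_report, the file name report.json, and the shape of the results dictionary are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_release_report(
    root: Path, version: str, results: dict[str, dict[str, float]]
) -> Path:
    """Write results as JSON under <root>/releases/<version>/ and return the path."""
    release_dir = root / "releases" / version
    release_dir.mkdir(parents=True, exist_ok=True)
    report_path = release_dir / "report.json"  # hypothetical file name
    report_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return report_path


# Usage with a temporary directory standing in for metrics/reports
with tempfile.TemporaryDirectory() as tmp:
    example_results = {"benchmark_my_operation": {"mean": 0.0012, "median": 0.0011}}
    path = save_release_report(Path(tmp), "1.0.0", example_results)
    print(path.relative_to(tmp))
```

Keeping one directory per version makes it straightforward to compare reports between releases.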
Interpreting Results¶
Benchmark results include the following metrics:
- Mean time (mean)
Average operation execution time. The primary metric for performance comparison.
- Median (median)
The middle value of the measured execution times. Less sensitive to outliers than the mean.
- Standard deviation (std)
Shows how much the measurements vary between iterations. A lower value means more stable, predictable performance.
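The three metrics can be reproduced with Python's statistics module. The sketch below is illustrative only: summarize_samples is a hypothetical helper, and whether Argenta reports the sample or the population standard deviation is not stated here, so the use of statistics.stdev (sample standard deviation) is an assumption.

```python
import statistics


def summarize_samples(samples: list[float]) -> dict[str, float]:
    """Compute mean, median and sample standard deviation of timing samples."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        # stdev needs at least two samples; report 0.0 for a single one
        "std": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


# Three similar timings and one outlier (values in milliseconds)
summary = summarize_samples([1.0, 2.0, 3.0, 10.0])
print(summary["mean"])    # 4.0
print(summary["median"])  # 2.5
```

The single outlier pulls the mean up to 4.0 while the median stays at 2.5, which is why the median is worth checking when the standard deviation is high.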
Usage Recommendations¶
- For optimization
Use run-type to focus on a specific area and --without-gc for more accurate measurements.
- For visualization
The diagrams-generate command creates clear charts suitable for presentations and documentation.
- For stable results
Close resource-intensive applications, use the --without-gc flag, and increase the number of iterations via --iterations.
Adding New Benchmarks¶
You can implement your own benchmarks to test specific library units. New benchmarks are added via the @benchmarks.register decorator:
from metrics.benchmarks.entity import benchmarks

@benchmarks.register(
    type_="my_category",
    description="Description of what is being measured"
)
def benchmark_my_operation() -> None:
    # Code whose performance is being measured
    pass
Important
The benchmark must be imported in metrics/benchmarks/__init__.py for automatic registration.
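The import requirement follows from how decorator-based registration generally works: the decorator runs when the module is imported, so a module that is never imported never registers its benchmarks. The sketch below shows the general pattern with a self-contained, hypothetical BenchmarkRegistry class. It is not Argenta's implementation; only the register(type_=..., description=...) call shape is taken from the example above, and the benchmark functions are placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable

BenchmarkFunc = Callable[[], None]


@dataclass
class Benchmark:
    func: BenchmarkFunc
    type_: str
    description: str


@dataclass
class BenchmarkRegistry:
    """Hypothetical registry that collects benchmarks as they are decorated."""

    items: list[Benchmark] = field(default_factory=list)

    def register(
        self, type_: str, description: str
    ) -> Callable[[BenchmarkFunc], BenchmarkFunc]:
        def decorator(func: BenchmarkFunc) -> BenchmarkFunc:
            # Runs at import time, when Python evaluates the decorated def
            self.items.append(Benchmark(func, type_, description))
            return func  # the function itself is left unchanged

        return decorator

    def count_by_type(self) -> dict[str, int]:
        counts: dict[str, int] = {}
        for item in self.items:
            counts[item.type_] = counts.get(item.type_, 0) + 1
        return counts


registry = BenchmarkRegistry()


@registry.register(type_="flag_validation", description="Placeholder benchmark A")
def benchmark_a() -> None:
    pass


@registry.register(type_="flag_validation", description="Placeholder benchmark B")
def benchmark_b() -> None:
    pass


@registry.register(type_="input_command_parse", description="Placeholder benchmark C")
def benchmark_c() -> None:
    pass


print(registry.count_by_type())
# {'flag_validation': 2, 'input_command_parse': 1}
```

A per-type count like this is the kind of information a command such as list-types would need in order to print its summary.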