Metadata-Version: 2.1
Name: confection
Version: 0.1.4
Summary: The sweetest config system for Python
Home-page: https://github.com/explosion/confection
Author: Explosion
Author-email: contact@explosion.ai
License: MIT
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic !=1.8,!=1.8.1,<3.0.0,>=1.7.4
Requires-Dist: srsly <3.0.0,>=2.4.0
Requires-Dist: typing-extensions <4.5.0,>=3.7.4.1 ; python_version < "3.8"

<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>

# Confection: The sweetest config system for Python

`confection` :candy: is a lightweight library that offers a **configuration
system** letting you conveniently describe arbitrary trees of objects.

Configuration is a huge challenge for machine-learning code because you may want
to expose almost any detail of any function as a hyperparameter. The setting you
want to expose might be arbitrarily far down in your call stack, so it might
need to pass all the way through the CLI or REST API, through any number of
intermediate functions, affecting the interface of everything along the way. And
then once those settings are added, they become hard to remove later. Default
values also become hard to change without breaking backwards compatibility.

To solve this problem, `confection` offers a config system that lets you easily
describe arbitrary trees of objects. The objects can be created via function
calls you register using a simple decorator syntax. You can even version the
functions you create, allowing you to make improvements without breaking
backwards compatibility. The most similar config system we’re aware of is
[Gin](https://github.com/google/gin-config), which uses a similar syntax, and
also allows you to link the configuration system to functions in your code using
a decorator. `confection`'s config system is simpler and emphasizes a different
workflow via a subset of Gin’s functionality.

[![tests](https://github.com/explosion/confection/actions/workflows/tests.yml/badge.svg)](https://github.com/explosion/confection/actions/workflows/tests.yml)
[![Current Release Version](https://img.shields.io/github/v/release/explosion/confection.svg?style=flat-square&include_prereleases&logo=github)](https://github.com/explosion/confection/releases)
[![pypi Version](https://img.shields.io/pypi/v/confection.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/confection/)
[![conda Version](https://img.shields.io/conda/vn/conda-forge/confection.svg?style=flat-square&logo=conda-forge&logoColor=white)](https://anaconda.org/conda-forge/confection)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square)](https://github.com/ambv/black)

## ⏳ Installation

```bash
pip install confection
```

```bash
conda install -c conda-forge confection
```

## 👩‍💻 Usage

The configuration system parses a `.cfg` file like

```ini
[training]
patience = 10
dropout = 0.2
use_vectors = false

[training.logging]
level = "INFO"

[nlp]
# This uses the value of training.use_vectors
use_vectors = ${training.use_vectors}
lang = "en"
```

and resolves it to a `Dict`:

```json
{
  "training": {
    "patience": 10,
    "dropout": 0.2,
    "use_vectors": false,
    "logging": {
      "level": "INFO"
    }
  },
  "nlp": {
    "use_vectors": false,
    "lang": "en"
  }
}
```

The config is divided into sections, with the section name in square brackets –
for example, `[training]`. Within the sections, config values can be assigned to
keys using `=`. Values can also be referenced from other sections using the dot
notation and placeholders indicated by the dollar sign and curly braces. For
example, `${training.use_vectors}` will receive the value of `use_vectors` in the
`training` block. This is useful for settings that are shared across components.

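The placeholder mechanism described above can be sketched in plain Python. This is only an illustration of the idea and not `confection`'s implementation: the helpers `lookup` and `resolve_placeholders` are hypothetical names, and the sketch assumes that a placeholder makes up the whole value and does not point at another placeholder.

```python
import re
from typing import Any, Dict, Optional

# Matches a value that consists solely of one ${section.key} placeholder.
PLACEHOLDER = re.compile(r"^\$\{([\w.]+)\}$")


def lookup(config: Dict[str, Any], dotted_path: str) -> Any:
    """Follow a dotted path such as 'training.use_vectors' into nested dicts."""
    node: Any = config
    for part in dotted_path.split("."):
        node = node[part]
    return node


def resolve_placeholders(
    config: Dict[str, Any], root: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return a copy of config with ${...} placeholders replaced by their targets."""
    root = config if root is None else root
    resolved: Dict[str, Any] = {}
    for key, value in config.items():
        match = PLACEHOLDER.match(value) if isinstance(value, str) else None
        if isinstance(value, dict):
            resolved[key] = resolve_placeholders(value, root)
        elif match:
            # Paths are always resolved against the top-level config.
            resolved[key] = lookup(root, match.group(1))
        else:
            resolved[key] = value
    return resolved


raw = {
    "training": {"patience": 10, "use_vectors": False},
    "nlp": {"use_vectors": "${training.use_vectors}", "lang": "en"},
}
resolved = resolve_placeholders(raw)
print(resolved["nlp"])  # {'use_vectors': False, 'lang': 'en'}
```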
The config format has three main differences from Python’s built-in
`configparser`:

1. JSON-formatted values. `confection` passes all values through `json.loads` to
   interpret them. You can use atomic values like strings, floats, integers or
   booleans, or you can use complex objects such as lists or maps.
2. Structured sections. `confection` uses a dot notation to build nested
   sections. If you have a section named `[section.subsection]`, `confection`
   will parse that into a nested structure, placing `subsection` within `section`.
3. References to registry functions. If a key starts with `@`, `confection` will
   interpret the key as the name of a function registry and its value as the
   name of a function registered there, load that function and pass in the rest
   of the block as arguments. If type hints are available on the function, the
   argument values (and return value of the function) will be validated against
   them. This lets you express complex configurations, like a training pipeline
   where `batch_size` is populated by a function that yields floats.

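The first two differences can be illustrated with the standard library alone. The following is a rough sketch, not `confection`'s own parser: `parse_nested` is a hypothetical helper, and the `layers` key is made up to show a list value.

```python
import json
from configparser import ConfigParser
from typing import Any, Dict

CONFIG_TEXT = """
[training]
patience = 10
dropout = 0.2
use_vectors = false
layers = [64, 64]

[training.logging]
level = "INFO"
"""


def parse_nested(text: str) -> Dict[str, Any]:
    """Parse INI text, JSON-decode each value and nest dotted section names."""
    parser = ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys case-sensitive
    parser.read_string(text)
    result: Dict[str, Any] = {}
    for section in parser.sections():
        node = result
        # "training.logging" becomes result["training"]["logging"].
        for part in section.split("."):
            node = node.setdefault(part, {})
        for key, value in parser.items(section):
            # json.loads turns "false" into False, "[64, 64]" into a list, etc.
            node[key] = json.loads(value)
    return result


parsed = parse_nested(CONFIG_TEXT)
print(parsed["training"]["layers"])  # [64, 64]
print(parsed["training"]["logging"])  # {'level': 'INFO'}
```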
There’s no pre-defined scheme you have to follow; how you set up the top-level
sections is up to you. At the end of it, you’ll receive a dictionary with the
values that you can use in your script – whether it’s fully initialized
objects, or just basic settings.

For instance, let’s say you want to define a new optimizer. You'd define its
arguments in `config.cfg` like so:

```ini
[optimizer]
@optimizers = "my_cool_optimizer.v1"
learn_rate = 0.001
gamma = 1e-8
```

To load and parse this configuration using a `catalogue` registry (install
[`catalogue`](https://github.com/explosion/catalogue) separately):

```python
import dataclasses
from typing import Union, Iterable
import catalogue
from confection import registry, Config

# Create a new registry.
registry.optimizers = catalogue.create("confection", "optimizers", entry_points=False)


# Define a dummy optimizer class.
@dataclasses.dataclass
class MyCoolOptimizer:
    learn_rate: float
    gamma: float


@registry.optimizers.register("my_cool_optimizer.v1")
def make_my_optimizer(learn_rate: Union[float, Iterable[float]], gamma: float):
    return MyCoolOptimizer(learn_rate, gamma)


# Load the config file from disk, resolve it and fetch the instantiated optimizer object.
config = Config().from_disk("./config.cfg")
resolved = registry.resolve(config)
optimizer = resolved["optimizer"]  # MyCoolOptimizer(learn_rate=0.001, gamma=1e-08)
```

> ⚠️ Caution: Type checkers such as `mypy` will flag adding new attributes to `registry` this way, i.e.
> `registry.new_attr = ...`, as errors. This is because a new attribute is added to the class after initialization. If
> you are using a type checker, you can either ignore this (e.g. with `# type: ignore` for `mypy`) or use a type-safe
> alternative: instead of `registry.new_attr = ...`, use `setattr(registry, "new_attr", ...)`.

Under the hood, `confection` will look up the `"my_cool_optimizer.v1"` function
in the "optimizers" registry and then call it with the arguments `learn_rate`
and `gamma`. If the function has type annotations, it will also validate the
input. For instance, if `learn_rate` is annotated as a float and the config
defines a string, `confection` will raise an error.

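The lookup-then-call step can be sketched without any dependencies. This is a simplified illustration under stated assumptions: `OPTIMIZERS`, `register` and `resolve_block` are hypothetical stand-ins for a registry, and the `isinstance` check is far cruder than the Pydantic-based validation that `confection` depends on.

```python
import dataclasses
from typing import Any, Callable, Dict, get_type_hints

# Hypothetical stand-in for a function registry: maps names to functions.
OPTIMIZERS: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that stores a function in OPTIMIZERS under the given name."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        OPTIMIZERS[name] = func
        return func
    return decorator


@dataclasses.dataclass
class MyCoolOptimizer:
    learn_rate: float
    gamma: float


@register("my_cool_optimizer.v1")
def make_my_optimizer(learn_rate: float, gamma: float) -> MyCoolOptimizer:
    return MyCoolOptimizer(learn_rate, gamma)


def _matches(value: Any, expected: type) -> bool:
    """Accept ints for float annotations, but never bools."""
    if expected is float:
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, expected)


def resolve_block(block: Dict[str, Any]) -> Any:
    """Look up the function named by the "@optimizers" key and call it."""
    args = dict(block)
    func = OPTIMIZERS[args.pop("@optimizers")]
    hints = get_type_hints(func)
    for name, value in args.items():
        expected = hints.get(name)
        # Only plain classes are checked in this sketch.
        if isinstance(expected, type) and not _matches(value, expected):
            raise TypeError(f"{name!r} should be {expected.__name__}, got {value!r}")
    return func(**args)


block = {"@optimizers": "my_cool_optimizer.v1", "learn_rate": 0.001, "gamma": 1e-8}
optimizer = resolve_block(block)
print(optimizer)  # MyCoolOptimizer(learn_rate=0.001, gamma=1e-08)

try:
    resolve_block({**block, "learn_rate": "fast"})
except TypeError as err:
    print(f"rejected: {err}")
```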
The Thinc documentation offers further information on the configuration system:

- [recursive blocks](https://thinc.ai/docs/usage-config#registry-recursive)
- [defining variable positional arguments](https://thinc.ai/docs/usage-config#registries-args)
- [using interpolation](https://thinc.ai/docs/usage-config#config-interpolation)
- [using custom registries](https://thinc.ai/docs/usage-config#registries-custom)
- [advanced type annotations with Pydantic](https://thinc.ai/docs/usage-config#advanced-types)
- [using base schemas](https://thinc.ai/docs/usage-config#advanced-types-base-schema)
- [filling a configuration with defaults](https://thinc.ai/docs/usage-config#advanced-types-fill-defaults)

## 🎛 API

### <kbd>class</kbd> `Config`

This class holds the model and training
[configuration](https://thinc.ai/docs/usage-config) and can load and save the
INI-style configuration format from/to a string, file or bytes. The `Config`
class is a subclass of `dict` and uses Python’s `ConfigParser` under the hood.

#### <sup><kbd>method</kbd> `Config.__init__`</sup>

Initialize a new `Config` object with optional data.

```python
from confection import Config
config = Config({"training": {"patience": 10, "dropout": 0.2}})
```

| Argument          | Type                                      | Description                                                                                                                                                 |
| ----------------- | ----------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`            | `Optional[Union[Dict[str, Any], Config]]` | Optional data to initialize the config with.                                                                                                                |
| `section_order`   | `Optional[List[str]]`                     | Top-level section names, in order, used to sort the saved and loaded config. All other sections will be sorted alphabetically.                              |
| `is_interpolated` | `Optional[bool]`                          | Whether the config is interpolated or whether it contains variables. Read from the `data` if it’s an instance of `Config` and otherwise defaults to `True`. |

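The ordering rule described for `section_order` can be pictured with a small standalone sketch. It assumes that unlisted sections are placed after the listed ones; `sort_sections` and the section names are hypothetical and only illustrate the rule, not how `Config` sorts internally.

```python
from typing import Any, Dict, List


def sort_sections(data: Dict[str, Any], section_order: List[str]) -> Dict[str, Any]:
    """Order top-level sections: listed names first, the rest alphabetically."""
    rank = {name: index for index, name in enumerate(section_order)}
    # Unlisted sections share the same rank, so their names break the tie.
    ordered = sorted(data, key=lambda name: (rank.get(name, len(rank)), name))
    return {name: data[name] for name in ordered}


data = {"training": {}, "nlp": {}, "corpora": {}, "paths": {}}
sorted_data = sort_sections(data, ["paths", "nlp"])
print(list(sorted_data))  # ['paths', 'nlp', 'corpora', 'training']
```

Keeping the most important sections at the top in a fixed order makes saved config files easier to read and to compare.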
#### <sup><kbd>method</kbd> `Config.from_str`</sup>

Load the config from a string.

```python
from confection import Config

config_str = """
[training]
patience = 10
dropout = 0.2
"""
config = Config().from_str(config_str)
print(config["training"])  # {'patience': 10, 'dropout': 0.2}
```

| Argument      | Type             | Description                                                                                                          |
| ------------- | ---------------- | -------------------------------------------------------------------------------------------------------------------- |
| `text`        | `str`            | The string config to load.                                                                                           |
| `interpolate` | `bool`           | Whether to interpolate variables like `${section.key}`. Defaults to `True`.                                          |
| `overrides`   | `Dict[str, Any]` | Overrides for values and sections. Keys are provided in dot notation, e.g. `"training.dropout"` mapped to the value. |
| **RETURNS**   | `Config`         | The loaded config.                                                                                                   |

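To make the dot-notation keys of `overrides` concrete, here is a minimal sketch of how such keys map onto a nested dictionary. `apply_overrides` is a hypothetical helper written for illustration; the way `confection` applies overrides internally may differ, for example in how it treats sections that don't exist yet.

```python
import copy
from typing import Any, Dict


def apply_overrides(config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with dot-notation overrides applied."""
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        # "training.dropout" -> parents ["training"], leaf "dropout"
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


config = {"training": {"patience": 10, "dropout": 0.2}}
updated = apply_overrides(config, {"training.dropout": 0.5})
print(updated["training"])  # {'patience': 10, 'dropout': 0.5}
print(config["training"])  # {'patience': 10, 'dropout': 0.2}
```

With the real API, the equivalent call would pass the same mapping as the `overrides` argument documented in the table above.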
#### <sup><kbd>method</kbd> `Config.to_str`</sup>

Serialize the config to a string.

```python
from confection import Config

config = Config({"training": {"patience": 10, "dropout": 0.2}})
print(config.to_str())  # '[training]\npatience = 10\ndropout = 0.2'
```

| Argument      | Type   | Description                                                                 |
| ------------- | ------ | --------------------------------------------------------------------------- |
| `interpolate` | `bool` | Whether to interpolate variables like `${section.key}`. Defaults to `True`. |
| **RETURNS**   | `str`  | The string config.                                                          |

#### <sup><kbd>method</kbd> `Config.to_bytes`</sup>

Serialize the config to a byte string.

```python
from confection import Config

config = Config({"training": {"patience": 10, "dropout": 0.2}})
config_bytes = config.to_bytes()
print(config_bytes)  # b'[training]\npatience = 10\ndropout = 0.2'
```

| Argument      | Type             | Description                                                                                                          |
| ------------- | ---------------- | -------------------------------------------------------------------------------------------------------------------- |
| `interpolate` | `bool`           | Whether to interpolate variables like `${section.key}`. Defaults to `True`.                                          |
| `overrides`   | `Dict[str, Any]` | Overrides for values and sections. Keys are provided in dot notation, e.g. `"training.dropout"` mapped to the value. |
| **RETURNS**   | `bytes`          | The serialized config.                                                                                               |

#### <sup><kbd>method</kbd> `Config.from_bytes`</sup>

Load the config from a byte string.

```python
from confection import Config

config = Config({"training": {"patience": 10, "dropout": 0.2}})
config_bytes = config.to_bytes()
new_config = Config().from_bytes(config_bytes)
```

| Argument      | Type     | Description                                                                 |
| ------------- | -------- | --------------------------------------------------------------------------- |
| `bytes_data`  | `bytes`  | The data to load.                                                           |
| `interpolate` | `bool`   | Whether to interpolate variables like `${section.key}`. Defaults to `True`. |
| **RETURNS**   | `Config` | The loaded config.                                                          |

#### <sup><kbd>method</kbd> `Config.to_disk`</sup>

Serialize the config to a file.

```python
from confection import Config

config = Config({"training": {"patience": 10, "dropout": 0.2}})
config.to_disk("./config.cfg")
```

| Argument      | Type               | Description                                                                 |
| ------------- | ------------------ | --------------------------------------------------------------------------- |
| `path`        | `Union[Path, str]` | The file path.                                                              |
| `interpolate` | `bool`             | Whether to interpolate variables like `${section.key}`. Defaults to `True`. |

#### <sup><kbd>method</kbd> `Config.from_disk`</sup>

Load the config from a file.

```python
from confection import Config

config = Config({"training": {"patience": 10, "dropout": 0.2}})
config.to_disk("./config.cfg")
new_config = Config().from_disk("./config.cfg")
```

| Argument      | Type               | Description                                                                                                          |
| ------------- | ------------------ | -------------------------------------------------------------------------------------------------------------------- |
| `path`        | `Union[Path, str]` | The file path.                                                                                                       |
| `interpolate` | `bool`             | Whether to interpolate variables like `${section.key}`. Defaults to `True`.                                          |
| `overrides`   | `Dict[str, Any]`   | Overrides for values and sections. Keys are provided in dot notation, e.g. `"training.dropout"` mapped to the value. |
| **RETURNS**   | `Config`           | The loaded config.                                                                                                   |

#### <sup><kbd>method</kbd> `Config.copy`</sup>

Deep-copy the config.

| Argument    | Type     | Description        |
| ----------- | -------- | ------------------ |
| **RETURNS** | `Config` | The copied config. |

#### <sup><kbd>method</kbd> `Config.interpolate`</sup>

Interpolate variables like `${section.value}` or `${section.subsection}` and
return a copy of the config with interpolated values. Can be used if a config is
loaded with `interpolate=False`, e.g. via `Config.from_str`.

```python
from confection import Config

config_str = """
[hyper_params]
dropout = 0.2

[training]
dropout = ${hyper_params.dropout}
"""
config = Config().from_str(config_str, interpolate=False)
print(config["training"])  # {'dropout': '${hyper_params.dropout}'}
config = config.interpolate()
print(config["training"])  # {'dropout': 0.2}
```

| Argument    | Type     | Description                                    |
| ----------- | -------- | ---------------------------------------------- |
| **RETURNS** | `Config` | A copy of the config with interpolated values. |

#### <sup><kbd>method</kbd> `Config.merge`</sup>

Deep-merge two config objects, using the current config as the default. Only
merges sections and dictionaries and not other values like lists. Values that
are provided in the updates are overwritten in the base config, and any new
values or sections are added. If a config value is a variable like
`${section.key}` (e.g. if the config was loaded with `interpolate=False`), **the
variable is preferred**, even if the updates provide a different value. This
ensures that variable references aren’t destroyed by a merge.

> :warning: Note that blocks that refer to registered functions using the `@`
> syntax are only merged if they are referring to the same functions. Otherwise,
> merging could easily produce invalid configs, since different functions can
> take different arguments. If a block refers to a different function, it’s
> overwritten.

```python
from confection import Config

base_config_str = """
[training]
patience = 10
dropout = 0.2
"""
update_config_str = """
[training]
dropout = 0.1
max_epochs = 2000
"""

base_config = Config().from_str(base_config_str)
update_config = Config().from_str(update_config_str)
merged = Config(base_config).merge(update_config)
print(merged["training"])  # {'patience': 10, 'dropout': 0.1, 'max_epochs': 2000}
```

| Argument    | Type                            | Description                                         |
| ----------- | ------------------------------- | --------------------------------------------------- |
| `overrides` | `Union[Dict[str, Any], Config]` | The updates to merge into the config.               |
| **RETURNS** | `Config`                        | A new config instance containing the merged config. |

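The merge rules above, including the special handling of `@` blocks and variables, can be sketched on plain dictionaries. This is an illustrative approximation and not the library's code: `registry_ref` and `deep_merge` are hypothetical helpers, and `"other_optimizer.v1"` is a made-up function name.

```python
import copy
from typing import Any, Dict, Optional, Tuple


def registry_ref(block: Dict[str, Any]) -> Optional[Tuple[str, Any]]:
    """Return the (key, value) pair of the first "@" key in a block, if any."""
    for key, value in block.items():
        if key.startswith("@"):
            return key, value
    return None


def deep_merge(base: Dict[str, Any], updates: Dict[str, Any]) -> Dict[str, Any]:
    """Merge updates into a copy of base; only dict values are merged recursively."""
    merged = copy.deepcopy(base)
    for key, value in updates.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            if registry_ref(current) != registry_ref(value):
                # Blocks refer to different functions: overwrite, don't mix arguments.
                merged[key] = copy.deepcopy(value)
            else:
                merged[key] = deep_merge(current, value)
        elif isinstance(current, str) and "${" in current:
            # Keep variable references such as ${section.key} intact.
            continue
        else:
            # Scalars and lists are replaced wholesale, never combined.
            merged[key] = copy.deepcopy(value)
    return merged


base = {
    "training": {"patience": 10, "dropout": 0.2},
    "optimizer": {"@optimizers": "my_cool_optimizer.v1", "learn_rate": 0.001},
}
updates = {
    "training": {"dropout": 0.1, "max_epochs": 2000},
    "optimizer": {"@optimizers": "other_optimizer.v1", "beta": 0.9},
}
merged = deep_merge(base, updates)
print(merged["training"])  # {'patience': 10, 'dropout': 0.1, 'max_epochs': 2000}
print(merged["optimizer"])  # {'@optimizers': 'other_optimizer.v1', 'beta': 0.9}
```

Overwriting the whole `optimizer` block avoids passing `learn_rate` to a function that may not accept it, which is the situation the warning describes.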
### Config Attributes

| Name              | Type   | Description                                                                                                                                                              |
| ----------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `is_interpolated` | `bool` | Whether the config values have been interpolated. Defaults to `True` and is set to `False` if a config is loaded with `interpolate=False`, e.g. using `Config.from_str`. |