Project

kedro.framework.project

The kedro.framework.project module provides utilities to configure a Kedro project and access its settings.

Function             Description
configure_logging    Configure logging according to the logging_config dictionary.
configure_project    Configure a Kedro project by populating its settings with values defined in settings.py and pipeline_registry.py.
find_pipelines       Automatically find modular pipelines having a create_pipeline function.
validate_settings    Eagerly validate that the settings module is importable if it exists.

kedro.framework.project.configure_logging

configure_logging(logging_config)

Configure logging according to the logging_config dictionary.

Source code in kedro/framework/project/__init__.py
def configure_logging(logging_config: dict[str, Any]) -> None:
    """Configure logging according to ``logging_config`` dictionary."""
    LOGGING.configure(logging_config)
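
As a minimal sketch of what a logging_config dictionary might look like, the example below assumes the dictionary follows the standard library's logging.config.dictConfig schema. The formatter and handler names ("simple", "console") are hypothetical and chosen only for illustration. To stay runnable without Kedro installed, the sketch applies the dictionary with the standard library directly; in a Kedro project the same dictionary would be passed to configure_logging.

```python
import logging
import logging.config

# Hypothetical configuration; the formatter and handler names are illustrative.
logging_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        }
    },
    "loggers": {"kedro": {"level": "INFO"}},
    "root": {"handlers": ["console"], "level": "WARNING"},
}

# Applied with the standard library so that the sketch runs on its own.
# Assumption: in a Kedro project you would instead call
# kedro.framework.project.configure_logging(logging_config).
logging.config.dictConfig(logging_config)
logging.getLogger("kedro").info("logging configured")
```

Setting disable_existing_loggers to False keeps loggers that were created before the call active, which is usually what is wanted when libraries create their loggers at import time.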

kedro.framework.project.configure_project

configure_project(package_name)

Configure a Kedro project by populating its settings with values defined in the user's settings.py and pipeline_registry.py.

Source code in kedro/framework/project/__init__.py
def configure_project(package_name: str) -> None:
    """Configure a Kedro project by populating its settings with values
    defined in user's settings.py and pipeline_registry.py.
    """
    settings_module = f"{package_name}.settings"
    settings.configure(settings_module)

    pipelines_module = f"{package_name}.pipeline_registry"
    pipelines.configure(pipelines_module)

    # Once the project is successfully configured once, store PACKAGE_NAME as a
    # global variable to make it easily accessible. This is used by validate_settings()
    # below, and also by ParallelRunner on Windows, as package_name is required every
    # time a new subprocess is spawned.
    global PACKAGE_NAME  # noqa: PLW0603
    PACKAGE_NAME = package_name

    if PACKAGE_NAME:
        LOGGING.set_project_logging(PACKAGE_NAME)

kedro.framework.project.find_pipelines

find_pipelines(raise_errors=False)

Automatically find modular pipelines having a create_pipeline function. By default, projects created using Kedro 0.18.3 and higher call this function to autoregister pipelines upon creation/addition.

Projects that require more fine-grained control can still define the pipeline registry without calling this function. Alternatively, they can modify the mapping generated by the find_pipelines function.

For more information on the pipeline registry and autodiscovery, see https://docs.kedro.org/en/stable/nodes_and_pipelines/pipeline_registry.html

Parameters:

  • raise_errors (bool, default: False) –

    If True, raise an error upon failed discovery.

Returns:

  • dict[str, Pipeline]

    A generated mapping from pipeline names to Pipeline objects.

Raises:

  • ImportError

    When a module does not expose a create_pipeline function, the create_pipeline function does not return a Pipeline object, or if the module import fails up front. If raise_errors is False, see Warns section instead.

Warns:

  • UserWarning

    When a module does not expose a create_pipeline function, the create_pipeline function does not return a Pipeline object, or if the module import fails up front. If raise_errors is True, see Raises section instead.

Source code in kedro/framework/project/__init__.py
def find_pipelines(raise_errors: bool = False) -> dict[str, Pipeline]:  # noqa: PLR0912
    """Automatically find modular pipelines having a ``create_pipeline``
    function. By default, projects created using Kedro 0.18.3 and higher
    call this function to autoregister pipelines upon creation/addition.

    Projects that require more fine-grained control can still define the
    pipeline registry without calling this function. Alternatively, they
    can modify the mapping generated by the ``find_pipelines`` function.

    For more information on the pipeline registry and autodiscovery, see
    https://docs.kedro.org/en/stable/nodes_and_pipelines/pipeline_registry.html

    Args:
        raise_errors: If ``True``, raise an error upon failed discovery.

    Returns:
        A generated mapping from pipeline names to ``Pipeline`` objects.

    Raises:
        ImportError: When a module does not expose a ``create_pipeline``
            function, the ``create_pipeline`` function does not return a
            ``Pipeline`` object, or if the module import fails up front.
            If ``raise_errors`` is ``False``, see Warns section instead.

    Warns:
        UserWarning: When a module does not expose a ``create_pipeline``
            function, the ``create_pipeline`` function does not return a
            ``Pipeline`` object, or if the module import fails up front.
            If ``raise_errors`` is ``True``, see Raises section instead.
    """
    pipeline_obj = None

    # Handle the simplified project structure found in several starters.
    pipeline_module_name = f"{PACKAGE_NAME}.pipeline"
    try:
        pipeline_module = importlib.import_module(pipeline_module_name)
    except Exception as exc:
        if str(exc) != f"No module named '{pipeline_module_name}'":
            if raise_errors:
                raise ImportError(
                    f"An error occurred while importing the "
                    f"'{pipeline_module_name}' module."
                ) from exc

            warnings.warn(
                IMPORT_ERROR_MESSAGE.format(
                    module=pipeline_module_name, tb_exc=traceback.format_exc()
                )
            )
    else:
        pipeline_obj = _create_pipeline(pipeline_module)

    pipelines_dict = {"__default__": pipeline_obj or pipeline([])}

    # Handle the case that a project doesn't have a pipelines directory.
    try:
        pipelines_package = importlib.resources.files(f"{PACKAGE_NAME}.pipelines")  # type: ignore[attr-defined]
    except ModuleNotFoundError as exc:
        if str(exc) == f"No module named '{PACKAGE_NAME}.pipelines'":
            return pipelines_dict

    for pipeline_dir in pipelines_package.iterdir():
        if not pipeline_dir.is_dir():
            continue

        pipeline_name = pipeline_dir.name
        if pipeline_name == "__pycache__":
            continue
        # Prevent imports of hidden directories/files
        if pipeline_name.startswith("."):
            continue

        pipeline_module_name = f"{PACKAGE_NAME}.pipelines.{pipeline_name}"
        try:
            pipeline_module = importlib.import_module(pipeline_module_name)
        except Exception as exc:
            if raise_errors:
                raise ImportError(
                    f"An error occurred while importing the "
                    f"'{pipeline_module_name}' module."
                ) from exc

            warnings.warn(
                IMPORT_ERROR_MESSAGE.format(
                    module=pipeline_module_name, tb_exc=traceback.format_exc()
                )
            )
            continue

        pipeline_obj = _create_pipeline(pipeline_module)
        if pipeline_obj is not None:
            pipelines_dict[pipeline_name] = pipeline_obj
    return pipelines_dict
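
The discovery loop in the source above can be illustrated with a simplified, standard-library-only stand-in. This is a sketch under stated assumptions, not Kedro's implementation: the function name discover, the package name demo_project, and the use of plain lists in place of Pipeline objects are all hypothetical, and the check that create_pipeline returns a Pipeline is omitted.

```python
import importlib
import sys
import tempfile
import warnings
from pathlib import Path


def discover(package_name: str, raise_errors: bool = False) -> dict[str, object]:
    """Collect create_pipeline() results from each sub-package of
    ``<package_name>.pipelines``. Hypothetical stand-in, not Kedro's code."""
    found: dict[str, object] = {}
    pipelines_pkg = importlib.import_module(f"{package_name}.pipelines")
    for path in sorted(Path(pipelines_pkg.__file__).parent.iterdir()):
        # Skip plain files, bytecode caches and hidden directories.
        if not path.is_dir() or path.name == "__pycache__" or path.name.startswith("."):
            continue
        module_name = f"{package_name}.pipelines.{path.name}"
        try:
            module = importlib.import_module(module_name)
        except Exception as exc:
            if raise_errors:
                raise ImportError(
                    f"An error occurred while importing the '{module_name}' module."
                ) from exc
            warnings.warn(f"Could not import '{module_name}': {exc!r}")
            continue
        factory = getattr(module, "create_pipeline", None)
        if callable(factory):
            found[path.name] = factory()
    return found


# Build a throwaway package with one working and one broken pipeline module.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_project" / "pipelines"
for name, body in {
    "cleaning": "def create_pipeline():\n    return ['clean']\n",
    "broken": "raise RuntimeError('boom')\n",
}.items():
    (pkg / name).mkdir(parents=True)
    (pkg / name / "__init__.py").write_text(body)
(pkg / "__init__.py").write_text("")
(pkg.parent / "__init__.py").write_text("")
sys.path.insert(0, str(root))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = discover("demo_project")

print(result)
print([str(w.message) for w in caught])
```

With raise_errors left at False, the broken module produces a warning and is skipped, while the working module is still registered. Calling discover("demo_project", raise_errors=True) on the same package raises ImportError instead, which mirrors the Raises and Warns behaviour documented for find_pipelines.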

kedro.framework.project.validate_settings

validate_settings()

Eagerly validate that the settings module is importable if it exists. This is desirable to surface any syntax or import errors early. In particular, without eagerly importing the settings module, dynaconf would silence any import error (e.g. a missing dependency or a missing/mislabelled pipeline), and users would instead get the cryptic error message "Expected an instance of `ConfigLoader`, got `NoneType` instead". More info on the dynaconf issue: https://github.com/dynaconf/dynaconf/issues/460

Source code in kedro/framework/project/__init__.py
def validate_settings() -> None:
    """Eagerly validate that the settings module is importable if it exists. This is desirable to
    surface any syntax or import errors early. In particular, without eagerly importing
    the settings module, dynaconf would silence any import error (e.g. missing
    dependency, missing/mislabelled pipeline), and users would instead get a cryptic
    error message ``Expected an instance of `ConfigLoader`, got `NoneType` instead``.
    More info on the dynaconf issue: https://github.com/dynaconf/dynaconf/issues/460
    """
    if PACKAGE_NAME is None:
        raise ValueError(
            "Package name not found. Make sure you have configured the project using "
            "'bootstrap_project'. This should happen automatically if you are using "
            "Kedro command line interface."
        )
    # Check if file exists, if it does, validate it.
    if importlib.util.find_spec(f"{PACKAGE_NAME}.settings") is not None:
        importlib.import_module(f"{PACKAGE_NAME}.settings")
    else:
        logger = logging.getLogger(__name__)
        logger.warning("No 'settings.py' found, defaults will be used.")
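
The "check that the module exists, then import it eagerly" pattern used above can be tried in isolation. The following is a minimal sketch using only the standard library; the helper settings_status and the package names demo_no_settings and demo_bad_settings are hypothetical and exist only for this illustration.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def settings_status(package_name: str) -> str:
    """Eagerly import ``<package_name>.settings`` if it exists (hypothetical helper)."""
    if importlib.util.find_spec(f"{package_name}.settings") is None:
        return "missing"
    # Any syntax or import error inside settings.py surfaces on this line.
    importlib.import_module(f"{package_name}.settings")
    return "ok"


# Create two throwaway packages: one without settings.py, and one whose
# settings.py imports a dependency that is not installed.
root = Path(tempfile.mkdtemp())
for pkg, settings_body in {
    "demo_no_settings": None,
    "demo_bad_settings": "import a_dependency_that_is_not_installed\n",
}.items():
    (root / pkg).mkdir()
    (root / pkg / "__init__.py").write_text("")
    if settings_body is not None:
        (root / pkg / "settings.py").write_text(settings_body)
sys.path.insert(0, str(root))

print(settings_status("demo_no_settings"))
try:
    settings_status("demo_bad_settings")
except ModuleNotFoundError as exc:
    print(f"surfaced early: {exc}")
```

Because find_spec locates the settings module without executing it, a missing settings.py can be distinguished from a settings.py that exists but fails on import. Only the latter raises, and it does so with the original error rather than a downstream symptom.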