Scale And Observe Your AI/ML Workflows: neptune.ai + Flyte & Union Integration


In the machine learning (ML) and artificial intelligence (AI) domain, managing, tracking, and visualizing model training processes is a significant challenge because of the scale and complexity of the data, models, and resources involved.

Union, an optimized and more performant version of the open-source solution Flyte, provides scalability, declarative infrastructure, and data lineage, allowing AI developers to iterate on and productionize AI or ML workflows quickly.

Neptune is an experiment tracker that allows AI researchers to monitor their model training in real time, visualize and compare experiments, and collaborate on them with a team. Like Union, Neptune excels in scalability, making it a strong tracking solution for teams working on large-scale model training.

The new Neptune Flyte plugin enables you to use Neptune to track, visualize, and manage your models. The plugin automatically logs Flyte’s execution metadata into Neptune and adds a link in Union’s UI to Neptune. In this blog post, you’ll learn how to use the Neptune plugin on Union.

Orchestrate and track your models with Flytekit’s Neptune Plugin

In Union, data and compute are fundamental building blocks for developing all workflows. You can train models using machine learning or AI libraries such as PyTorch Lightning or XGBoost. Union is built on Flyte, which uses declarative orchestration to scale any computation easily.

In this first example, flytekit’s neptune_init_run configures the Neptune run, and the PyTorch Lightning callback automatically tracks the model’s progress. With Flyte’s declarative infrastructure, you set accelerator=A100 to allocate an NVIDIA A100 GPU to run the training task with Lightning. The neptune_init_run decorator initializes a Neptune Run object and stores it in Flyte’s context.

With the plugin, Union’s execution page now has a link that goes directly to Neptune’s web app dashboard:

Union dashboard

The Neptune Run object can be used directly or passed into many of Neptune’s integrations with machine learning libraries. For PyTorch Lightning, you can use Neptune to track metrics during training:


Training metrics tracked in Neptune

Scale to multiple training tasks with dynamic workflows

With Flyte’s dynamic workflows, you can quickly scale up to multiple training tasks, each with its own resources. In this example, you see how to use Flyte’s declarative infrastructure to train various models using XGBoost. As in the previous example, Flyte’s context provides a Neptune Run object, which is passed to Neptune’s XGBoost integration.
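The fan-out idea can be sketched without Flyte: the number of training tasks is not fixed in advance but derived from an input list at run time, and each task gets its own tracker run. The names below (`FakeRun`, `train_one`, `train_many`) and the made-up score are hypothetical placeholders. A real workflow would rely on Flyte’s dynamic workflows and Neptune’s XGBoost integration, and Flyte would schedule the tasks on their own resources rather than run them in a loop.

```python
import itertools
from dataclasses import dataclass, field

# Simple counter used to give every run a distinct identifier.
_run_ids = itertools.count(1)


@dataclass
class FakeRun:
    """Hypothetical stand-in for a tracker run; keeps metadata in memory."""

    run_id: str
    metadata: dict = field(default_factory=dict)


def train_one(max_depth: int) -> FakeRun:
    """One 'training task' with its own run, mirroring one task per model."""
    run = FakeRun(run_id=f"RUN-{next(_run_ids)}")
    run.metadata["params/max_depth"] = max_depth
    # Placeholder instead of real training: a made-up score derived from the input.
    run.metadata["score"] = 1.0 / max_depth
    return run


def train_many(max_depths: list[int]) -> list[FakeRun]:
    """Fan out: the number of tasks is decided by the input at run time."""
    return [train_one(depth) for depth in max_depths]


runs = train_many([2, 4, 8])
for run in runs:
    print(run.run_id, run.metadata)
```

Because every task owns a separate run, each one can carry its own link and metadata, which is what makes the per-task links in the UI possible.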

Neptune’s XGBoost integration will automatically log metadata related to training the XGBoost model.


Training metadata tracked in Neptune

In the Union UI, the workflow dynamically scales out to multiple tasks, each with a link to Neptune:

Union dashboard

Wrapping up

Union’s declarative infrastructure and scalable orchestration platform make it simple to scale up your machine learning or AI workflows and put them in production. With flytekit’s Neptune plugin, you can easily track your experiments, visualize results, and debug your models. Use the plugin by installing it with pip install flytekitplugins-neptune.

To learn more about Union, contact the team at union.ai/demo.

To learn more about Neptune, get in touch with us at neptune.ai/contact-us.
