oneAPI-samples

Use uv tool to isolate CI/CD env

Open bopeng1234 opened this issue 10 months ago • 8 comments

Existing Sample Changes

Description

The uv tool is designed to streamline the management of Python environments across multiple test cases. One of its standout features is that it operates without altering the existing Python environment, so each sample's unit test (defined in sample.json) won't affect the primary setup (the conda environment installed by AI Tool). This isolates each sample and avoids environment conflicts and dependency issues.

Add uv tool management for these samples:

1. AI-and-Analytics/End-to-end-Workloads/JobRecommendationSystem
2. AI-and-Analytics/Features-and-Functionality/INC_QuantizationAwareTraining_TextClassification
3. AI-and-Analytics/Features-and-Functionality/Intel_Extension_For_SKLearn_Performance_SVC_Adult
4. AI-and-Analytics/Features-and-Functionality/IntelPython_daal4py_DistributedKMeans
5. AI-and-Analytics/Features-and-Functionality/IntelPython_daal4py_DistributedLinearRegression
6. AI-and-Analytics/Features-and-Functionality/IntelPython_GPU_dpnp_Genetic_Algorithm
7. AI-and-Analytics/Features-and-Functionality/IntelPython_Numpy_Numba_dpnp_kNN
8. AI-and-Analytics/Features-and-Functionality/IntelPython_XGBoost_Performance
9. AI-and-Analytics/Features-and-Functionality/IntelPyTorch_GPU_InferenceOptimization_with_AMP
10. AI-and-Analytics/Features-and-Functionality/IntelPyTorch_TrainingOptimizations_AMX_BF16
11. AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_AMX_BF16_Inference
12. AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_AMX_BF16_Training
13. AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_Enabling_Auto_Mixed_Precision_for_TransferLearning
14. AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_for_LLMs
15. AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_TextGeneration_with_LSTM
16. AI-and-Analytics/Features-and-Functionality/IntelTransformers_Quantization
17. AI-and-Analytics/Getting-Started-Samples/INC-Quantization-Sample-for-PyTorch
18. AI-and-Analytics/Getting-Started-Samples/INC-Sample-for-Tensorflow
19. AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_SKLearn_GettingStarted
20. AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_TensorFlow_GettingStarted
21. AI-and-Analytics/Getting-Started-Samples/IntelPython_daal4py_GettingStarted
22. AI-and-Analytics/Getting-Started-Samples/IntelPython_XGBoost_GettingStarted
23. AI-and-Analytics/Getting-Started-Samples/Modin_GettingStarted
24. AI-and-Analytics/Getting-Started-Samples/Modin_Vs_Pandas

Type of change

Please delete options that are not relevant. Add a 'X' to the one that is applicable.

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [x] Implement fixes for ONSAM Jiras

ONSAM 1917

bopeng1234 • Mar 03 '25 07:03

> Left a few comments mostly around:
>
>   • Activating conda environment
>   • ipykernel as dev dependency
>   • Why is numpy installed separately?
>
> Furthermore, I don't see pyproject.toml and uv.lock files. The reproducibility aspect of UV is totally missed; this PR only uses UV as a pip replacement. Ideally, we want to use UV as a complete package manager.

The reasons why we use AI Tool's Conda environments alongside uv:

  1. Isolation and Reusability: uv allows us to maintain a local venv while also leveraging packages from the Conda environment installed by AI Tool. The two environments remain independent, so uv run can use both local dependencies and Conda-installed packages such as PyTorch and TensorFlow. Additionally, uv sync and uv add do not modify the system environment, which achieves both isolation (each sample's requirements) and package reuse (AI Tool's ipex/itex).

  2. CI Validation of AI Tool + Sample: The purpose of CI (Continuous Integration) is to verify that AI Tool and its associated samples work correctly, so we need to use the environment provided by AI Tool to validate that the samples work well with it.

  3. Avoiding Redundant Installations: Without the AI Tool environment, every sample would need to install PyTorch and TensorFlow (which come from the AI Tool Conda environment) separately. This results in duplicate installations, excessive resource consumption, and significantly longer CI run times.
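The thread does not say exactly how the local venv is made to see the Conda-installed packages. One standard mechanism for this kind of layering is a virtual environment created with access to the base interpreter's site-packages. The sketch below illustrates that idea with Python's standard-library venv module only; it is an assumption about the approach, not the code from this PR. uv's `uv venv` command documents a comparable `--system-site-packages` option, but whether the PR relies on it is not stated here. The helper name `create_overlay_venv` is hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_overlay_venv(target: Path) -> dict[str, str]:
    """Create a venv that can also import packages from the base interpreter.

    Assumption: the base interpreter stands in for the AI Tool Conda
    environment. Returns the key/value pairs written to pyvenv.cfg.
    """
    # system_site_packages=True lets the venv fall back to the base
    # environment's packages; with_pip=False keeps creation fast and offline.
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(target)

    settings: dict[str, str] = {}
    for line in (target / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_overlay_venv(Path(tmp) / ".venv")
        # The flag that controls visibility of the base environment's packages.
        print(cfg["include-system-site-packages"])
```

Packages installed into such a venv stay local to it, while imports that are not satisfied locally can still resolve against the base environment, which matches the isolation-plus-reuse goal described in the list above.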

bopeng1234 • Mar 07 '25 09:03

The ipykernel issue is resolved by removing --dev. The separate numpy installation has also been removed.

bopeng1234 • Mar 07 '25 09:03

This PR isn’t just about using uv as a simple replacement for pip. Instead, it takes advantage of AI Tool’s base image while allowing each sample to create its own minimal local venv. This ensures that every sample remains independent while still reusing the system environment, ultimately enabling CI to validate that AI Tool + sample work correctly.

bopeng1234 • Mar 07 '25 09:03

Thanks for the review, @Ankur-singh. We refactored the code and added pyproject.toml and uv sync.

bopeng1234 • Mar 07 '25 09:03

> This PR isn’t just about using uv as a simple replacement for pip. Instead, it takes advantage of AI Tool’s base image while allowing each sample to create its own minimal local venv. This ensures that every sample remains independent while still reusing the system environment, ultimately enabling CI to validate that AI Tool + sample work correctly.

This makes sense. The PR looks much better now. Thank you so much for all the work.

There is one major question that needs to be answered. The next oneAPI release of AI Tools will not have an offline installer. All the packages will be distributed via either apt-get or pip. This change is quite significant for two reasons:

  1. We won't have conda environments to start with.
  2. UV is good for Python packages but not for OS-level packages.

We will have to give this some thought. I will be happy to hear your ideas.

cc @jimmytwei

Ankur-singh • Mar 07 '25 21:03

Oh, I see! I wasn’t aware that the Conda environment would be removed in the next AI Tool release. If that’s the case, the changes in this PR will be suitable for the current release but could also serve as a reference for the next version.

For each sample, use uv to install the required packages individually, such as PyTorch, TensorFlow, and other dependencies. Similar to this PR, the automatically generated pyproject.toml should be pushed to the repository. In sample.json, sync the packages from pyproject.toml using uv sync to create an isolated environment, and use uv run to execute each sample independently.
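As a rough illustration of the workflow described above, the sketch below builds the two shell steps a sample's CI entry might run: `uv sync` to create the isolated environment from pyproject.toml, then `uv run` to execute the sample. The function names, the `id`/`steps` keys, and the script name `sample.py` are hypothetical placeholders and may not match the actual sample.json layout or the files in this PR.

```python
import json
import shlex


def uv_ci_steps(entry_script: str) -> list[str]:
    """Return shell steps that sync a sample's environment, then run it."""
    return [
        # Install the dependencies declared in pyproject.toml into the
        # sample's own local environment, leaving the system untouched.
        "uv sync",
        # Run the sample inside that environment; shlex.join quotes
        # script names that contain spaces or shell metacharacters.
        shlex.join(["uv", "run", "python", entry_script]),
    ]


def ci_test_entry(test_id: str, entry_script: str) -> dict:
    """Hypothetical CI test entry; the key names are assumptions."""
    return {"id": test_id, "steps": uv_ci_steps(entry_script)}


if __name__ == "__main__":
    print(json.dumps(ci_test_entry("example_sample", "sample.py"), indent=2))
```

Both commands are expected to run from the sample's own directory, next to its pyproject.toml, so that each sample resolves and uses only its own environment.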

bopeng1234 • Mar 10 '25 02:03

I'm not sure if all the required packages for each sample can already be installed via apt or pip. If they can, we can refactor the uv workflow accordingly and remove the Conda-related components from the current AI Tool.

bopeng1234 • Mar 10 '25 02:03

> If that’s the case, the changes in this PR will be suitable for the current release but could also serve as a reference for the next version.

Yes, I think we should be doing it incrementally.

> I'm not sure if all the required packages for each sample can already be installed via apt or pip.

I'm not sure about it either; we will have to wait for the next release. @jimmytwei, do you have any insights on this?

Ankur-singh • Mar 11 '25 15:03