feature/#397 scenario duplication #2373

toan-quach · 2024-12-27T07:16:59Z

What type of PR is this? (check all applicable)

Description

Related Tickets & Documents

#397

github-actions · 2024-12-27T07:29:31Z

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines	Covered	Coverage	Threshold	Status
19683	17148	87%	0%	🟢

New Files

No new covered files...

Modified Files

File	Coverage	Status
taipy/core/data/_data_manager.py	98%	🟢
taipy/core/data/_file_datanode_mixin.py	98%	🟢
taipy/core/data/csv.py	98%	🟢
taipy/core/data/data_node.py	97%	🟢
taipy/core/data/excel.py	84%	🟢
taipy/core/data/json.py	97%	🟢
taipy/core/data/parquet.py	97%	🟢
taipy/core/data/pickle.py	100%	🟢
taipy/core/scenario/_scenario_manager.py	97%	🟢
taipy/core/task/_task_manager.py	97%	🟢
taipy/core/task/task.py	100%	🟢
TOTAL	97%	🟢

updated for commit: 190f6a6 by action🐍

tests/core/scenario/test_scenario_manager.py

jrobinAV · 2025-01-15T13:18:00Z

taipy/core/scenario/_scenario_manager.py

+        cloned_scenario_id = cloned_scenario._new_id(cloned_scenario.config_id)
+        cloned_scenario.id = cloned_scenario_id


Suggested change

-        cloned_scenario_id = cloned_scenario._new_id(cloned_scenario.config_id)
-        cloned_scenario.id = cloned_scenario_id
+        cloned_scenario.id = cloned_scenario._new_id(cloned_scenario.config_id)

jrobinAV · 2025-01-15T13:28:32Z

taipy/core/data/_data_manager.py

We should implement a can_duplicate method, returning reasons, just like we have a can_create method.
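A minimal sketch of what such a check could look like. The `Reasons` class is a hypothetical stand-in for whatever reason-collection type `can_create` returns; the `can_duplicate` signature, the `known_entities` mapping, and the message text are assumptions for illustration, not the project's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reasons:
    """Hypothetical reason collection: truthy when nothing blocks the action."""

    messages: List[str] = field(default_factory=list)

    def add(self, message: str) -> "Reasons":
        self.messages.append(message)
        return self

    def __bool__(self) -> bool:
        # No recorded reason means the action is allowed.
        return not self.messages


def can_duplicate(entity_id: str, known_entities: Dict[str, object]) -> Reasons:
    """Return an empty (truthy) Reasons if the entity can be duplicated."""
    reasons = Reasons()
    if entity_id not in known_entities:
        reasons.add(f"Entity '{entity_id}' does not exist in the repository.")
    return reasons
```

A caller can then write `if can_duplicate(...)` and, on failure, report `reasons.messages` to the user.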

jrobinAV · 2025-01-15T14:12:59Z

taipy/core/data/_data_manager.py

+        cloned_dn = cls._get(dn)
+
+        cloned_dn.id = cloned_dn._new_id(cloned_dn._config_id)
+        cloned_dn._owner_id = cls._get_owner_id(cloned_dn._scope, cycle_id, scenario_id)
+        cloned_dn._parent_ids = set()
+
+        cls._set(cloned_dn)
+
+        cloned_dn._clone_data()


If the data node has a cycle scope and the new scenario is in the same cycle, we want to share the same data node between the scenarios.
If the scope is global, the data node always exists already, and we also want to share the existing one.
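The rule described in this comment could be sketched as a small decision function. The `Scope` enum below is a simplified, hypothetical stand-in for the data node scopes, and `should_share` and its parameters are illustrative names only.

```python
from enum import Enum
from typing import Optional


class Scope(Enum):
    """Hypothetical stand-in for the data node scopes discussed above."""

    SCENARIO = 1
    CYCLE = 2
    GLOBAL = 3


def should_share(scope: Scope, source_cycle_id: Optional[str], target_cycle_id: Optional[str]) -> bool:
    """Return True when the existing data node should be reused instead of duplicated."""
    if scope is Scope.GLOBAL:
        # A global data node already exists and is shared by every scenario.
        return True
    if scope is Scope.CYCLE:
        # Share only when both scenarios belong to the same cycle.
        return source_cycle_id is not None and source_cycle_id == target_cycle_id
    # Scenario-scoped data nodes are always duplicated.
    return False
```

Under this sketch, only a `False` result would lead to a new id and a copied data file.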

jrobinAV · 2025-01-15T14:18:59Z

taipy/core/data/_file_datanode_mixin.py

+        if os.path.exists(self.path):
+            folder_path, base_name = os.path.split(self.path)
+            new_base_path = os.path.join(folder_path, f"TAIPY_CLONE_{id}_{base_name}")
+            if os.path.isdir(self.path):
+                shutil.copytree(self.path, new_base_path)
+            else:
+                shutil.copy(self.path, new_base_path)
+            return new_base_path
+        return ""


Do we want to differentiate between the cases where the initial path is generated by Taipy and where it is provided by the user?
I believe it would be better. If it is generated by Taipy, we can just replace the old id with the new one. Otherwise, adding a prefix or a suffix, as you did, makes sense.
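One way to express this distinction, as a sketch only. The function name `duplicated_path` and the `is_generated` flag are assumptions; how the real code knows that a path was generated by Taipy is not shown in this thread. The `TAIPY_CLONE_` prefix is taken from the diff above.

```python
import os


def duplicated_path(path: str, old_id: str, new_id: str, is_generated: bool) -> str:
    """Compute the file path for a duplicated data node (path computation only, no copy)."""
    folder, base_name = os.path.split(path)
    if is_generated and old_id in base_name:
        # Generated file names are assumed to embed the data node id: swap it.
        return os.path.join(folder, base_name.replace(old_id, new_id))
    # User-provided path: keep the original name and add a prefix, as in the PR.
    return os.path.join(folder, f"TAIPY_CLONE_{new_id}_{base_name}")
```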

jrobinAV · 2025-01-15T14:21:49Z

taipy/core/scenario/_scenario_manager.py

+
+        cloned_additional_data_nodes = set()
+        for data_node in cloned_scenario.additional_data_nodes.values():
+            cloned_additional_data_nodes.add(_data_manager._clone(data_node, None, cloned_scenario_id))


Whether a data node is duplicated should depend on its scope.

jrobinAV · 2025-01-15T14:28:45Z

taipy/core/scenario/_scenario_manager.py

We should implement a can_duplicate method, returning reasons, just like we have a can_create method.

jrobinAV · 2025-01-15T14:30:29Z

taipy/core/scenario/_scenario_manager.py

@@ -521,3 +521,47 @@ def _get_by_config_id(cls, config_id: str, version_number: Optional[str] = None)
        for fil in filters:
            fil.update({"config_id": config_id})
        return cls._repository._load_all(filters)
+
+    @classmethod
+    def _clone(cls, scenario: Scenario) -> Scenario:


We should accept an optional name and an optional creation date in the method signature. The name changes little, but the creation date matters: it determines the cycle, so the new scenario might belong to a different cycle.

toan-quach (Jan 16, 2025): Hm? I can understand the name, but the creation date?

jrobinAV (Jan 16, 2025): Let's say I have a January scenario I am happy with. I want to start from a duplicate of it to compute my February scenario. So at the beginning of February, I duplicate my January scenario, passing the current date so that the new scenario is in the February cycle.

Does that make sense to you? And @FlorianJacta, any opinion on that?

FlorianJacta (Jan 16, 2025): I agree; duplication for me was not about the creation date. Your use case is a real use case from CFM.
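The January/February use case can be illustrated with a simplified model. `ScenarioStub`, `cycle_key`, and `duplicate` are hypothetical names, and the monthly cycle is an assumption made only for this example.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ScenarioStub:
    """Hypothetical, simplified scenario holding only the fields relevant here."""

    name: str
    creation_date: datetime

    @property
    def cycle_key(self) -> str:
        # Assumes a monthly cycle, identified by year and month.
        return self.creation_date.strftime("%Y-%m")


def duplicate(
    scenario: ScenarioStub,
    new_name: Optional[str] = None,
    new_creation_date: Optional[datetime] = None,
) -> ScenarioStub:
    """Copy a scenario, optionally overriding its name and creation date."""
    return replace(
        scenario,
        name=new_name if new_name is not None else scenario.name,
        creation_date=new_creation_date if new_creation_date is not None else scenario.creation_date,
    )


january = ScenarioStub("January", datetime(2025, 1, 10))
february = duplicate(january, "February", datetime(2025, 2, 3))
```

Here `february.cycle_key` differs from `january.cycle_key`, which is why the creation date affects whether cycle-scoped entities can be shared.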

jrobinAV · 2025-01-15T14:32:13Z

taipy/core/scenario/_scenario_manager.py

@@ -521,3 +521,47 @@ def _get_by_config_id(cls, config_id: str, version_number: Optional[str] = None)
        for fil in filters:
            fil.update({"config_id": config_id})
        return cls._repository._load_all(filters)
+
+    @classmethod
+    def _clone(cls, scenario: Scenario) -> Scenario:


According to the issue, we should also accept an optional list of data nodes or data node IDs. Without a list, we should copy all the data. If a list is provided, only the files of the data nodes in the list should be copied.

toan-quach (Jan 16, 2025): I'll leave it as the next step for now.
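The selection rule from the issue could be sketched as follows. `data_nodes_to_copy` and its parameters are hypothetical names; the sketch works on plain ids and does not touch any file.

```python
from typing import Iterable, List, Optional


def data_nodes_to_copy(all_ids: Iterable[str], requested_ids: Optional[Iterable[str]] = None) -> List[str]:
    """Select the data nodes whose files should be copied during duplication."""
    all_ids = list(all_ids)
    if requested_ids is None:
        # No list given: copy the data of every data node.
        return all_ids
    requested = set(requested_ids)
    # List given: copy only the requested data nodes, keeping the scenario's order.
    return [dn_id for dn_id in all_ids if dn_id in requested]
```

Note that, in this sketch, an explicitly empty list copies nothing, which is different from passing no list at all.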

jrobinAV · 2025-01-15T14:32:56Z

taipy/core/task/_task_manager.py

@@ -226,3 +231,22 @@ def _get_by_config_id(cls, config_id: str, version_number: Optional[str] = None)
        for fil in filters:
            fil.update({"config_id": config_id})
        return cls._repository._load_all(filters)
+
+    @classmethod
+    def _clone(cls, task: Task, cycle_id: Optional[CycleId] = None, scenario_id: Optional[ScenarioId] = None) -> Task:


I would rename the _clone method to _duplicate, as it does not return another instance of the same object. It returns a similar object with a few differences (ids, sub-entities' ids, paths, etc.).

jrobinAV · 2025-01-15T14:38:10Z

taipy/core/data/csv.py

+    def _clone_data(self):
+        new_data_path = self._clone_data_file(self.id)
+        self._properties[self._PATH_KEY] = new_data_path
+        return new_data_path


I would move that out of the data node and into the data manager.
Just as we handle the parent_ids and the owner_id in the manager, I would set the path property in the manager as well (still retrieving the value from a method of the file data node mixin).
This is debatable, though...

toan-quach (Jan 16, 2025): Well, this is specific to file DNs only. If we put it in the data manager, we group it with other types of DNs such as SQL or Mongo. I don't think that's a good idea.
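A possible middle ground between the two positions, shown as a sketch: the manager sets the path, but only for nodes that inherit from the file mixin, so SQL or Mongo nodes are left untouched. All class and function names here are hypothetical stand-ins, not Taipy classes.

```python
class FileBackedMixin:
    """Hypothetical stand-in for the file data node mixin."""

    path: str

    def duplicated_data_path(self, new_id: str) -> str:
        # Path computation only; copying the file itself is out of scope here.
        return f"TAIPY_CLONE_{new_id}_{self.path}"


class CsvNode(FileBackedMixin):
    def __init__(self, path: str) -> None:
        self.path = path
        self.properties = {"path": path}


class SqlNode:
    def __init__(self) -> None:
        self.properties = {}


def set_duplicated_path(node, new_id: str):
    """Manager-side step: update the path only for file-backed data nodes."""
    if isinstance(node, FileBackedMixin):
        node.properties["path"] = node.duplicated_data_path(new_id)
    return node
```

The `isinstance` check keeps the file-specific logic in the mixin while the manager stays the single place where duplicated entities are updated.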

toan-quach marked this pull request as draft December 27, 2024 07:17

toan-quach force-pushed the feature/#397-duplicate-scenarios branch from 658f02f to 2644479 Compare December 27, 2024 07:17

jrobinAV added the labels Core, Core: Data node, Core: 🎬 Scenario & Cycle, and 🟨 Priority: Medium on Jan 6, 2025

jrobinAV assigned toan-quach Jan 6, 2025

Toan Quach added 2 commits January 13, 2025 14:26

draft for scenario duplication

beee00d

added cloning data files

f79d591

toan-quach force-pushed the feature/#397-duplicate-scenarios branch from 2644479 to f79d591 Compare January 13, 2025 07:26

Toan Quach added 2 commits January 14, 2025 16:35

added tests for copying data files

85dac00

added tests for cloning entities

190f6a6

jrobinAV requested changes Jan 15, 2025

tests/core/scenario/test_scenario_manager.py (outdated, resolved)

fixed prevent replacing current in_memory entity with cloned entity

9639503

toan-quach marked this pull request as ready for review January 15, 2025 13:10

jrobinAV reviewed Jan 15, 2025

added checking existing task and cycle

190274a
