Created base structure for adding discovery and other algorithms #68

Open
wants to merge 5 commits into
base: main

amit-sharma
Member

@amit-sharma amit-sharma commented Jul 20, 2025

New files added that convey the directory structure.

This PR does not add any functionality, just the protocol class and structure. The goal is to enable collaborators to contribute code.

@amit-sharma amit-sharma requested a review from Copilot July 20, 2025 07:43
@Copilot Copilot AI left a comment

Pull Request Overview

This PR establishes the foundational structure for a causal discovery library by creating base interfaces and utility functions. The changes introduce a protocol-based design for datasets and skeleton implementations for evaluation metrics.

  • Defines a Dataset protocol with methods for graph access, data retrieval, and synthetic data generation
  • Creates placeholder functions for standard causal discovery evaluation metrics (accuracy, precision, recall, F1)
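
As a rough illustration of the protocol-based design described above, here is a minimal sketch. The method names `graph`, `data`, and `generate_data` come from this PR's discussion; the return types, the `Edge` alias, and the `ToyDataset` class are assumptions made for the example, not the PR's actual code.

```python
from typing import Any, Protocol, runtime_checkable

# Assumption: an edge is a (cause, effect) pair of variable names.
Edge = tuple[str, str]


@runtime_checkable
class Dataset(Protocol):
    """Structural interface: any class with these methods qualifies."""

    def graph(self) -> set[Edge]:
        """Return the graph as a set of directed edges (assumed shape)."""
        ...

    def data(self) -> list[dict[str, Any]]:
        """Return observed rows (assumed shape)."""
        ...

    def generate_data(self) -> list[dict[str, Any]]:
        """Return synthetic rows (assumed shape)."""
        ...


class ToyDataset:
    """Hypothetical dataset; note it does not inherit from Dataset."""

    def graph(self) -> set[Edge]:
        return {("smoking", "cancer")}

    def data(self) -> list[dict[str, Any]]:
        return [{"smoking": 1, "cancer": 1}, {"smoking": 0, "cancer": 0}]

    def generate_data(self) -> list[dict[str, Any]]:
        return self.data()


toy = ToyDataset()
# runtime_checkable only verifies that the methods exist, not their signatures.
print(isinstance(toy, Dataset))
```

Because `Protocol` relies on structural typing, collaborators could add datasets without importing or subclassing a shared base class.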

Reviewed Changes

Copilot reviewed 2 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pywhyllm/datasets/dataset.py | Defines the Dataset protocol interface with methods for graph, data, and synthetic data generation |
| pywhyllm/datasets/metrics.py | Creates skeleton functions for evaluation metrics used in causal discovery |

Comments suppressed due to low confidence (1)

pywhyllm/datasets/metrics.py:21

  • Function name 'F1' should follow Python naming conventions. Consider renaming to 'f1_score' or 'f1' (lowercase).
def F1(edges, true_edges):

amit-sharma and others added 4 commits July 20, 2025 13:14
Co-authored-by: Copilot <[email protected]>
Signed-off-by: Amit Sharma <[email protected]>
@amit-sharma amit-sharma requested a review from emrekiciman July 21, 2025 04:56
Member

@emrekiciman emrekiciman left a comment

This is great. I added a few questions about the structure.

Also: do we want this dataset protocol to integrate well with data handling in other PyWhy libraries? Or is it only for self-contained pywhy-llm benchmarking, for example?


class Dataset(Protocol):

    def graph(self):

Is this supposed to return the ground truth graph, or is this function intended to execute the PyWhyLLM functions to derive a candidate graph?

"""
pass

def generate_data(self):

What do you think about merging data() and generate_data() into a single function? Then some Dataset objects might be synthetic Datasets, and some might be grounded datasets (real data)?
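
One way the merge suggested above could look is sketched below, under stated assumptions: a single `data()` method, with synthetic and real datasets differing only in how they produce rows. The `n_samples` parameter, the `Row` alias, and both concrete classes are hypothetical and not part of this PR.

```python
import random
from typing import Optional, Protocol

# Assumption: a row maps variable names to numeric values.
Row = dict[str, float]


class Dataset(Protocol):
    def data(self, n_samples: Optional[int] = None) -> list[Row]:
        """Single entry point for both synthetic and real data."""
        ...


class SyntheticDataset:
    """Hypothetical synthetic dataset: samples x, then y = 2x + noise."""

    def __init__(self, seed: int = 0) -> None:
        self._seed = seed

    def data(self, n_samples: Optional[int] = None) -> list[Row]:
        rng = random.Random(self._seed)  # reseeded so each call is reproducible
        n = 100 if n_samples is None else n_samples
        rows: list[Row] = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            rows.append({"x": x, "y": 2.0 * x + rng.gauss(0.0, 0.1)})
        return rows


class RealDataset:
    """Hypothetical grounded dataset backed by rows supplied up front."""

    def __init__(self, rows: list[Row]) -> None:
        self._rows = list(rows)

    def data(self, n_samples: Optional[int] = None) -> list[Row]:
        # Return a copy so callers cannot mutate the stored rows.
        if n_samples is None:
            return list(self._rows)
        return self._rows[:n_samples]
```

With this shape, calling code would not need to know whether a dataset is synthetic; the trade-off is that `n_samples` means "generate this many" in one class and "return at most this many" in the other.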

@@ -0,0 +1,25 @@


def accuracy(edges, true_edges):

Given edges and true_edges, is the calculation of accuracy, precision, and recall specific to the dataset?
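
For reference, here is a hedged sketch of how these metrics could be computed from the two edge collections alone, assuming each edge is a hashable `(cause, effect)` tuple. The function bodies are illustrative and are not the PR's implementation; the lowercase name `f1` follows the naming suggestion in the Copilot review. Accuracy is left out because it also counts true negatives, which requires knowing the full set of candidate edges (for example, the node list).

```python
def precision(edges, true_edges):
    """Fraction of predicted edges that are in the true graph."""
    edges, true_edges = set(edges), set(true_edges)
    if not edges:
        return 0.0
    return len(edges & true_edges) / len(edges)


def recall(edges, true_edges):
    """Fraction of true edges that were predicted."""
    edges, true_edges = set(edges), set(true_edges)
    if not true_edges:
        return 0.0
    return len(edges & true_edges) / len(true_edges)


def f1(edges, true_edges):
    """Harmonic mean of precision and recall."""
    p = precision(edges, true_edges)
    r = recall(edges, true_edges)
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


# Hypothetical example graphs: two of the three predicted edges are correct.
predicted = {("A", "B"), ("B", "C"), ("A", "C")}
truth = {("A", "B"), ("B", "C"), ("C", "D")}
scores = (precision(predicted, truth), recall(predicted, truth), f1(predicted, truth))
```

Under these assumptions, precision, recall, and F1 depend only on the two edge sets, so they would not need to be dataset-specific.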
