Added my Dapt Code #273

Vaibhav-03 · 2025-11-09T01:09:58Z

Please share any changes or issues you observe.


gemini-code-assist · 2025-11-09T01:10:31Z

Summary of Changes

Hello @Vaibhav-03, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the project's capabilities by introducing a complete pipeline for Domain-Adaptive Pretraining (DAPT) of large language models on financial data, specifically Llama 3.1 on earnings call transcripts. It also delivers a robust framework for evaluating various trading strategies, including both long-only and long-short approaches, against a set of established baselines. The changes include new scripts for DAPT training and evaluation, a modular evaluation package for backtesting and performance analysis, and enhancements to API interactions and configuration handling for improved security and functionality.

Highlights

Domain-Adaptive Pretraining (DAPT) for Llama 3.1: Introduced new Python scripts (Finllama/DAPT.py and Finllama/DAPT_Llama31_Transcripts.py) for performing Domain-Adaptive Pretraining of the Llama 3.1 model on earnings call transcripts. This includes robust data loading, tokenization, and PEFT LoRA/QLoRA training setups.
DAPT Model Evaluation Framework: Implemented a dedicated evaluation script (Finllama/DAPT_evaluate.py) to compare the perplexity of the fine-tuned Llama 3.1 model against its baseline on financial transcripts, providing a quantitative measure of domain adaptation effectiveness.
Comprehensive Trading Strategy Evaluation Package: Added a new evaluation package, encompassing modules for defining baseline trading strategies (e.g., BuyAndHold, MACD), a backtesting engine, performance metrics calculation (e.g., Sharpe Ratio, Max Drawdown), and visualization tools for comparing strategy performance.
Long-Short Strategy Evaluation Support: Introduced a parallel evaluation_long_short package, specifically designed to support and evaluate long-short trading strategies, providing a distinct framework from the long-only evaluation.
Enhanced Configuration and OpenAI API Integration: Improved configuration management by integrating python-dotenv for secure loading of API keys from .env files and refined OpenAI API calls in tradingagents/dataflows/openai.py for more precise news retrieval and robust response parsing.
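
For readers unfamiliar with the perplexity comparison mentioned in the highlights, the metric can be sketched as follows. This is a minimal illustration and not the code in Finllama/DAPT_evaluate.py, which is not shown in this conversation; the function name and the per-token log-probabilities are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical natural-log probabilities that two models assign to the
# same four tokens of a transcript (made-up numbers, for illustration only).
baseline_log_probs = [-2.0, -1.5, -3.0, -2.5]
adapted_log_probs = [-1.0, -0.5, -2.0, -1.5]

baseline_ppl = perplexity(baseline_log_probs)
adapted_ppl = perplexity(adapted_log_probs)

# Lower perplexity means the model finds the text less surprising.
print(f"baseline: {baseline_ppl:.2f}, adapted: {adapted_ppl:.2f}")
```

A lower value for the adapted model on held-out domain text is the usual sign that domain adaptation helped.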


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a substantial amount of new functionality, including DAPT model training scripts and a comprehensive evaluation framework. The addition of the evaluation framework is a great step towards systematically measuring performance. I also appreciate the improvements in configuration management, such as using .env files for API keys.

However, there are several critical issues that need attention. The most significant is the large-scale code duplication between the evaluation and evaluation_long_short directories, which will create a maintenance burden. I strongly recommend refactoring this to eliminate redundancy. Additionally, I've found hardcoded secrets and absolute paths in the new scripts, which pose security and portability risks. There are also some instances of risky coding practices, like suppressing all exceptions, that should be addressed. My detailed comments provide specific suggestions for these points.

gemini-code-assist · 2025-11-09T01:12:24Z

evaluation_long_short/__init__.py

+from .baseline_strategies import (
+    BuyAndHoldStrategy,
+    MACDStrategy,
+    KDJRSIStrategy,
+    ZMRStrategy,
+    SMAStrategy,
+    get_all_baseline_strategies
+)
+
+from .metrics import (
+    calculate_cumulative_return,
+    calculate_annualized_return,
+    calculate_sharpe_ratio,
+    calculate_maximum_drawdown,
+    calculate_all_metrics,
+    create_comparison_table
+)
+
+from .backtest import (
+    BacktestEngine,
+    TradingAgentsBacktester,
+    load_stock_data
+)
+
+from .visualize import (
+    plot_cumulative_returns,
+    plot_transaction_history,
+    plot_metrics_comparison,
+    plot_drawdown,
+    create_summary_report
+)
+
+from .run_evaluation import run_evaluation
+
+__all__ = [
+    # Strategies
+    'BuyAndHoldStrategy',
+    'MACDStrategy',
+    'KDJRSIStrategy',
+    'ZMRStrategy',
+    'SMAStrategy',
+    'get_all_baseline_strategies',
+
+    # Metrics
+    'calculate_cumulative_return',
+    'calculate_annualized_return',
+    'calculate_sharpe_ratio',
+    'calculate_maximum_drawdown',
+    'calculate_all_metrics',
+    'create_comparison_table',
+
+    # Backtesting
+    'BacktestEngine',
+    'TradingAgentsBacktester',
+    'load_stock_data',
+
+    # Visualization
+    'plot_cumulative_returns',
+    'plot_transaction_history',
+    'plot_metrics_comparison',
+    'plot_drawdown',
+    'create_summary_report',
+
+    # Main evaluation
+    'run_evaluation',
+]


This entire evaluation_long_short directory appears to be a copy of the evaluation directory. This massive code duplication is a critical maintainability issue. Any bug fix or enhancement will need to be applied in two places, which is error-prone and inefficient. Please refactor this to avoid duplication. The different backtesting logic (long-only vs. long-short) could be handled more elegantly, for instance by:

Passing a strategy type ('long_only' or 'long_short') as a parameter to the backtesting engine.

Using subclassing for different strategy types.
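
The first option above could look roughly like the sketch below. It is only an illustration under stated assumptions: `PositionMode` and `actions_to_positions` are hypothetical names, and the BUY/SELL/HOLD semantics (SELL means flat in long-only mode and short in long-short mode, HOLD carries the previous position) are assumed rather than taken from the PR.

```python
from enum import Enum


class PositionMode(Enum):
    LONG_ONLY = "long_only"
    LONG_SHORT = "long_short"


def actions_to_positions(actions, mode=PositionMode.LONG_ONLY):
    """Map BUY/SELL/HOLD actions to positions.

    long_only  -> positions in {0, 1}
    long_short -> positions in {-1, 0, 1}
    """
    # The only behavioral difference between the two modes in this sketch:
    # what a SELL signal turns the position into.
    sell_target = -1 if mode is PositionMode.LONG_SHORT else 0
    position = 0  # start flat
    positions = []
    for action in actions:
        if action == "BUY":
            position = 1
        elif action == "SELL":
            position = sell_target
        # Any other action (e.g. "HOLD") keeps the previous position.
        positions.append(position)
    return positions


actions = ["HOLD", "BUY", "HOLD", "SELL", "HOLD", "BUY"]
long_only = actions_to_positions(actions, PositionMode.LONG_ONLY)
long_short = actions_to_positions(actions, PositionMode.LONG_SHORT)
print(long_only)   # [0, 1, 1, 0, 0, 1]
print(long_short)  # [0, 1, 1, -1, -1, 1]
```

With a single parameter carrying the difference, the metrics, visualization, and baseline modules would only need to exist once.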

gemini-code-assist · 2025-11-09T01:12:24Z

Finllama/DAPT.py

+os.environ["HF_DATASETS_CACHE"] = os.path.join(TMP_DIR, "datasets_cache")
+os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"
+
+os.environ["HF_TOKEN"] = "hf_xxx"  # replace with your HF token


Hardcoding placeholder tokens in the code is a security risk. Please load this from an environment variable. This prevents accidental exposure of real tokens.

Suggested change

-os.environ["HF_TOKEN"] = "hf_xxx"  # replace with your HF token
+os.environ["HF_TOKEN"] = os.getenv("HF_TOKEN")  # replace with your HF token
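
One caveat with the suggestion as written: `os.getenv("HF_TOKEN")` returns `None` when the variable is unset, and assigning `None` into `os.environ` raises a `TypeError`; when the variable is set, the assignment is a no-op. A possible alternative is to read the token once and fail with a clear message. The helper name `require_env` below is hypothetical and is not part of the PR.

```python
import os


def require_env(name):
    """Return a required environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it in your shell or add it to an "
            "untracked .env file"
        )
    return value


# Usage (left as a comment so the sketch runs even when HF_TOKEN is unset):
# hf_token = require_env("HF_TOKEN")
```

The returned value can then be passed wherever a token is needed, so no token string ever appears in the source.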

gemini-code-assist · 2025-11-09T01:12:24Z

Finllama/DAPT_Llama31_Transcripts.py

+os.environ["HF_DATASETS_CACHE"] = os.path.join(TMP_DIR, "datasets_cache")
+os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"
+
+os.environ["HF_TOKEN"] = "hf_token"


A hardcoded placeholder token is present. This is a security risk and should be avoided. Please load the token from an environment variable to keep secrets out of the codebase.

Suggested change

-os.environ["HF_TOKEN"] = "hf_token"
+os.environ["HF_TOKEN"] = os.getenv("HF_TOKEN")

gemini-code-assist · 2025-11-09T01:12:24Z

Finllama/DAPT_Llama31_Transcripts.py

+    "python": platform.python_version(),
+})
+
+from datasets import load_datasets


There is a typo in the import statement. The function from the datasets library is load_dataset (singular), not load_datasets. This will cause an ImportError at runtime.

Suggested change

-from datasets import load_datasets
+from datasets import load_dataset

gemini-code-assist · 2025-11-09T01:12:24Z

Finllama/DAPT.py

+# %%
+from huggingface_hub import InferenceClient
+
+client = InferenceClient("meta-llama/Llama-3.1-8B", token="hf_xxx")  # replace with your HF token


Hardcoded secrets, even placeholders like "hf_xxx", should not be committed to the source code. This is a security risk as it could be accidentally replaced with a real token. It's better to load secrets from environment variables or a configuration file that is not checked into version control. I see that tradingagents/default_config.py now uses python-dotenv, which is a great pattern to follow here as well.

Suggested change

-client = InferenceClient("meta-llama/Llama-3.1-8B", token="hf_xxx")  # replace with your HF token
+client = InferenceClient("meta-llama/Llama-3.1-8B", token=os.getenv("HF_TOKEN"))  # replace with your HF token

gemini-code-assist · 2025-11-09T01:12:25Z

tradingagents/dataflows/openai.py

+        except Exception:
+            pass


This try...except Exception: pass block silently swallows all exceptions. This is risky as it can hide parsing errors or unexpected changes in the API response format, making debugging very difficult. At a minimum, you should log the exception to have a record of when and why parsing failed.

Suggested change

-        except Exception:
-            pass
+        except Exception as e:
+            import logging
+            logging.warning(f"Failed to parse structured output from API response: {e}")
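
A slightly tighter variant of the same idea is to catch only the exceptions the parsing step is expected to raise and to use a module-level logger. The sketch below assumes the structured output is a JSON string; the actual parsing code in tradingagents/dataflows/openai.py is not shown here, and `parse_structured_output` is a hypothetical name.

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_structured_output(raw_text):
    """Parse a JSON payload; log failures instead of hiding them.

    Returns the parsed object, or None when parsing fails.
    """
    try:
        return json.loads(raw_text)
    except (json.JSONDecodeError, TypeError) as exc:
        # Narrow except clause: unrelated bugs still surface as exceptions.
        logger.warning("Failed to parse structured output: %s", exc)
        return None


print(parse_structured_output('{"ticker": "AAPL"}'))  # {'ticker': 'AAPL'}
print(parse_structured_output("not json"))            # None
```

Importing `logging` once at module level also avoids the import inside the `except` block.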

gemini-code-assist · 2025-11-09T01:12:25Z

Finllama/DAPT_Llama31_Transcripts.py

+
+from datasets import load_datasets
+from typing import Optional
+import pandas as pds


The standard and widely accepted alias for the pandas library is pd. Using a non-standard alias like pds can reduce code readability and confuse other developers who are used to the conventional alias.

Suggested change

-import pandas as pds
+import pandas as pd

gemini-code-assist · 2025-11-09T01:12:25Z

evaluation_buy_sell/visualize.py

+import warnings
+import json
+
+warnings.filterwarnings('ignore')


Using warnings.filterwarnings('ignore') suppresses all warnings globally, which can hide important issues and make debugging harder. It's better to address warnings directly. If a specific warning is unavoidable and known to be benign, it should be suppressed with a more targeted filter.
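
As an illustration of a more targeted filter, the sketch below silences one known warning inside a limited scope while letting everything else through. The `noisy` function and its messages are made up for the example; `record=True` is used only so the effect can be inspected.

```python
import warnings


def noisy():
    """Stand-in for library code that emits two different warnings."""
    warnings.warn("old API", DeprecationWarning)
    warnings.warn("check your data", UserWarning)
    return 42


# catch_warnings restores the previous filter state when the block exits,
# so the suppression does not leak into the rest of the program.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # show everything by default
    # Ignore only the specific, known-benign warning.
    warnings.filterwarnings(
        "ignore", message="old API", category=DeprecationWarning
    )
    result = noisy()

# Only the UserWarning was recorded; the DeprecationWarning was filtered out.
print(result, [w.category.__name__ for w in caught])
```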

gemini-code-assist · 2025-11-09T01:12:25Z

evaluation_long_short/baseline_strategies.py

+
+    @staticmethod
+    def _actions_to_position(actions: pd.Series) -> pd.Series:
+        """Convert action series to a long-only position series in {0,1}."""


The docstring is incorrect. It states that this method converts actions to a "long-only position series in {0,1}", but the implementation actually supports long (1), short (-1), and flat (0) positions. The docstring should be updated to accurately reflect the function's behavior.

Suggested change

-        """Convert action series to a long-only position series in {0,1}."""
+        """Convert action series to a position series in {-1, 0, 1} for long/short/flat."""

gemini-code-assist · 2025-11-09T01:12:25Z

evaluation_buy_sell/run_evaluation.py

+import json
+
+# Add parent directory to path
+sys.path.insert(0, str(Path(__file__).parent.parent))


Modifying sys.path manually is generally considered a code smell and can lead to import issues, especially in larger projects. A better approach is to structure the project as a proper Python package (e.g., with a pyproject.toml file). This allows you to install the project in editable mode (pip install -e .), which handles Python's path correctly and makes imports more reliable.


quanliangliu and others added 7 commits October 31, 2025 10:50

Modified evaluation scripts

b512027

Baseline evaluation logics added

c29ca88

global news OpenAI bug fixed

e42a7ed

Modified evaluation logic for tradingagent to align with rule-based methods

ffff305

Visualization added

e99138f

change to longShort

7df8ae2

Added DAPT files

7fa325e

gemini-code-assist bot reviewed Nov 9, 2025


Vaibhav-03 and others added 8 commits November 8, 2025 19:25

added requirements

0631a4f

Plan updates to README.md

fc29f09

fix directory generalization issue

c4b0aa6

Applied DAPT model to tradingagent graph

e3952ed

fixed long-short evaluation

55e0287

Replace old evaluation/ with evaluation_long_short/ and add evaluation_buy_sell/

1c6d31f

added SFT files

287764c

Added test dapt files

948ad66
