
[EPIC] Make $ trading across 500 tokens; 5m; trade rarely per token. Sim, and live #1712

Open · 16 of 26 tasks
trentmc opened this issue Jan 25, 2025 · 8 comments
Labels: Type: Enhancement (New feature or request)


trentmc commented Jan 25, 2025

Background / motivation

Make $ trading on 5m, a timescale good enough for agent evolution. (And more fun in general:)

How:

  • trade across 500+ tokens
  • trade rarely per token. Eg only 1-3 trades every 5m. Aka "snipe". Few relative trades → low fees.
  • have a framework where I can test super-fast. Chop out the rest. Ideally, fast enough to evolve with trading-sim-in-the-loop
  • approach makes $: tune params (eg conf/TP/SL); and bigger changes (eg model "up > 0.2%?" and "down <0.2%?")

Phases / tasks

(completed phases are farther below)

Phase: build / improve until "make $ on 0.035% fees (HL starter rate), on BTC, 3mo sim"

  • Try: "improve $ via larger ensembles (on lin) for better conf est". Result: doesn't help Details.
  • Try: "center model on most recently seen price". Setup, expt's
    • Test: 0% fees. Result: APY = 4766% (!)
    • Test: 0.025% fees, basic tuning. Result: loses $
  • Fast model test/tune flow. Details below; a code sketch follows this phase's task list.
    • Build X/y save & load module
    • Grab 6 yrs BTC historical data
    • Do sim that generates 1000 X/y datasets. Test_n=52500 (5y), therefore gen a new model every 52 epochs (just over 4h). For transform=None
    • ^ but for transform = center-on-recent
    • In an aimodel unit test, build a testing framework: for each X/y dataset: load it, build a model, report log loss. Calc & report average log loss across all X/y datasets.
    • ^ with: MBO across all X/y datasets. Obj function = average log loss. Design space = xgboost params (and maybe some ppss params too)
  • (add tasks until goal met)
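
For the benchmark + MBO items above, here's a minimal sketch of what the flow could look like. It assumes the saved X/y datasets live as .npz files (datasets/xy_*.npz with X_train/y_train/X_test/y_test arrays -- a hypothetical layout, not the actual pdr-backend format) and uses xgboost plus Optuna for the model-based optimization; treat it as a sketch, not the repo's code.

    # Sketch: benchmark average log loss across pre-saved X/y datasets, then run
    # MBO (here: Optuna) over a small xgboost design space. File naming is assumed.
    import glob

    import numpy as np
    import optuna
    import xgboost as xgb
    from sklearn.metrics import log_loss


    def load_datasets(pattern="datasets/xy_*.npz"):
        """Yield (X_train, y_train, X_test, y_test) for every saved dataset."""
        for path in sorted(glob.glob(pattern)):
            d = np.load(path)
            yield d["X_train"], d["y_train"], d["X_test"], d["y_test"]


    def avg_log_loss(xgb_params: dict) -> float:
        """Average test log loss across all saved datasets, for given xgboost params.
        Assumes y is encoded as 0/1 ("down"/"up")."""
        losses = []
        for X_tr, y_tr, X_te, y_te in load_datasets():
            model = xgb.XGBClassifier(**xgb_params)
            model.fit(X_tr, y_tr)
            losses.append(log_loss(y_te, model.predict_proba(X_te)))
        return float(np.mean(losses))


    def objective(trial):
        """MBO objective: design space = a few xgboost params."""
        params = {
            "max_depth": trial.suggest_int("max_depth", 2, 8),
            "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
            "n_estimators": trial.suggest_int("n_estimators", 50, 500),
            "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        }
        return avg_log_loss(params)


    if __name__ == "__main__":
        study = optuna.create_study(direction="minimize")
        study.optimize(objective, n_trials=50)
        print("best params:", study.best_params, "best avg log loss:", study.best_value)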

Phase: like ^, but now 2y sim

Phase: like ^, but now make $ on 8 of top-10 tokens

  • Test: run benchmark for each of top-10 separately, record results
  • (add tasks until goal met)

Phase: a single run trades on 500 tokens at once

  • Update sim etc to consider top-10 tokens at once
  • Update sim etc to consider top-500 tokens at once
  • (add tasks until goal met)

Completed phases / tasks

Phase: init experiments

  • Revive pdr-backend sim, on 5m, btc, binance. Backend/cli, sim, and new dashboard too.
  • Conduct cheap experiment: set params to trade rarely (high confidence_thr), and run for a long time. Fiddle with different philosophies. Results in next steps.
  • Q: make $, with no fees? A: yes. See comment 2. 227.2% APY
  • Q: make $ with nonzero fees (and init params)? See comment 4
    • Q: $ with 0.1% fees? (Binance starter rate) A: no
    • Q: $ with 0.025% fees? (HL moderate rate) A: no
    • Q $ with 0.01% fees? (2x lower than lowest Binance/HL) A: yes
  • Q: on 0.025% fees (HL moderate rate), make $? (Tune params as needed.) A: yes. 20.5% APY. See comment 5.
  • Try: Improve classifier calibration accuracy (with lin model). Status quo: 5-fold CV on CalibratedClassifierCV_Sigmoid. Try: 10-fold, 100-fold

Phase: reduce complexity & speed runtime

  • Create new branch "tdr" link
  • Simplify codebase: chop pdr-backend code that I won't use. Basically, everything but what sim/ needs. That is, delete: smart contracts/web3, barge, dashboard/plots/analytics, predictoor/, trader/, DF rewards, trueval, ETL/Duck, more.
  • Try to make code super-fast, just for trading. Result: no big wins possible. Details

Appendix: things to add, only if we need, to make $

  • Fast model test/tune flow. Steps:
    • (a) New sim param that saves X/y right before model-build.
    • (b) Do a sim run with train_every_n_epochs = 10000; test_n = 100,000; transform = None
    • (c) Do another sim run, with transform = center_on_recent
    • (d) Add saved X/y to a new dir, in the github repo
    • (e) Build a benchmarking framework within aimodel factory tests - leverage existing
  • Manual tune xgboost params. Draw on new fast model test/tune flow to find more ideal xgboost params. Docs. Guides: Analytics Vidhya, RITHP
  • Auto-tune xgboost params. Use the approach that Udit used
  • Model "up > 0.2%?" and "down <0.2%?"*.
    • This is a classifier-based approach for the model to account for fees
    • Specifically, reframe models to: "If prediction says 'price goes up > 0.2% anytime in the next 5 min', then buy; and as soon as actual price > 0.2% above, then sell. And, vice versa for going down. And, if still in position at 5min mark, then exit position".
    • This was #1278. See PRs reframe2, reframe3.
    • Those PRs ran into complexity issues. The codebase was getting too big & crazy, and hard to iterate. So if we want to pursue this, we need to simplify the codebase. How: focus on trading, chop out unneeded
    • Another idea: just as we'll have separate loops for different tokens, we can have separate loops for up-vs-down models. Will keep things simpler. BUT don't do that, because we can take advantage of 2 models on the same output: if they disagree, then skip. i.e. Only proceed if they agree. It's another gate for confidence-building. reframe2/3 above has this.
  • Predict profitability; only trade when profitable.
    • This is a regressor-based approach for the model to account for fees.
    • First, build an AI model that predicts continuous-value prices. From the price prediction, have a simple function to compute expected profitability (accounting for order fees, maybe slippage). Include uncertainty. For each pair, sweep across different long/short levels and compute profitability for each. Only trade when expected profitability > 0, or when its 95% lower bound > 0, or another thr (see the sketch after this appendix list).
    • Concern: regression models so far have poor performance. Because they have to do more work to model more stuff, including stuff we don't need (compared to classifiers). If we do it, we'd probably need nonlinear models, Udit's trick, more.
  • Use "# orders" as an input to model. make $ trading GDoc
  • Udit's non-stationary trick, via quantiles. To properly model longer-term historical data. Though we can probably get away with short-term data & models, given so many tokens. (And it'd be very expensive to do longer-term anyway.)
  • Trade on cowswap only when fee=zero thanks to coincidence-of-wants. Sub-q: where do we get historical data, and what do we model? Maybe the A is simply: just use Binance?
  • Ref: make $ trading GDoc. A bit dated, but many great ideas.
  • Ref: 5m trading ideas in 1e 9 GDoc
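
Sketch for the "predict profitability; only trade when profitable" item above. All names are illustrative (not pdr-backend code); the predicted move and its std dev would come from a regression model, and 1.645 is the one-sided 95% z-value.

    # Hypothetical sketch of the profitability gate. A round trip pays the fee twice.


    def expected_profit_pct(pred_move_pct: float, fee_pct: float, slippage_pct: float = 0.0) -> float:
        """Expected profit of a long (sign-flipped for a short), as a % of position size."""
        return pred_move_pct - 2.0 * fee_pct - slippage_pct


    def should_trade(pred_move_pct: float, pred_std_pct: float, fee_pct: float) -> bool:
        """Trade only if the 95% lower bound on expected profit is still > 0."""
        lower_bound_move = pred_move_pct - 1.645 * pred_std_pct
        return expected_profit_pct(lower_bound_move, fee_pct) > 0.0


    # Example: model predicts +0.30% +/- 0.05% over the next 5m, taker fee 0.035%:
    # lower-bound move = 0.30 - 1.645*0.05 = 0.218%; minus 2*0.035% fees = ~0.148% > 0 -> trade.
    print(should_trade(0.30, 0.05, 0.035))  # True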

Appendix: Fees

Binance & HL fee schedules


trentmc commented Jan 26, 2025

Hypothesis: "without the following model req'ts, model has no hope":

  • use price quantiles, not raw values (Udit trick)
  • at least weakly nonlinear (FFX)
  • reliable confidence calc --> critical for choosing when to trade
  • bound amount traded, using confidence calc
  • bound amount made & lost (reduce variance) -> take profit / stop loss

If I can make $ without needing all of these, this hyp is wrong


trentmc commented Jan 26, 2025

Results: 0 fees. Making $

I'm making money; convergence curve is excellent. It did this without the model req'ts of prev comment. (Thereby invalidating the prev comment's hypothesis.)

Biggest factors: very small SL/TP thr, and much larger autoregressive_n. Reasoning behind each param is below.

Convergence curve. It's excellent: steady up.

From $1000 in, made $325 in 90d. APY = (1+325/90/1000)^365-1 = 2.272 = 227%
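
For reference, the APY formula used throughout this thread, as code. The example numbers below are hypothetical, not one of the reported runs.

    # APY per the formula in this thread: (1 + profit/days/principal)^365 - 1.


    def apy(profit_usd: float, days: float, principal_usd: float) -> float:
        daily_return = profit_usd / days / principal_usd
        return (1.0 + daily_return) ** 365 - 1.0


    print(f"{apy(100.0, 90, 1000.0):.1%}")  # ~50.0% APY for $100 made in 90d on $1000 (hypothetical)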

All key params, and the reasoning behind each param (important!).

    max_n_train: 5700 # 20d. "Short enough to not get killed by nonstationarity. Long enough for a non-shitty lin model"
    autoregressive_n: 12 # 1h. Enough to "spot a pattern", vs eg 2 where no chance
  ..
  aimodel_ss:
    approach: ClassifLinearRidge # Stable, don't harshly filter any prev param
    weight_recent: 10x_5x # "Bias most recent samples"
    balance_classes: None # "KISS" (maybe balancing helps more though)
    calibrate_probs: CalibratedClassifierCV_Sigmoid # "best possible confidence calc"
    train_every_n_epochs: 1 # "don't scrimp on model-building compute, let's have the best models for params above"
  ..

trader_ss:
  ..
    fee_percent: 0.0 # set to zero for now, to see if there's any chance at all of making $. If success, then set to >0
    confidence_threshold: 0.75 # "snipe"
    stop_loss_percent: 0.001 # 0.1%. Reduce variance, don't let 1 trade kill us
    take_profit_percent: 0.001 # 0.1%. Reduce variance, take our win and move on

sim_ss:
  ..
    test_n: 26275 # 3mo

Full settings file: my_ppss.yaml.txt

Further results

Interesting: predictoors don't do that well. And the model itself isn't great. See below.

But I've tuned parameters for trading (only trade when model is confident; bound the wins & losses), and that's what made all the difference.
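
For clarity, here's a sketch of the trade gating those params imply. It is not the actual sim_engine.py logic; the conf = 2*prob_up - 1 mapping is an inference from the logged prob_up/conf pairs later in this thread (e.g. prob_up=0.887 → conf=0.773), so treat both functions as assumptions.

    # Sketch of "only trade when confident; bound wins & losses". Not repo code.


    def maybe_enter(prob_up: float, confidence_threshold: float):
        """Return "long", "short", or None. Conf mapping inferred from the logs:
        conf_up = 2*prob_up - 1, conf_down = 1 - 2*prob_up (clipped at 0)."""
        conf_up = max(0.0, 2.0 * prob_up - 1.0)
        conf_down = max(0.0, 1.0 - 2.0 * prob_up)
        if conf_up >= confidence_threshold:
            return "long"
        if conf_down >= confidence_threshold:
            return "short"
        return None  # not confident enough -> don't trade ("snipe" behavior)


    def should_exit(entry: float, price: float, side: str, tp: float, sl: float) -> bool:
        """Exit once the move from entry hits take-profit (tp) or stop-loss (sl),
        both expressed as fractions (e.g. 0.001 = 0.1%)."""
        move = (price - entry) / entry if side == "long" else (entry - price) / entry
        return move >= tp or move <= -sl


    print(maybe_enter(prob_up=0.90, confidence_threshold=0.75))  # "long" (conf_up = 0.80)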

Predictoor curve

Model performance


trentmc commented Jan 26, 2025

On fees

Q: OK, we've shown we can make $ with no fees. What's the next step? A: Test with realistic trading fee.

Info: maker/taker spot fees are based on 30d trade volume. Maker = sets limit order, taker = uses market price. Analysis below is on taker fees.

Our expected trade volumes

What 30d vol can I realistically expect, when going "full bore"?

  • 500 tokens
  • $10K / trade
  • trade 10% of the time for each token
  • 43200min in 30d -> 8600ep
  • total vol = 500 * 10e3 * 0.1 * 8600 = 4.300e9 = $4.3B monthly vol (whoa)

More conservative numbers:

  • 50 tokens
  • $2K / trade
  • trade 2% of the time for each token
  • 43200min in 30d -> 8600ep
  • total vol = 50 * 2e3 * 0.02 * 8600 = $17.2M
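
The same 30d-volume arithmetic as a tiny sketch (8600 epochs is the rounded figure used above; 43200 min / 5 min ≈ 8640).

    # 30d volume estimate = n_tokens * $/trade * fraction of epochs in a trade * epochs per 30d.


    def monthly_volume(n_tokens: int, usd_per_trade: float, frac_time_in_trade: float,
                       epochs_30d: int = 8600) -> float:
        return n_tokens * usd_per_trade * frac_time_in_trade * epochs_30d


    print(f"full bore:    ${monthly_volume(500, 10_000, 0.10):,.0f}")  # $4,300,000,000
    print(f"conservative: ${monthly_volume(50, 2_000, 0.02):,.0f}")    # $17,200,000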

Binance spot fees

  • 0.1% for <$20M vol
  • 0.06% for $20M - $75M vol
  • 0.031% for $75M - $150M vol
  • ...
  • 0.023% for $4B+ vol

Ref.

Analysis:

  • for the baseline level of 0.1%, it may be hard to make $. But we can test
  • we can hit the 0.06% fee readily with conservative volumes, and it only gets better from there

Binance futures fees

  • 0.05% for <$15M vol
  • 0.04% for $15M - $50M vol
  • ...
  • 0.004% for $5B vol - $12.5B vol
  • 0.0017% for $25B+ vol

Ref.

Analysis:

  • baseline fees are just 50% of binance spot
  • the baseline level of 0.05% is already good. Definitely test though. And ofc I'd have to build support.
  • we can hit the 0.04% fee readily with conservative volumes, and it only gets better from there

Hyperliquid fees

  • 0.035% for <$5M vol
  • 0.030% for $5M - $25M vol
  • 0.025% for $25M - $100M vol
  • ...
  • 0.019% for $2B+ vol

Ref

Analysis:

  • baseline fees are just 35% of binance spot, and 70% of binance futures
  • the baseline level of 0.035% is already great. Definitely test though. And ofc I'd have to build support.
  • we can hit the 0.025% fee readily with conservative volumes, and it only gets better from there
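
A small sketch mapping an estimated 30d volume to the HL taker-fee tiers quoted above. Only the tiers listed in this comment are included (the "..." gaps are omitted), so treat it as illustrative rather than the full schedule.

    # Map 30d volume (USD) to the Hyperliquid taker-fee tiers listed in this comment.

    HL_TAKER_TIERS = [  # (min 30d volume in USD, taker fee in %)
        (0, 0.035),
        (5e6, 0.030),
        (25e6, 0.025),
        (2e9, 0.019),
    ]


    def hl_taker_fee_pct(vol_30d_usd: float) -> float:
        fee = HL_TAKER_TIERS[0][1]
        for min_vol, tier_fee in HL_TAKER_TIERS:
            if vol_30d_usd >= min_vol:
                fee = tier_fee
        return fee


    print(hl_taker_fee_pct(17.2e6))  # 0.030 -> conservative volumes land in the $5M-$25M tier
    print(hl_taker_fee_pct(4.3e9))   # 0.019 -> "full bore" volumes hit the top listed tier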


trentmc commented Jan 26, 2025

Results on nonzero fees; rest the same

Setup: like comment 2, except with a different fee. Includes these params:

  • confidence_threshold: 0.75 (75%)
  • stop_loss_percent: 0.001 (0.1%) (like comment 2)
  • take_profit_percent: 0.001 (0.1%) ("")

Like before: 90d run. $1000 in. APY = (1+$MADE/90/1000)^365-1 = FOO = BAR%.

Convergence curve shown for each. Terminate early if obviously doing badly.

Results: 0.1% fee. Lost $


Results: 0.025% fee. Lost $


Results: 0.01% fee. Made $102, APY = 51.1%

This is a lower-bound test asking: "is it possible to make $ at all, with the current setup, under any fee level?" But note that these fees are 2x lower than the lowest-possible Binance or HL fees.



trentmc commented Jan 26, 2025

Results on 0.025% fee

Q: on 0.025% fees (HL moderate rate), make $ when params tuned?

Baseline setup from prev comment, where fee = 0.025%.

Tests at: (conf=75%, 90%) x (SL/TP=0.1%, 0.15%, 0.2%)

  • where conf = confidence_threshold, SL/TP = stop_loss_percent = take_profit_percent

Like before: 90d run. $1000 in. APY = (1+$MADE/90/1000)^365-1 = FOO = BAR%.

conf=75%, SL/TP=0.10%. Lost $


conf=75%, SL/TP=0.15%. Lost $


conf=75%, SL/TP=0.2%. Lost $


conf=90%, SL/TP=0.10%. Made $26.68, APY = 11.42%


conf=90%, SL/TP=0.15%. Made $33.72, APY = 14.65%


conf=90%, SL/TP=0.20%. Made $46.09, APY = 20.55%


conf=95%, SL/TP=0.20%. Made $31.97, APY = 13.84%


Conclusion

I can now make $ on 0.025% fee, from mild tuning (conf=90%, SL/TP=0.20%). Yay! Full settings: my_ppss.yaml

  • Con: $ is not significant (20.5% APY); and 0.025% fee will only happen at >$25M volume on HL.
  • Pro: there's more space for tuning. 20.5% is the worst it will ever be.

Possible next steps:

  • Q: Can I make $ at 0.035% fee (HL starter rate)? If needed, do mild tuning
  • Q: Can I make a better % at 0.025% fee, via more tuning? E.g. sweep SL/TP over 0.025%, 0.030%, ..., 0.050%. Also tune input data, model params, etc.
  • Improve code: simpler codebase, way faster sim. Do this first, it will help the rest in spades

Appendix: baseline results for standardized dates

On conf=90%, SL/TP=0.20%: made $46.09, APY = 20.54%

Detailed settings: (the same as performance profiling below; on newly tdr-focused code, this commit)

  • lake: binance BTC 5m; st=2024-9-28, fin=2025-01-28 (the standardized dates)
  • sim: test_n=1000
  • aimodel: max_n_train=5700 (20d), autoregressive_n=12, ClassifLinearRidge, weight_recent=10x_5x, balance_classes=None, calibrate_probs=CalibratedClassifierCV_Sigmoid, train_every_n_epochs=1
  • trader: buy_amt=1000 USD, fee_percent=0.025%, conf=90%, SL/TP=0.2%


trentmc commented Jan 28, 2025

Performance profiling, test ensembles

Explores:

  • Q: Can I speed things up? (A: not much)
  • Q: Can I make more $ with different ensemble params? (A: not much)

Settings:

  • lake: binance BTC 5m; st=2024-9-28, fin=2025-01-28
  • sim: test_n=1000
  • aimodel: max_n_train=5700 (20d), autoregressive_n=12, ClassifLinearRidge, weight_recent=10x_5x, balance_classes=None, calibrate_probs=CalibratedClassifierCV (Sigmoid, cv=5, ensemble=True), train_every_n_epochs=1
  • trader: buy_amt=1000 USD, fee_percent=0.025%, conf=90%, SL/TP=0.2%

Usage:

Calc runtime / profile, on shorter sim:

  • Run: python -m cProfile -o profile.stats pdr sim my_ppss_profiling.yaml
  • View results: view_stats
  • Update table: "profiling: runtime"
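
If the repo's view_stats helper isn't at hand, the stdlib can inspect profile.stats directly (a generic alternative, not the repo's tooling):

    # Inspect the cProfile output; sorting by cumulative time shows where sim time goes.
    import pstats

    stats = pstats.Stats("profile.stats")
    stats.sort_stats("cumulative").print_stats(25)  # top 25 entries by cumulative time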

Calc profit, on longer sim:

  • Run: pdr sim my_ppss_makemoney.yaml
  • View cumul profit on last output line
  • Update table: "makemoney: profit"

Results: Perf profiling

Q: Can I speed things up? A: not much

ID | profiling: runtime | makemoney: profit | Label / change from prev                                    | Success?
1  | 39.91              | skip              | main branch baseline                                        | N/A
2  | 36.67              | $46.09            | from (1), new tdr branch. No predictoor, no model stats     | Y
3  | 17.48              | -$102.79          | from (2), train_every_n_epochs=10                           | N
4  | 43.04              | skip              | from (2), conf=0.9, CalibratedClassifierCV cv=5, ensemble=F | N

Results: test ensembles

Q: Can I make more $ with different ensemble params? A: not much

ID | profiling: runtime | makemoney: profit | Label / change from prev                                       | Success?
2  | 36.67              | $46.09            | from (1), new tdr branch. No predictoor, no model stats        | Y
4  | 43.04              | skip              | from (2), conf=0.9, CalibratedClassifierCV cv=5, ensemble=F    | N
5  | skip               | -$23 at 10Kep     | from (2), conf=0.9, CalibratedClassifierCV cv=10, ensemble=T   | N
6  | skip               | -$42 at 4Kep      | from (2), conf=0.75, CalibratedClassifierCV cv=10, ensemble=T  | N
7  | skip               | -$8 at 2.5Kep     | from (2), conf=0.75, CalibratedClassifierCV cv=100, ensemble=T | N
8  | skip               | -$8 at 2.5Kep     | from (2), conf=0.6, CalibratedClassifierCV cv=100, ensemble=T  | N
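
For reference, what the cv / ensemble knobs in the rows above mean in scikit-learn terms. The base estimator below is a guess at what ClassifLinearRidge wraps (sklearn's RidgeClassifier), so treat it as illustrative only:

    # How the cv / ensemble knobs map to sklearn's CalibratedClassifierCV.
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.linear_model import RidgeClassifier

    base = RidgeClassifier()  # illustrative base; calibration turns its scores into probabilities

    # ensemble=True: keep one calibrated copy per CV fold and average their probabilities.
    clf_ensemble = CalibratedClassifierCV(base, method="sigmoid", cv=10, ensemble=True)

    # ensemble=False: refit one estimator on all the data, calibrated via cross-validated predictions.
    clf_single = CalibratedClassifierCV(base, method="sigmoid", cv=5, ensemble=False)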


trentmc commented Jan 28, 2025

Try idea: Center model on most recently seen price

The problem: the model spends too much of its modeling effort on true/false at many different absolute close prices. It would have a simpler modeling task if the "close price" to model were always zero; then it just needs to output a value > 0 ("up") or < 0 ("down").

How: make output y to be an "amt of change wrt prev close" (can be a %, or absolute difference)

To prototype: I implemented this in sim_engine.py. Here's the diff, and the updated sim_engine.py.
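
For readers without the diff handy, here's a minimal sketch of the transform idea (not the actual sim_engine.py change): y becomes the % change vs. the most recent close, and the autoregressive inputs are expressed relative to that same close. Parameter names are illustrative.

    # Sketch of "center on most recently seen price": express X and y relative to the latest close.
    import numpy as np


    def center_on_recent(close: np.ndarray, ar_n: int = 12):
        """close: 1D array of close prices, oldest first.
        X row t: the ar_n previous closes, each as a % change vs. the most recent close (close[t]).
        y[t]:    the next close, as a % change vs. the most recent close."""
        X, y = [], []
        for t in range(ar_n, len(close) - 1):
            recent = close[t]
            X.append((close[t - ar_n:t] / recent - 1.0) * 100.0)
            y.append((close[t + 1] / recent - 1.0) * 100.0)
        return np.array(X), np.array(y)


    # Classifier framing: y_up = (y > 0), i.e. "will the next close be above the most recent close?"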

Result: Init runs

I tried some quick experiments with linear models; results were basically random. Then I tried a GPM classifier: maybe good, but super-slow. Then I tried xgboost: slower than lin, much faster than GPM, so I moved forward with it. The results below are xgboost.

Result: xgboost, fees=0. Made $963.01

Summary: Made $963.01, APY = 4766% (!). Calc: (1+963.01/90/1000)^365-1 = 47.657

Iter #26275/26275 ut=1738022400000 dt=2025-01-28_00:00 ║ prob_up=0.218 conf_up=0.000 conf_down=0.564 ║ # trades=10212 ║ tdr_profit=$ 0.00 (cumul $963.01)

Detailed settings:

  • lake: binance BTC 5m; st=2024-9-28, fin=2025-01-28 (the standardized dates)
  • sim: test_n=26275 epochs (3 mos)
  • aimodel: max_n_train=5700 (20d), autoregressive_n=12, ClassifXgboost, weight_recent=None, balance_classes=None, calibrate_probs=None, train_every_n_epochs=1
  • trader: buy_amt=1000 USD, fee_percent=0.0%, conf=0%, SL/TP=0.2%
  • All settings: my_ppss.yaml.txt

Full log: out_1738066991328.txt

Result: xgboost, fees=0.025%, conf=0, SL/TP=0.2%, max_n_train=5700 (20d). Lost $439 after 1.7K ep, $2.7K after 12K ep

Other settings like ^.

Iter #1739/26275 ut=1730661600000 dt=2024-11-03_19:20 ║ prob_up=0.887, conf=0.773, 49.7% correct ( 51.5% on trades) ║ # trades= 569 ( 32.72%) ║ tdr_profit=$ 0.00 (cumul $-439.56)
...
Iter #12058/26275 ut=1733757300000 dt=2024-12-09_15:15 ║ prob_up=0.855, conf=0.710, 49.8% correct ( 50.2% on trades) ║ # trades=4330 ( 35.91%) ║ tdr_profit=$ 0.00 (cumul $-2695.11)

Log: out_1738151337210.txt

Result: xgboost, fees=0.025%, conf=75%, SL/TP=0.2%, max_n_train=5700 (20d). Lost $405 after 2K ep

Other settings like ^.

Iter #1999/26275 ut=1730739600000 dt=2024-11-04_17:00 ║ prob_up=0.785 conf_up=0.569 conf_down=0.000 ║ # trades= 554 ║ tdr_profit=$ -4.84 (cumul $-405.47)

Result: xgboost, fees=0.025%, conf=95%, SL/TP=0.2%, max_n_train=5700 (20d). Lost $335 after 3.3K ep

Other settings like ^.

Iter #3356/26275 ut=1731146700000 dt=2024-11-09_10:05 ║ prob_up=0.643 conf_up=0.285 conf_down=0.000 ║ # trades= 387 ║ tdr_profit=$ 0.00 (cumul $-336.09)

Result: xgboost, fees=0.025%, conf=95%, SL/TP=0.2%, max_n_train=8850 (30d). Lost $587 after 8.6K ep

Other settings like ^.

Iter #8260/26275 ut=1732617900000 dt=2024-11-26_10:45 ║ prob_up=0.910 conf_up=0.820 conf_down=0.000 ║ # trades=1125 ║ tdr_profit=$ 0.00 (cumul $-587.32)

Log: out_1738091738527.txt


trentmc commented Feb 3, 2025

Current status:

  • On 0.025% fees, I haven't been able to make $ yet
  • Next step: complete fast model test/tune flow.

But I will pause this for now to spend time on the yiedl bot. Benefits could flow both ways:

  • ideas from the yiedl work can flow back and help this
  • the yiedl codebase isn't very mature. If we ported its core algs to this codebase, it could get super-mature, quickly
