Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
142 changes: 133 additions & 9 deletions src/crates/heuristics/src/change_identification.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
use tx_indexer_primitives::{
handle::TxHandle,
output_type::OutputType,
traits::abstract_types::{HasNLockTime, HasScriptPubkey, OutputCount, TxConstituent},
};

Expand All @@ -13,7 +14,15 @@ pub struct NaiveChangeIdentificationHueristic;

impl NaiveChangeIdentificationHueristic {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in Hueristic

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, noticed this typo.
it added in a different patch i'll correct it

/// Check if a txout is change based on its containing transaction.
pub fn is_change(txout: impl TxConstituent<Handle: OutputCount>) -> TxOutChangeAnnotation {
/// OP_RETURN outputs are never considered change.
pub fn is_change<T>(txout: T) -> TxOutChangeAnnotation
where
T: TxConstituent<Handle: OutputCount> + HasScriptPubkey,
{
if txout.is_op_return() {
return TxOutChangeAnnotation::NotChange;
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repeating this every in Heuristic a good approach?

Copy link
Copy Markdown
Contributor

@bc1cindy bc1cindy Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UIH1 has the same bug, an OP_RETURN value=0 becomes the min_out and gets classified as change candidate, reproduced with a test

worth thinking about whether to keep filtering per-heuristic, or maybe pushing the filter down to a spendable_outputs() helper that any output-iterating heuristic uses

wdyt?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UIH1 has the same bug, an OP_RETURN value=0 becomes the min_out and gets classified as change candidate, reproduced with a test

@bc1cindy sounds like a useful test. Maybe push it upstream?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xZaddyy have you considered a new type? SpendableTxConstituent<...>.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SpendableTxConstituent<...> seem like over-engineering, i was thinking of going with this spendable_outputs() initially.
what do you think @arminsabouri ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why? Its accomplishing the same thing but through type safety and should reduce on total lines of code. Its just a wrapper.
We have a TxConstituent (or TxOut-like type) we impl TryInto<Spendable_*> if its not an unspendable output we convert otherwise errorout.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, seems I misunderstood the implementation in my head.
I'll update

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming back to this thread. I actually think the problem is with out implemention of UIH1. Its a filtering operation when it should be outputing change annotation (or some similar annotation). i.e its inputs should be a spendable txout. not TxOutSet. For UIH2, unspendable outputs are validate candidates. I would just revert the changes to UIH code from this PR and create a follow up ticket to refactor UIH1


let tx = txout.containing_tx();
if tx.output_count() > 0 && txout.vout() == tx.output_count() - 1 {
TxOutChangeAnnotation::Change
Expand All @@ -26,10 +35,16 @@ impl NaiveChangeIdentificationHueristic {
pub struct NLockTimeChangeIdentification;

impl NLockTimeChangeIdentification {
pub fn is_change(
tx_out: impl TxConstituent<Handle: HasNLockTime>,
spending_tx: impl HasNLockTime,
) -> TxOutChangeAnnotation {
/// Check if a txout is change based on nLockTime comparison.
/// OP_RETURN outputs are never considered change.
pub fn is_change<T>(tx_out: T, spending_tx: impl HasNLockTime) -> TxOutChangeAnnotation
where
T: TxConstituent<Handle: HasNLockTime> + HasScriptPubkey,
{
if tx_out.is_op_return() {
return TxOutChangeAnnotation::NotChange;
}

let containing_tx_n_locktime = tx_out.containing_tx().n_locktime();
let child_tx_n_locktime = spending_tx.n_locktime();
if containing_tx_n_locktime == 0 && child_tx_n_locktime == 0 {
Expand All @@ -55,9 +70,16 @@ impl ScriptTypesMatchingChangeIdentification {
/// This applies the address-type heuristic conservatively: mixed input
/// types, unresolved prevouts, or multiple matching outputs are all treated
/// as inconclusive and return `NotChange`.
pub fn is_change<'a>(
tx_out: impl TxConstituent<Handle = TxHandle<'a>>,
) -> TxOutChangeAnnotation {
///
/// OP_RETURN outputs are never considered change.
pub fn is_change<'a, T>(tx_out: T) -> TxOutChangeAnnotation
where
T: TxConstituent<Handle = TxHandle<'a>> + HasScriptPubkey,
{
if tx_out.is_op_return() {
return TxOutChangeAnnotation::NotChange;
}

let tx = tx_out.containing_tx();
let mut input_types = tx.inputs().map(|input| input.output_type());

Expand All @@ -77,7 +99,11 @@ impl ScriptTypesMatchingChangeIdentification {
let matching_outputs: Vec<usize> = tx
.outputs()
.enumerate()
.filter_map(|(index, output)| (output.output_type() == input_type).then_some(index))
.filter_map(|(index, output)| {
let out_type = output.output_type();
// Skip OP_RETURN outputs when looking for change candidates
(out_type == input_type && out_type != OutputType::OpReturn).then_some(index)
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is dead code

in any valid Bitcoin tx, input_type comes from a spent prev_outpoint and OP_RETURN outputs are unspendable by consensus (IsUnspendable() in Bitcoin Core), so out_type == input_type already filters them

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thank you.
i didn't really think in this direction

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code still here

.collect();

if matching_outputs.len() == 1 && matching_outputs[0] == tx_out.vout() {
Expand Down Expand Up @@ -253,6 +279,104 @@ mod tests {
);
}

#[test]
fn test_script_types_matching_excludes_op_return() {
let storage = storage_from_loose_txs(vec![
// inputs: P2PKH
DummyTxData::new_with_outputs(vec![DummyTxOutData::new_with_script(
100,
0,
script_from_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"),
)]),
DummyTxData::new_with_outputs(vec![DummyTxOutData::new_with_script(
150,
0,
script_from_address("1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2"),
)]),
DummyTxData::new(
vec![
// OP_RETURN output - should never be considered change
DummyTxOutData::new_with_script(
0,
0,
vec![0x6a, 0x04, 0x48, 0x65, 0x6c, 0x6c], // OP_RETURN "Hell"
),
// P2PKH output - should be considered change since it's the only spendable P2PKH
DummyTxOutData::new_with_script(
249,
1,
script_from_address("1BoatSLRHtKNngkdXEeobR76b53LETtpyT"),
),
],
vec![TxOutId::new(TxId(1), 0), TxOutId::new(TxId(2), 0)],
0,
),
]);

let op_return_output = AnyOutId::from(TxOutId::new(TxId(3), 0)).with(&storage);
let p2pkh_output = AnyOutId::from(TxOutId::new(TxId(3), 1)).with(&storage);

// OP_RETURN should never be classified as change
assert_eq!(
ScriptTypesMatchingChangeIdentification::is_change(op_return_output),
TxOutChangeAnnotation::NotChange
);

// P2PKH output should be change since it's the only spendable output matching input type
assert_eq!(
ScriptTypesMatchingChangeIdentification::is_change(p2pkh_output),
TxOutChangeAnnotation::Change
);
}

#[test]
fn test_all_heuristics_filter_op_return() {
// Create an OP_RETURN output to test that all heuristics filter it out
let op_return_script = vec![0x6a, 0x04, 0x48, 0x65, 0x6c, 0x6c]; // OP_RETURN "Hell"

// Test NaiveChangeIdentificationHueristic with OP_RETURN as last output
let txout_op_return = DummyTxOut {
vout: 1, // Last output in a 2-output tx
containing_tx: DummyTxData::new(
vec![
DummyTxOutData::new_with_script(
100,
0,
script_from_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"),
),
DummyTxOutData::new_with_script(0, 1, op_return_script.clone()),
],
vec![],
0,
),
};
// Even though it's the last output, OP_RETURN should not be change
assert_eq!(
NaiveChangeIdentificationHueristic::is_change(txout_op_return),
TxOutChangeAnnotation::NotChange
);

// Test NLockTimeChangeIdentification with OP_RETURN
let tx_out_op_return = DummyTxOut {
vout: 0,
containing_tx: DummyTxData::new(
vec![DummyTxOutData::new_with_script(
0,
0,
op_return_script.clone(),
)],
vec![],
1, // Non-zero locktime
),
};
let spending_tx = DummyTxData::new(vec![], vec![], 1); // Same locktime
// Even with matching locktimes, OP_RETURN should not be change
assert_eq!(
NLockTimeChangeIdentification::is_change(tx_out_op_return, spending_tx),
TxOutChangeAnnotation::NotChange
);
}

#[test]
fn test_script_types_matching_requires_unique_output_match() {
let storage = storage_from_loose_txs(vec![
Expand Down
11 changes: 11 additions & 0 deletions src/crates/primitives/src/test_utils/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -247,3 +247,14 @@ impl TxConstituent for DummyTxOut {
self.vout
}
}

impl HasScriptPubkey for DummyTxOut {
fn script_pubkey_bytes(&self) -> Vec<u8> {
// Get the script pubkey from the corresponding output in the containing transaction
self.containing_tx
.outputs()
.nth(self.vout)
.map(|output| output.script_pubkey_bytes())
.unwrap_or_default()
}
}
4 changes: 4 additions & 0 deletions src/crates/primitives/src/traits/abstract_types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,10 @@ pub trait HasScriptPubkey {
fn output_type(&self) -> OutputType {
classify_script_pubkey(&self.script_pubkey_bytes())
}

fn is_op_return(&self) -> bool {
self.output_type() == OutputType::OpReturn
}
}

/// Transaction version
Expand Down
Loading