Skip to content

Conversation

@jke000
Copy link
Collaborator

@jke000 jke000 commented Jan 6, 2026

Revert to making .idx files independent i.e. there's no need for .fasta anymore for peptide index and FI searches once the . idx has been created. This entails writing the protein description info and flanking AA into the .idx, both of which were previously parsed from the .fasta. Also rename Comet's "Scores" struct to "CometScores" as there's some new conflict in Crux code with a "Scores" class in Percolator.

…ASTA is no longer needed to run FI searches. The flanking amino acid residues, which were previously parsed on-the-fly from the FASTA, are no longer reported.
…on indexing) now store protein description text in them, making them self-sufficient for searching. The original FASTA is no longer needed; previous the flanking AA residues and protein accessions were dynamically parsed from the corresponding FASTA. NOTE: flanking AA residues are currently not stored and reported in the .idx searches; I'll work on adding these to the .idx files next. Also MzIdentML output code needs fixing with respect to .idx searches; since the original FASTA is no longer required, I need to properly handle things like exporting the DBSequence element only if the FASTA is present.
…y previously parsed on-the-fly from the FASTA each time after each search (which is pretty horrid from a performance perspective). Now that the FASTA is no longer needed, this info needs to be stored in the .idx. Next, I need to add this to the the old peptide index.
…"struct ScoresMS1" to "struct CometScores"/"struct CometScoresMS1" to avoid some "Scores" naming collision issue with Percolator "Scores" and (2) add parenthesis around "(std::numeric_limits<unsigned int>::max)()" to avoid the "max" macro expansion in Windows.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes .idx files independent so that .fasta files are no longer required after the .idx file has been created. It achieves this by storing protein description information and flanking amino acids directly in the .idx file. Additionally, it renames Comet's Scores struct to CometScores to resolve a naming conflict with Percolator's Scores class in Crux code.

Key Changes:

  • Added flanking amino acid storage (cPrevAA/cNextAA) to indexed peptides so they don't need to be retrieved from FASTA
  • Introduced g_pvProteinNames map to store protein names in the .idx file
  • Added g_bIdxNoFasta flag to handle .idx searches without requiring the corresponding .fasta file
  • Renamed Scores/ScoresMS1 structs to CometScores/CometScoresMS1 throughout the codebase

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
RealtimeSearch/SearchMS1MS2.cs Testing/debug configuration changes (iPrintEveryScan, bPerformMS1Search)
CometWrapper/CometWrapper.cpp Updated to use renamed CometScores and CometScoresMS1 types
CometWrapper/CometDataWrapper.h Updated wrapper classes to use renamed score types
CometSearch/CometWriteMzIdentML.cpp Added logic to skip sequence retrieval when g_bIdxNoFasta is true; updated protein output for indexed databases
CometSearch/CometSearchManager.h Updated method signatures to use renamed score types
CometSearch/CometSearchManager.cpp Major refactoring of file handling to support optional FASTA; removed unused fpfasta references; added g_bIdxNoFasta flag management
CometSearch/CometSearch.h Reorganized includes (moved to header from .cpp)
CometSearch/CometSearch.cpp Added flanking AA calculation during index building and scoring; includes debug code that needs removal
CometSearch/CometPeptideIndex.cpp Modified to write protein names and flanking AAs to .idx file; includes memory leak
CometSearch/CometMassSpecUtils.cpp Simplified protein name retrieval; contains disabled GetPrevNextAA function
CometSearch/CometInterfaces.h Updated interface to use renamed score types
CometSearch/CometFragmentIndex.h Renamed WritePlainPeptideIndex to WriteFIPlainPeptideIndex
CometSearch/CometFragmentIndex.cpp Modified to write protein names and flanking AAs to fragment index; includes memory leak and incomplete implementation
CometSearch/CometDataInternal.h Added cPrevAA/cNextAA fields to DBIndex and PlainPeptideIndexStruct; added g_pvProteinNames map; contains logic error in operator<
CometSearch/CometData.h Renamed Scores to CometScores and ScoresMS1 to CometScoresMS1

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…decoy peptides (relevant only for peptide indexing)
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

CometSearch/CometData.h:192

  • The assignment operator should take a const reference (const CometScoresMS1&) instead of a non-const reference (CometScoresMS1&). This is the standard C++ convention for copy assignment operators.
   CometScoresMS1& operator=(CometScoresMS1& a)
   {
      fDotProduct = a.fDotProduct;
      fRTime = a.fRTime;
      iScanNumber = a.iScanNumber;
      return *this;
   }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jke000 jke000 merged commit 54a9ded into master Jan 7, 2026
11 checks passed
@jke000 jke000 deleted the FI-no_fasta branch January 7, 2026 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants