Always delete `raw` Colabfold folder before processing

Hi Team,

This is not an issue per se, but it is a strong suggestion from an end user. An openfold job can fail for any number of reasons, and often this happens before cleanup, i.e. the `raw` colabfold output folder and stale out.tar.gz files remain in the MSA dir. Subsequent runs skip hitting the Colabfold API because of


```python
if not os.path.isfile(tar_gz_file):
```
in `core/data/tools/colabfold_msa_server.py`



which leads to the pipeline reusing the old colabfold `raw` output folder for the new query, because this check only tests whether the file exists, not whether it matches the current query. This results in nondescript downstream errors that are very difficult for the end user to investigate.

The `tar_gz_file` check is redundant anyway: if the query runs properly, `tar_gz_file` is deleted along with the entire `raw` folder, so on a rerun the pipeline always calls the API again.

My suggestion is to always delete the `raw` colabfold folder before this line runs. What do you think? Thanks! 
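To make the suggestion concrete, here is a minimal sketch of what I have in mind. The helper name `reset_raw_dir` and the `msa_dir` argument are hypothetical and not taken from the openfold code, and I am assuming the tarball lives inside the `raw` folder; the real layout in `colabfold_msa_server.py` may differ.

```python
import shutil
from pathlib import Path


def reset_raw_dir(msa_dir: Path) -> Path:
    """Remove any leftover `raw` folder and recreate it empty.

    Hypothetical helper: assumes stale Colabfold output (including
    out.tar.gz) sits under <msa_dir>/raw.
    """
    raw_dir = msa_dir / "raw"
    # ignore_errors=True makes this a no-op when the folder does not exist.
    shutil.rmtree(raw_dir, ignore_errors=True)
    raw_dir.mkdir(parents=True, exist_ok=True)
    return raw_dir
```

Calling something like this just before the `os.path.isfile(tar_gz_file)` check would mean the check never sees a tarball left over from a failed run, so the API is queried for the current input.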



Always delete `raw` Colabfold folder before processing #39
