The full memory set can be downloaded here, and the full evaluation set can be downloaded here.
In the folder Memory Set, we provide the memory set that is used to inject private information into MLLMs through fine-tuning. database_full.xlsx contains the database information, and final_data_full.json contains the labels for fine-tuning MLLMs. We also provide two sample scripts that use our Memory Set for supervised fine-tuning with Idefics2 and Xgen3.
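Before fine-tuning, it can help to look at how the label file is organized. The snippet below is a minimal sketch, not part of the provided sample scripts: it loads final_data_full.json and reports its top-level shape without assuming any particular schema. The helper names load_labels and summarize, and the relative path Memory Set/final_data_full.json, are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any


def load_labels(path: str) -> Any:
    """Load the fine-tuning labels from a JSON file (hypothetical helper)."""
    with Path(path).open("r", encoding="utf-8") as f:
        return json.load(f)


def summarize(labels: Any) -> dict:
    """Describe the top-level shape of the loaded JSON without assuming a schema."""
    summary = {"type": type(labels).__name__}
    if isinstance(labels, (list, dict)):
        summary["num_entries"] = len(labels)
    if isinstance(labels, list) and labels and isinstance(labels[0], dict):
        # Field names of the first record, if the file is a list of records.
        summary["first_entry_keys"] = sorted(labels[0].keys())
    elif isinstance(labels, dict):
        # At most ten top-level keys, if the file is a single JSON object.
        summary["top_level_keys"] = sorted(labels.keys())[:10]
    return summary


if __name__ == "__main__":
    # Assumed location; adjust to wherever the downloaded folder was unpacked.
    labels_path = Path("Memory Set") / "final_data_full.json"
    if labels_path.exists():
        print(summarize(load_labels(str(labels_path))))
    else:
        print(f"{labels_path} not found; download the memory set first.")
```

database_full.xlsx can be inspected in a similar way, for example with pandas.read_excel, which needs an Excel engine such as openpyxl to be installed.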
In the folder Evaluation, we provide our evaluation set for the Direct Output Test and the Memory Output Test. In the folder Real Image, we provide examples of our real-world images together with their corresponding database and text data. We also provide code that uses Xgen3 to generate responses to 5 different tasks on the evaluation set, for Direct Disclosure Risks and Retention Risks, respectively.