Copilot AI commented Jan 4, 2026

DuckDuckGo maintains a categorized list of 25+ data type classifications for tracker domains (Session Replay, Action Pixels, Federated Login, etc.). This PR integrates that data to show users what types of data each tracker collects.

Changes

Data Loading

  • Added ddg-data-types.csv (708 domains) from DuckDuckGo Tracker Radar to assets
  • Implemented CSV parser in TrackerList.loadDdgDataTypes() with proper handling of quoted fields
  • Maps domains to data types at initialization, applies to trackers via addTrackerDomain()
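A minimal sketch of what quote-aware CSV parsing might look like. This is not the PR's actual `TrackerList.loadDdgDataTypes()` code: the class name, method names, and the assumed two-column layout (a header row, then a domain followed by a quoted, comma-separated list of data types) are all hypothetical, since the real layout of ddg-data-types.csv is not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and CSV layout are assumptions, not the PR's code.
class DdgDataTypesCsvSketch {

    // Splits one CSV line into fields, honouring double-quoted fields
    // and the "" escape for a literal quote inside a quoted field.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are literal inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Assumed layout: first line is a header; each later line is
    // domain,"Type A, Type B". Lines with fewer than two fields are skipped.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> byDomain = new LinkedHashMap<>();
        for (int i = 1; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) continue;
            List<String> fields = splitCsvLine(line);
            if (fields.size() < 2) continue;
            String domain = fields.get(0).trim().toLowerCase(Locale.ROOT);
            List<String> types = new ArrayList<>();
            for (String t : fields.get(1).split(",")) {
                String trimmed = t.trim();
                if (!trimmed.isEmpty()) types.add(trimmed);
            }
            if (!domain.isEmpty() && !types.isEmpty()) byDomain.put(domain, types);
        }
        return byDomain;
    }
}
```

In the app, the lines would presumably come from a reader over the bundled asset. Note that this sketch does not handle quoted fields that span multiple lines.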

Data Model

  • Extended Tracker class with dataTypes field (List)
  • Defensive getters/setters: creates copies on set, returns unmodifiable view on get
  • Lazy initialization to avoid allocating lists for trackers without data types
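The defensive-copy and lazy-initialization pattern described above could look roughly like the following. The class `TrackerSketch` is a hypothetical stand-in for the real `Tracker` class, and the element type `String` is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the Tracker class; only the dataTypes handling is shown.
class TrackerSketch {
    private final String name;

    // Stays null until data types are actually set (lazy initialization),
    // so trackers without data types allocate no list.
    private List<String> dataTypes;

    TrackerSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Copies the caller's list so later changes to it cannot leak in.
    void setDataTypes(List<String> types) {
        if (types == null || types.isEmpty()) {
            dataTypes = null;
            return;
        }
        dataTypes = new ArrayList<>(types);
    }

    // Returns a read-only view; callers cannot modify internal state.
    List<String> getDataTypes() {
        if (dataTypes == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(dataTypes);
    }
}
```

Returning the shared immutable empty list for trackers without data types keeps the getter null-safe without allocating anything.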

UI Display

  • Modified TrackersListAdapter.updateText() to show data types inline with tracker name
  • Displays as: "Google (2h ago)\nData collected: Advertising, Analytics, Action Pixels\n• domains..."
  • Added data_collected string resource

Example Output

Before:

Google (2 hours ago)
• google-analytics.com
• doubleclick.net

After:

Google (2 hours ago)
Data collected: Advertising, Analytics, Action Pixels
• google-analytics.com
• doubleclick.net
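As a rough illustration, the "After" text could be assembled along these lines. This is a sketch, not the actual `TrackersListAdapter.updateText()` code: the class and method are hypothetical, and the "Data collected" label is hard-coded here, whereas the PR reads it from the data_collected string resource.

```java
import java.util.List;

// Hypothetical sketch of the display text; not the adapter's real code.
class TrackerTextSketch {

    // Builds "Name (last seen)", an optional data-types line, then one bullet per domain.
    static String format(String name, String lastSeen,
                         List<String> dataTypes, List<String> domains) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append(" (").append(lastSeen).append(")");
        // Only add the extra line when the tracker has known data types.
        if (dataTypes != null && !dataTypes.isEmpty()) {
            sb.append("\nData collected: ").append(String.join(", ", dataTypes));
        }
        for (String domain : domains) {
            sb.append("\n\u2022 ").append(domain); // \u2022 is the bullet character
        }
        return sb.toString();
    }
}
```

With no data types, the output falls back to the "Before" layout, so trackers missing from the CSV would render unchanged.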

Updated README credits for DuckDuckGo data types.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dl.google.com
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-opens=java.base/java.nio.charset=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.xml/javax.xml.namespace=ALL-UNNAMED -Xmx4608m -Dfile.encoding=UTF-8 -Duser.country -Duser.language=en -Duser.variant -cp /home/REDACTED/.gradle/wrapper/dists/gradle-8.13-bin/5xuhj0ry160q40clulazy9h7d/gradle-8.13/lib/gradle-daemon-main-8.13.jar (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>Integrate DDG's list of data types into TC</issue_title>
<issue_description>DuckDuckGo maintains a list of data types that can be collected by trackers.

This could be integrated into TC.</issue_description>

Comments on the Issue (you are @copilot in this section)

@kasnder Thanks, this is really cool!

Copilot AI changed the title from "[WIP] Integrate DuckDuckGo's data types into TC" to "Integrate DuckDuckGo Tracker Radar data types into TrackerControl" Jan 4, 2026
Copilot AI requested a review from kasnder January 4, 2026 21:57
kasnder left a comment

Interesting stuff. This feels like a job unfinished though. Why not also integrate DDG's list of tracker domains, too? Do they offer any categories, like Mozilla does? I don't think they do..

Copilot AI commented Jan 4, 2026

Interesting stuff. This feels like a job unfinished though. Why not also integrate DDG's list of tracker domains, too? Do they offer any categories, like Mozilla does? I don't think they do..

Great question! I focused on DDG's data type classifications (708 domains) rather than their full domain list (51,841+ domains) for a few reasons:

Technical considerations:

  1. Size & Performance: DDG's full domain list would add ~52k JSON files (~100MB+), significantly increasing app size and load time. The current X-Ray + Disconnect lists already cover ~1M domains from app analysis.

  2. Category Structure: DDG's domain files don't include the same category classifications (Advertising/Analytics/etc.) that Disconnect provides. Their categorization is in the separate CSV we integrated for data types.

  3. Entity Mapping: While DDG has entity ownership info, it would require significant refactoring to merge with our existing X-Ray/Disconnect domain-to-company mapping system.

What we integrated:

  • DDG's data type classifications (25+ categories like "Session Replay", "Action Pixels", etc.) - this is the unique value DDG provides that Disconnect doesn't offer.

Possible future enhancement:
We could selectively add DDG domains not covered by X-Ray/Disconnect, but that would require:

  1. Filtering for new domains only
  2. Mapping DDG entities to our company structure
  3. Performance testing with the larger dataset
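Step 1 could be sketched as a simple set difference. The class and method names below are hypothetical, and the sketch assumes domains are compared case-insensitively as plain strings, without any subdomain matching.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: keep only DDG domains absent from the existing lists.
class NewDomainFilterSketch {

    static Set<String> onlyNew(Collection<String> ddgDomains,
                               Collection<String> knownDomains) {
        // Normalize the known domains once for fast lookups.
        Set<String> known = new HashSet<>();
        for (String d : knownDomains) {
            known.add(d.trim().toLowerCase(Locale.ROOT));
        }
        // Preserve DDG's ordering while dropping anything already covered.
        Set<String> result = new LinkedHashSet<>();
        for (String d : ddgDomains) {
            String normalized = d.trim().toLowerCase(Locale.ROOT);
            if (!known.contains(normalized)) {
                result.add(normalized);
            }
        }
        return result;
    }
}
```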

Would you like me to explore adding DDG domains that aren't already covered by X-Ray/Disconnect? Or is the current data type integration sufficient?

Copilot AI requested a review from kasnder January 4, 2026 22:33

kasnder commented Jan 4, 2026

No, don't implement that. Matching on differing company names to determine data types appears fragile, though. Probably not a good idea.

huuhaa commented Jan 5, 2026

Is this solution using the general DDG tracker data, or the lists used by their App Tracking Protection (ATP)?

The ATP lists might be smaller and a better fit for TC.

https://github.com/duckduckgo/tracker-blocklists/tree/main/app

I suppose at least the following could be useful, if it's not in use already:

https://raw.githubusercontent.com/duckduckgo/tracker-blocklists/main/app/android-tds.json

PS. Again, I'm writing these thoughts as a non-developer, just because I love this app and wish to see it improved even further. Sorry if this comment is useless.
