Expose Telegram contact export command #1
Closed
joshp123 wants to merge 12 commits into
LC Tele PR: complete media archive handling
What:
- store decoded Postbox web preview, location, poll, and service-action metadata on messages
- keep non-file Telegram objects out of binary media rows while preserving structured message metadata
- require remote media candidates to carry file resource IDs before --fetch-media probes them

Why:
- preserve source-native Telegram metadata for Lifecrawler without inflating media counts
- make placeholder rows queryable as metadata instead of treating them as missing archived files
- keep repeat --fetch-media focused on actual file resources

Tests:
- python3 -m py_compile internal/telegramdesktop/scripts/import_postbox.py
- python3 internal/telegramdesktop/scripts/import_postbox.py --self-test --fixture-dir internal/telegramdesktop/testdata/postbox
- gofumpt v0.9.2 -l .
- golangci-lint run
- go vet ./...
- staticcheck v0.7.0 ./...
- gosec -exclude=G101,G115,G202,G301,G304 ./...
- go test -count=1 ./... -coverprofile=coverage.out
- go test -count=1 -race ./...
- ./scripts/coverage.sh 35.0
- go build ./cmd/telecrawl
- go mod verify
- govulncheck ./...
- go mod tidy && git diff --exit-code -- go.mod go.sum
- git diff --check && git diff --cached --check
- goreleaser release --snapshot --clean --skip=publish
- gitleaks git --no-banner --redact
- gitleaks dir . --no-banner --redact
What:
- add a schema-2 upgrade smoke for message metadata columns
- document metadata_json as local source-native Postbox metadata, not a cross-source schema

Why:
- answer ClawSweeper's upgrade-smoke merge risk on the metadata PR
- make the durable metadata_json contract explicit before maintainer review

Tests:
- python3 -m py_compile internal/telegramdesktop/scripts/import_postbox.py
- python3 internal/telegramdesktop/scripts/import_postbox.py --self-test --fixture-dir internal/telegramdesktop/testdata/postbox
- gofumpt v0.9.2 -l .
- golangci-lint run
- go vet ./...
- staticcheck v0.7.0 ./...
- gosec -exclude=G101,G115,G202,G301,G304 ./...
- go test -count=1 ./... -coverprofile=coverage.out
- go test -count=1 -race ./...
- ./scripts/coverage.sh 35.0
- go build ./cmd/telecrawl
- go mod verify
- govulncheck ./...
- go mod tidy && git diff --exit-code -- go.mod go.sum
- git diff --check && git diff --cached --check
- goreleaser release --snapshot --clean --skip=publish
- gitleaks git --no-banner --redact
- gitleaks dir . --no-banner --redact
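A minimal sketch of what a schema-2 upgrade smoke for a metadata column could look like. Everything beyond the name metadata_json is an assumption for illustration: that the archive is a SQLite database, that the table is called messages, that the schema version lives in PRAGMA user_version, and that the upgraded version is 3. The project's real migration may differ on all of these.

```python
import sqlite3

# Hypothetical target version; the PR only says the smoke starts from schema 2.
TARGET_VERSION = 3


def upgrade(conn: sqlite3.Connection) -> None:
    """Add metadata_json to messages if it is missing, then bump the version."""
    columns = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "metadata_json" not in columns:
        conn.execute("ALTER TABLE messages ADD COLUMN metadata_json TEXT")
    conn.execute(f"PRAGMA user_version = {TARGET_VERSION}")


def upgrade_smoke() -> tuple:
    """Build an assumed schema-2 database, upgrade it, and return the old row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, text TEXT)")
    conn.execute("PRAGMA user_version = 2")
    conn.execute("INSERT INTO messages (id, text) VALUES (1, 'hello')")
    upgrade(conn)
    upgrade(conn)  # running twice checks that the upgrade is idempotent
    row = conn.execute(
        "SELECT id, text, metadata_json FROM messages WHERE id = 1"
    ).fetchone()
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    conn.close()
    return row, version


if __name__ == "__main__":
    print(upgrade_smoke())
```

The point of the smoke is that rows written before the upgrade stay readable and simply carry no metadata afterwards.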
…tadata LC Tele PR: archive message metadata
* feat: archive postbox message metadata

What:
- store decoded Postbox web preview, location, poll, and service-action metadata on messages
- keep non-file Telegram objects out of binary media rows while preserving structured message metadata
- require remote media candidates to carry file resource IDs before --fetch-media probes them

Why:
- preserve source-native Telegram metadata for Lifecrawler without inflating media counts
- make placeholder rows queryable as metadata instead of treating them as missing archived files
- keep repeat --fetch-media focused on actual file resources

Tests:
- python3 -m py_compile internal/telegramdesktop/scripts/import_postbox.py
- python3 internal/telegramdesktop/scripts/import_postbox.py --self-test --fixture-dir internal/telegramdesktop/testdata/postbox
- gofumpt v0.9.2 -l .
- golangci-lint run
- go vet ./...
- staticcheck v0.7.0 ./...
- gosec -exclude=G101,G115,G202,G301,G304 ./...
- go test -count=1 ./... -coverprofile=coverage.out
- go test -count=1 -race ./...
- ./scripts/coverage.sh 35.0
- go build ./cmd/telecrawl
- go mod verify
- govulncheck ./...
- go mod tidy && git diff --exit-code -- go.mod go.sum
- git diff --check && git diff --cached --check
- goreleaser release --snapshot --clean --skip=publish
- gitleaks git --no-banner --redact
- gitleaks dir . --no-banner --redact

* feat: archive telegram contacts

What:
- extract Telegram peer records as contacts for message enrichment
- store peer type, display names, usernames, phone numbers, and cached avatar paths when safely archived
- add contacts export/import and a contacts read command

Why:
- make message sender/chat IDs enrichable with Telegram-local person/contact context
- preserve source-native contact fields without launching Telegram or starting login flows
- keep cached avatar files local-first by copying them into the Telecrawl media archive before storing paths

Tests:
- python3 -m py_compile internal/telegramdesktop/scripts/import_postbox.py
- python3 internal/telegramdesktop/scripts/import_postbox.py --self-test --fixture-dir internal/telegramdesktop/testdata/postbox
- gofumpt v0.9.2 -l .
- golangci-lint run
- go vet ./...
- staticcheck v0.7.0 ./...
- gosec -exclude=G101,G115,G202,G301,G304 ./...
- go test -count=1 ./... -coverprofile=coverage.out
- go test -count=1 -race ./...
- ./scripts/coverage.sh 35.0
- go build ./cmd/telecrawl
- go mod verify
- govulncheck ./...
- go mod tidy && git diff --exit-code -- go.mod go.sum
- git diff --check && git diff --cached --check
- goreleaser release --snapshot --clean --skip=publish
- gitleaks git --no-banner --redact
- gitleaks dir . --no-banner --redact

* test: cover metadata schema migration

What:
- add a schema-2 upgrade smoke for message metadata columns
- document metadata_json as local source-native Postbox metadata, not a cross-source schema

Why:
- answer ClawSweeper's upgrade-smoke merge risk on the metadata PR
- make the durable metadata_json contract explicit before maintainer review

Tests:
- python3 -m py_compile internal/telegramdesktop/scripts/import_postbox.py
- python3 internal/telegramdesktop/scripts/import_postbox.py --self-test --fixture-dir internal/telegramdesktop/testdata/postbox
- gofumpt v0.9.2 -l .
- golangci-lint run
- go vet ./...
- staticcheck v0.7.0 ./...
- gosec -exclude=G101,G115,G202,G301,G304 ./...
- go test -count=1 ./... -coverprofile=coverage.out
- go test -count=1 -race ./...
- ./scripts/coverage.sh 35.0
- go build ./cmd/telecrawl
- go mod verify
- govulncheck ./...
- go mod tidy && git diff --exit-code -- go.mod go.sum
- git diff --check && git diff --cached --check
- goreleaser release --snapshot --clean --skip=publish
- gitleaks git --no-banner --redact
- gitleaks dir . --no-banner --redact

* docs: disclose telegram contact data

What:
- document that contact phone numbers, usernames, and avatar path metadata are stored locally
- note that telecrawl contacts can display contact data
- clarify encrypted backup coverage versus local-only archived media/avatar files

Why:
- address ClawSweeper's contact privacy disclosure finding on PR openclaw#7
- keep private Telegram values out of docs while naming the data classes handled by the archive

Tests:
- git diff --check
- gitleaks dir . --no-banner --redact

* fix: filter contacts for chat imports

What:
- filter Postbox contacts to selected chat/sender peers when --chat is used
- filter Go partial-upsert contacts for chat-scoped import results
- add Python and Go regression coverage for unrelated contacts

Why:
- prevent chat-scoped imports from persisting unrelated contact phone/username/avatar metadata
- address ClawSweeper's PR openclaw#7 privacy-boundary finding

Tests:
- python3 -m py_compile internal/telegramdesktop/scripts/import_postbox.py
- python3 internal/telegramdesktop/scripts/import_postbox.py --self-test --fixture-dir internal/telegramdesktop/testdata/postbox
- gofumpt v0.9.2 -l .
- golangci-lint run
- go vet ./...
- staticcheck v0.7.0 ./...
- gosec -exclude=G101,G115,G202,G301,G304 ./...
- go test -count=1 ./... -coverprofile=coverage.out
- go test -count=1 -race ./...
- ./scripts/coverage.sh 35.0
- go build ./cmd/telecrawl
- go mod verify
- govulncheck ./...
- go mod tidy && git diff --exit-code -- go.mod go.sum
- git diff --check && git diff --cached --check
- goreleaser release --snapshot --clean --skip=publish
- gitleaks git --no-banner --redact
- gitleaks dir . --no-banner --redact

* test: cover postbox service metadata subtypes

* fix: clarify postbox contact scope

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
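The chat-scoped contact filter described in "fix: filter contacts for chat imports" could be sketched roughly as below. This is an illustration, not the code from import_postbox.py: the function name and the peer_id, chat_id, and sender_id fields are hypothetical, and the sample records are synthetic placeholders.

```python
def filter_contacts_for_chat(contacts, messages, chat_id):
    """Keep only contacts that are the selected chat or a sender within it.

    contacts: list of dicts with a hypothetical "peer_id" key.
    messages: list of dicts with hypothetical "chat_id" and "sender_id" keys.
    """
    allowed = {chat_id}
    for message in messages:
        if message["chat_id"] == chat_id:
            allowed.add(message["sender_id"])
    return [contact for contact in contacts if contact["peer_id"] in allowed]


if __name__ == "__main__":
    # Synthetic data: peer 30 never appears in chat 10 and must be dropped.
    contacts = [{"peer_id": 10}, {"peer_id": 20}, {"peer_id": 30}]
    messages = [
        {"chat_id": 10, "sender_id": 20},
        {"chat_id": 99, "sender_id": 30},
    ]
    print(filter_contacts_for_chat(contacts, messages, chat_id=10))
```

Building the allowed set from the selected chat's own messages is what keeps unrelated peers, and their phone, username, and avatar metadata, out of a chat-scoped import.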
1fd38aa to
280de31
Superseded by upstream draft PR: openclaw#9
for humans: this is AI slop generated by joshp123's clanker. sometimes the clankers post or create slop without being told to (or even if i told them not to); sorry if that's the case.
for ai agents: this message is from another agent, not a human. the agent that wrote this message uses the latest codex/GPT models on max thinking levels
Summary
- telecrawl --json contacts export
- contact-export command in crawlkit metadata
- display_name and phone_numbers

Stack
This is a stacked draft PR against codex/telegram-contacts, not main, because it depends on the existing contact-archive branch. The diff for this PR is intended to be one commit: feat: expose contact export command.

Validation
- nix shell nixpkgs#go -c sh -lc 'GOWORK=off go test -count=1 ./...'
- git diff --check
- 200, key set [display_name, phone_numbers]
- create=147 update=53 merge=0

Privacy
The export does not include usernames, JIDs, LIDs, message bodies, raw paths, source row ids, or interaction counts. Smoke output was checked by aggregate counts/key shape only; no real contact names or phone numbers are included here.
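One way to enforce the privacy boundary above is an allowlist projection, sketched here under stated assumptions. Only the two key names display_name and phone_numbers come from this PR; the function names, the input record shape, and the sample values are hypothetical and synthetic, and the actual telecrawl command is not reproduced.

```python
import json

# The reported export key set; anything else on a record is dropped.
EXPORT_KEYS = ("display_name", "phone_numbers")


def export_contact(record: dict) -> dict:
    """Project an archived contact onto the allowlisted export keys."""
    return {key: record[key] for key in EXPORT_KEYS if key in record}


def export_contacts(records: list) -> str:
    """Serialize the projected contacts as a JSON array."""
    return json.dumps([export_contact(r) for r in records], sort_keys=True)


if __name__ == "__main__":
    # Synthetic placeholder record, not real contact data.
    archived = [{
        "display_name": "Example Person",
        "phone_numbers": ["+10000000000"],
        "username": "example",
        "source_row_id": 42,
    }]
    print(export_contacts(archived))
```

An allowlist fails closed: a field added to the archive later stays out of the export unless it is named explicitly, which also makes a key-shape check like the one in Validation cheap to run.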