This repository has been archived by the owner on Dec 22, 2020. It is now read-only.
I'm curious whether you've looked into GridFS support. GridFS is split across two consistently named collections (fs.files and fs.chunks), and the driver ships a standalone adapter for fetching a GridFS file by filename or id, so I think it merits its own functionality rather than mapping both collections to Postgres and trying to reassemble the files on that side. I did some preliminary testing to see if it could work, using a '$gridfs' special source to trigger GridFS handling and then using the original document to grab the GridFS file by id.
However, I'm currently running into encoding issues: some imports succeed (large plaintext files, some PDFs), and then others fail partway through with this error:
# in transform_to_copy()
'join': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
My modification of fetch_special_source():
    def fetch_special_source(db, ns, obj, source, original)
      case source
      when "$timestamp"
        Sequel.function(:now)
      when "$gridfs"
        dbname, collection = ns.split(".", 2)
        if collection == 'fs.files'
          grid = Grid.new(db)
          file = grid.get(original["_id"])
          Sequel::SQL::Blob.new(file.read)
        end
      when /^\$exists (.+)/
        # We need to look in the cloned original object, not in the version that
        # has had some fields deleted.
        fetch_exists(original, $1)
      else
        raise SchemaError.new("Unknown source: #{source}")
      end
    end
(I also tried various combinations of hex transforms and UTF-8 encoding, but it still eventually gave me that ASCII-8BIT error. For reference, the column type it's inserting into is BYTEA.)
Also, I had to add a db adapter argument to every method in the call chain from import_collections() in streamer.rb down to fetch_special_source() in schema.rb in order to create the GridFS object instance in fetch_special_source(). This seems bad; do you have a recommendation for where to put it?
Hey, I haven't looked at implementing GridFS support, since I don't use it anywhere.
I agree that adding support might be useful, and I'd consider a PR. It'd probably be easier to review a strawman PR than try to speculate about the code via a description.
Hex-encoding the binary data is probably the way forward to fix the encoding issue, but I'd try to replicate it in a test and then add a bunch of debug prints or thereabouts to understand what's going on.
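As a starting point for such a test, here is a minimal sketch in plain Ruby (no MoSQL or Mongo code involved) that reproduces the same class of error and shows one way hex-encoding could sidestep it. The helper name bytea_copy_literal is hypothetical and not part of MoSQL, and the sketch assumes the row is sent through COPY in text format, where PostgreSQL accepts a bytea value written as \x followed by hex digits and the backslash itself has to be doubled. One plausible reading of why only some files fail is that Ruby only refuses to join a UTF-8 string and a binary string when both contain non-ASCII bytes.

```ruby
# Hypothetical helper (not part of MoSQL): render raw bytes as a PostgreSQL
# bytea hex literal suitable for COPY text format. The backslash is doubled
# because COPY text format treats a single backslash as an escape character.
def bytea_copy_literal(bytes)
  "\\\\x" + bytes.unpack1("H*")
end

utf8_field   = "caf\u00e9"        # UTF-8 string containing a non-ASCII character
binary_field = "\xFF\xD8\xFF".b   # binary (ASCII-8BIT) string containing high bytes

# Joining the two directly raises Encoding::CompatibilityError.
join_error =
  begin
    [utf8_field, binary_field].join("\t")
    nil
  rescue Encoding::CompatibilityError => e
    e
  end

# Binary data that happens to be ASCII-only joins without complaint, which
# may be why plaintext files import cleanly.
ascii_only_row = [utf8_field, "plain text".b].join("\t")

# Hex-encoding first yields an ASCII-only field, so the join is always safe.
safe_row = [utf8_field, bytea_copy_literal(binary_field)].join("\t")

puts join_error.class
puts ascii_only_row.encoding
puts safe_row
```

If this matches what happens inside transform_to_copy(), converting the Sequel::SQL::Blob to its hex form before the join, rather than after, would be the place to apply the fix; that is an assumption to confirm with the debug prints suggested above.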