docs(tvix/docs/TODO): document ChunkService split idea
Change-Id: Ie9c88b0d14902c642e2d3d6603265688eef0e10d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/11755
Reviewed-by: yuka <yuka@yuka.dev>
Tested-by: BuildkiteCI
Autosubmit: flokli <flokli@flokli.de>
parent c4d4cce657
commit 154e0d71e0
1 changed file with 16 additions and 1 deletion
@@ -178,7 +178,22 @@ logs etc, but this is something requiring a lot of designing.
 ### BlobService
 - On the trait side, currently there's no way to distinguish reading a
   known-chunk vs blob, so we might be calling `.chunks()` unnecessarily often.
-  At least for the `object_store` backend, this might be a problem.
+  At least for the `object_store` backend, this might be a problem, causing a
+  lot of round-trips. It also doesn't compose well - every implementation of
+  `BlobService` needs to both solve the "holding metadata about chunking info"
+  as well as "storing chunks" questions.
+  Design idea (@flokli): split these two concerns into two separate traits:
+  - a `ChunkService` dealing with retrieving individual chunks, by their
+    content digests. Chunks are small enough to keep around in contiguous
+    memory.
+  - a `BlobService` storing metadata about blobs.
+
+  Individual stores would not need to implement `BlobReader` anymore, but that
+  could be a global thing with access to the whole store composition layer,
+  which should make it easier to reuse chunks from other backends. Unclear
+  if the write path should be structured the same way. At least for some
+  backends, we want the remote end to be able to decide about chunking.
+
 - While `object_store` recently got support for `Content-Type`
   (https://github.com/apache/arrow-rs/pull/5650), there's no support on the
   local filesystem yet. We'd need to add support to this (through xattrs).
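
To make the proposed split more concrete, here is a minimal sketch of what two separate traits and a store-independent reader might look like. It is not the Tvix API: the sketch is synchronous and std-only, the `Digest` type uses the standard library hasher as a stand-in for a real content hash, and the names `Digest`, `MemoryChunkService`, `MemoryBlobService`, `put`, `put_chunks`, `read_blob` and `demo` are all hypothetical. Only the trait names `ChunkService` and `BlobService` and the `.chunks()` lookup come from the TODO text. The write path is deliberately kept trivial, since the TODO leaves its design open.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical digest type. A std hasher stands in for a real content hash.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Digest(u64);

impl Digest {
    pub fn of(data: &[u8]) -> Self {
        let mut hasher = DefaultHasher::new();
        data.hash(&mut hasher);
        Digest(hasher.finish())
    }
}

/// Concern 1: storing and retrieving individual chunks by content digest.
pub trait ChunkService {
    fn get(&self, digest: &Digest) -> Option<Vec<u8>>;
    fn put(&mut self, data: Vec<u8>) -> Digest;
}

/// Concern 2: metadata only, i.e. which chunks make up a blob.
pub trait BlobService {
    fn chunks(&self, blob: &Digest) -> Option<Vec<Digest>>;
    fn put_chunks(&mut self, blob: Digest, chunks: Vec<Digest>);
}

/// Hypothetical in-memory chunk store.
#[derive(Default)]
pub struct MemoryChunkService(HashMap<Digest, Vec<u8>>);

impl ChunkService for MemoryChunkService {
    fn get(&self, digest: &Digest) -> Option<Vec<u8>> {
        self.0.get(digest).cloned()
    }
    fn put(&mut self, data: Vec<u8>) -> Digest {
        let digest = Digest::of(&data);
        self.0.insert(digest, data);
        digest
    }
}

/// Hypothetical in-memory blob metadata store.
#[derive(Default)]
pub struct MemoryBlobService(HashMap<Digest, Vec<Digest>>);

impl BlobService for MemoryBlobService {
    fn chunks(&self, blob: &Digest) -> Option<Vec<Digest>> {
        self.0.get(blob).cloned()
    }
    fn put_chunks(&mut self, blob: Digest, chunks: Vec<Digest>) {
        self.0.insert(blob, chunks);
    }
}

/// A reader that lives above the individual stores: it asks the blob
/// service for the chunk list once, then fetches each chunk from whichever
/// chunk store has it. Returns None if the blob or any chunk is missing.
pub fn read_blob(
    blobs: &dyn BlobService,
    chunk_stores: &[&dyn ChunkService],
    blob: &Digest,
) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for digest in blobs.chunks(blob)? {
        let chunk = chunk_stores.iter().find_map(|store| store.get(&digest))?;
        out.extend_from_slice(&chunk);
    }
    Some(out)
}

/// Usage: one blob whose two chunks live in two different chunk stores.
pub fn demo() -> Option<Vec<u8>> {
    let mut store_a = MemoryChunkService::default();
    let mut store_b = MemoryChunkService::default();
    let mut blobs = MemoryBlobService::default();

    let first = store_a.put(b"hello ".to_vec());
    let second = store_b.put(b"world".to_vec());
    let blob = Digest::of(b"hello world");
    blobs.put_chunks(blob, vec![first, second]);

    let stores: [&dyn ChunkService; 2] = [&store_a, &store_b];
    read_blob(&blobs, &stores, &blob)
}
```

In this shape, neither in-memory store implements any reader of its own. `read_blob` plays the role of the "global" reader and can pull chunks from any backend in the list, which is the chunk reuse the design idea is after.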