Commit graph

13 commits

Author SHA1 Message Date
sterni
b388354c4d refactor(3p/lisp/mime4cl): port remaining base64 decoding to qbase64
DECODE-BASE64-STREAM-TO-SEQUENCE is the only thing that requires
anything fancy: We read into an adjustable array. Alternative could be
using REDIRECT-STREAM and WITH-OUTPUT-TO-STRING, but that is likely
slower (untested).

Test cases are kept for now to confirm that qbase64 is conforming to our
expectations, but can probably dropped in favor of a few more sample
messages in the test suite.

:START and :END are sadly no longer supported and need to be replaced by
SUBSEQ.

Change-Id: I5928aed7551b0dea32ee09518ea6f604b40c2863
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8586
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
Autosubmit: sterni <sternenseemann@systemli.org>
2023-05-18 16:18:43 +00:00
sterni
02684f3ac6 refactor(3p/lisp/mime4cl): remove be and be*
Seems simple enough to use standard LET and a few parentheses more which
stock emacs can indent probably.

Change-Id: I0137a532186194f62f3a36f9bf05630af1afcdae
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8584
Reviewed-by: sterni <sternenseemann@systemli.org>
Autosubmit: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2023-05-18 16:18:42 +00:00
sterni
a06e30e73b refactor(sterni/mblog): move REDIRECT-STREAM into mime4cl
Eventually, we'll want to replace dump-stream-binary with something more
efficient—given that we have flexi-streams we can use something that
only does matching element types no problem. REDIRECT-STREAM is much
more efficient thanks to using an internal buffer.

streams.lisp gets a new section at the beginning for grouping utilities
that don't have any real (internal) dependencies.

Change-Id: I141cd36440d532131f389be2768fdaa54e7c7218
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8583
Reviewed-by: sterni <sternenseemann@systemli.org>
Autosubmit: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2023-05-18 16:16:39 +00:00
sterni
734cec2e3b refactor(3p/lisp/mime4cl): use qbase64 for decoding FILE-PORTIONs
Porting over the rest of the decoding (RFC2047) and especially encoding
over to qbase64 is still pending, as it is a little trickier.

Change-Id: Id4740eb074a387aeea2cb94b781e204248530799
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8582
Autosubmit: sterni <sternenseemann@systemli.org>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2023-05-18 16:16:39 +00:00
sterni
3d2e55ad53 refactor(mime4cl): replace *-input-adapter-stream with flexi-streams
The input adapter streams were input streams yielding either binary or
character data that could be constructed from a variable data source.
The stream would take care not to destroy the underlying data
source (i.e. not close it if it was a stream), so similar to with
FILE-PORTIONs, but simpler.

Unfortunately, the implementation was quite inefficient: They are
ultimately defined in terms of a function that retrieves the next
character in the source. This only allows for an implementation of
READ-CHAR (and READ-BYTE). Thanks to cl/8559, READ-SEQUENCE can be used
on e.g. FILE-PORTION, but this was still negated by a input adapter
based on one—then, READ-SEQUENCE would need to fall back on READ-CHAR or
READ-BYTE again.

Luckily, we can replace BINARY-INPUT-ADAPTER-STREAM and
CHARACTER-INPUT-ADAPTER-STREAM with a much simpler abstraction: Instead
of extra stream classes, we have a function, MAKE-INPUT-ADAPTER, which
returns an appropriate instance of FLEXI-STREAM based on a given source.
This way, the need for a distinction between binary and character input
adapter is eliminated, since FLEXI-STREAMS supports both binary and
character reads (external format is not yet handled, though).
Consequently, the :binary keyword argument to MIME-BODY-STREAM can be
dropped.

flexi-streams provides stream classes for everything except a stream
that doesn't close the underlying one. Since we have already implemented
this in POSITIONED-FLEXI-INPUT-STREAM, we can split this functionality
into a new superclass ADAPTER-FLEXI-INPUT-STREAM.

This change also allows addressing the performance regression
encountered in cl/8559: It seems that flexi-streams performs worse when
we are reading byte by byte or char by char. (After this change mblog is
still two times slower than on r/6150.) By eliminating the adapter
streams, we can start utilizing READ-SEQUENCE via decoding code that
supports it (i.e. qbase64) and bring performance on par with r/6150
again. Surely there are also ways to gain back even more performance
which has to be determined using profiling. Buffering more aggressively
seems like a sure bet, though.

Switching to flexi-streams still seems like a no-brainer, as it allows
us to drop a lot of code that was quite hacky (e.g. DELIMITED-INPUT-
STREAM) and implements en/decoding handling we did not support before,
but would need for improved correctness.

Change-Id: Ie2d1f4e42b47512a5660a1ccc0deeec2bff9788d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8581
Autosubmit: sterni <sternenseemann@systemli.org>
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2023-05-18 16:14:38 +00:00
sterni
b379e44dfb refactor(3p/lisp/mime4cl): use flexi-streams and binary input
This refactor is driven by the following (ultimate) aims:

- Get rid of as much of the custom stream code in mime4cl which makes
  less code to maintain in the future.

- Lay the groundwork for correct handling of 8bit transfer encoding:
  The mime4cl we inherited assumes that any MIME message can be decoded
  completely by the CL implementation (in SBCL's case using latin1)
  into CHARACTERs. This is not necessarily the case. flexi-streams
  allows changing how the stream is decoded on the fly and also has
  support for reading the underlying bytes which is perfect for the
  requirements decoding MIME has.

- Since flexi-streams uses trivial-gray-streams, it supports
  READ-SEQUENCE. Taking advantage of this may improve decoding
  performance significantly in the future.

This incurs the following changes:

- Naturally we now open given files as binary files in MIME-MESSAGE.
  Given strings are encoded using STRING-TO-OCTETS and then passed on
  to a new octet vector method. Instead of MY-STRING-INPUT-STREAM this
  now uses flexi-streams' WITH-INPUT-FROM-SEQUENCE.

- OPEN-FILE-PORTION and OPEN-DECODED-FILE-PORTION need to be merged,
  since the transfer encoding not only implies an extra decoder stream
  that needs to be attached after file portion stream, but also imply a
  certain encoding of the stream itself (mostly binary vs. ASCII).
  As flexi-streams can change their encoding on the fly this could be
  untangled again, but it is not strictly necessary.

  As before, we use the DATA slot of the file portion to create a fresh
  stream if possible. Instead of strings we now use an vector of octets
  to match MIME-MESSAGE.

  The actual portioned stream relies on POSITIONED-FLEXI-INPUT-STREAM, a
  subclass of the stock FLEXI-INPUT-STREAM class, described below.

- POSITIONED-FLEXI-INPUT-STREAM replaces DELIMITED-INPUT-STREAM. It is
  created using MAKE-POSITIONED-FLEXI-INPUT-STREAM which accepts the
  same arguments as MAKE-FLEXI-STREAMS and, additionally, :IGNORE-CLOSE.
  A POSITIONED-FLEXI-INPUT-STREAM works the same as an
  FLEXI-INPUT-STREAM, but upon creation, the underlying stream is
  rewinded or forwarded to the argument given by :POSITION using
  FILE-POSITION.

  If :IGNORE-CLOSE is T, a call to CLOSE is not forwarded to the
  underlying stream.

Change-Id: I2d48c769bb110ca0b7cf52441bd63c1e1c2ccd04
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8559
Reviewed-by: sterni <sternenseemann@systemli.org>
Autosubmit: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
2023-05-18 16:14:37 +00:00
sterni
7c6806cfa7 refactor(3p/lisp/mime4cl): rename :stream to :underlying-stream
This makes sure that initializing coder-stream-mixin (for the most part)
has the same interface as initializing qbase64:decode-stream. This will
make integrating that as a faster replacement to
mime4cl:base64-decoder-stream a bit easier.

The idea is to replace the char by char base64 decoder with one that
supports read-sequence. After that deliminited-input-stream needs to
gain support for read-sequence as well, so we can actually take
advantage of this fact. Finally, we'll have to evaluate the remaining
decoders and think about switching the (base64) encoders over as well.

Change-Id: If971da02437506e00a7c9fab2b94efc42725e62d
Reviewed-on: https://cl.tvl.fyi/c/depot/+/8555
Reviewed-by: sterni <sternenseemann@systemli.org>
Tested-by: BuildkiteCI
Autosubmit: sterni <sternenseemann@systemli.org>
2023-05-09 16:42:54 +00:00
sterni
c3cf66f248 feat(3p/lisp/mime4cl): cache offset in delimited-input-stream
By computing the amount the stream position advanced we can save a
syscall on every read which speeds up mime:mime-body-stream by /a lot/,
e.g. extracting a ~3MB attachment drops from over 15s to under ~0.5s.
There's still a lot to be gained and correctness left to be desired
which can be addressed as described in the newly added comment.

Change-Id: I5e1dfd213aac41203f271cf220db456dfb95a02b
Reviewed-on: https://cl.tvl.fyi/c/depot/+/5073
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2022-02-02 20:47:45 +00:00
sterni
9d0b4818ce chore(3p/lisp/mime4cl): remove CMUCL specific code
Having #+cmu all over the place suggests that we maintain CMUCL support
or test with CMUCL which is not the case.

Change-Id: Ia0828cb1ac48e49acdee6fef7a0fa2c04c1805b3
Reviewed-on: https://cl.tvl.fyi/c/depot/+/5068
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2022-01-26 17:43:54 +00:00
sterni
f83ef56141 refactor(3p/lisp/mime4cl): use trivial-gray-streams
This should be a net positive for portability and lets us drop some of
the CMUCL cruft (which we don't test anyway, CMU support may have
regressed regardless).

Change-Id: I85664d82d211177da1db9eebea65c956295b09f7
Reviewed-on: https://cl.tvl.fyi/c/depot/+/5067
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2022-01-26 17:43:54 +00:00
sterni
25cb0ad32f style(3p/lisp): expand tabs in npg, mime4cl and sclf
Done using

    find third_party/lisp/{sclf,mime4cl,npg} \
      -name '*.lisp' -or -name '*.asd' \
      -exec bash -c 'expand -i -t 8 "$0" | sponge "$0"' {} \;

Change-Id: If84afac9c1d5cbc74e137a5aa0ae61472f0f1e90
Reviewed-on: https://cl.tvl.fyi/c/depot/+/5066
Tested-by: BuildkiteCI
Reviewed-by: sterni <sternenseemann@systemli.org>
2022-01-26 17:43:54 +00:00
sterni
8f6955176f feat(3p/lisp/mime4cl): build using buildLisp
The following changes are required to make mime4cl build:

* file-position doesn't like to be called with NIL as the position
  argument, so we have to make sure to not do that in
  stream-file-position. My workaround is a bit clunky, but works.

* Tests discover the sample file via relative path resolution. This
  doesn't work when they are imported into the nix store as individual
  files. Instead we make use of the fact that DEFVAR is a no-op if the
  variable is already defined and inject a file via the nix build that
  sets the relevant ones. For the path to sample1.msg, we need to create
  a new variable.

Change-Id: I74eeda7bf2c2a4f64cc2b90e72081513ec3285d5
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3270
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
2021-09-01 22:57:17 +00:00
sterni
901364869c chore(3p/lisp): import mime4cl source tarball
Used http://wcp.sdf-eu.org/software/mime4cl-20150207T211851.tbz (sha256
5a914669bba7561efe59a4fd0817204c07ad2add98b03ae206ef185ac04affb3).
Importing seems sensible since there's no upstream repo nor has their
been a release since 2015.

This is just an import commit, so the changes made to make it build are
more discoverable as their own commit.

Change-Id: I2ff28c3c7433abdf7857204bc89eaf9edc0b1cbc
Reviewed-on: https://cl.tvl.fyi/c/depot/+/3378
Tested-by: BuildkiteCI
Reviewed-by: grfn <grfn@gws.fyi>
2021-09-01 22:57:17 +00:00