unweave: Unweave interleaved streams of text lines

When trying to debug unexpected behaviors in applications and libraries, I often need to understand long and complex logs. Such logs typically contain interleaved lines originating from multiple components and multiple threads and processes. In addition to the actual message content, part of the information is conveyed through the temporal relationship of the log entries.

However, such temporal relationships are sometimes hard to discern in the final interleaved output. A different view, in which each source stream of log entries is placed into its own column, can often provide more, or different, insights:

[Image: unweave example]

This separated, or unweaved, view has proven particularly useful to me in the development of the Wine Wayland driver, which is what triggered the development of the unweave tool. Since Wine logs can be very long and involve several interacting sources of information, efficiency and versatility were primary goals of unweave.

For a practical example of unweave's capabilities, consider a log file that contains info or error output from three threads, A, B, and C:

[info] A: 1
[info] A: 2
[info] B: 1
[error] A: 3
[info] B: 2
[error] C: 1

A natural transformed view of the log involves placing each thread in its own column. This is straightforward with unweave:

$ unweave 'A|B|C' input
[info] A: 1
[info] A: 2
            [info] B: 1
[error] A: 3
            [info] B: 2
                       [error] C: 1
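
To make the transformation concrete, here is a minimal Python sketch of the idea. It is not how unweave itself is implemented; the function name, the choice to skip lines that match nothing, and the rule that each column is as wide as its longest line are all assumptions made for illustration.

```python
import re

def unweave(lines, pattern):
    """Place each stream's lines in its own column (illustrative sketch only)."""
    regex = re.compile(pattern)
    columns = {}   # stream tag -> column index, in order of first appearance
    placed = []    # (column index, line) pairs, in input order
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines that belong to no stream are skipped
        col = columns.setdefault(match.group(0), len(columns))
        placed.append((col, line))
    # Assumption: each column is as wide as the longest line placed in it.
    widths = [0] * len(columns)
    for col, line in placed:
        widths[col] = max(widths[col], len(line))
    # Indent each line by the total width of the columns to its left.
    return [" " * sum(widths[:col]) + line for col, line in placed]

log = [
    "[info] A: 1",
    "[info] A: 2",
    "[info] B: 1",
    "[error] A: 3",
    "[info] B: 2",
    "[error] C: 1",
]
result = unweave(log, "A|B|C")
print("\n".join(result))
```

For the sample log, the sketch produces the same column layout as the listing above. Note that sizing columns from their content requires reading all the input before printing anything.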

Typically, the output format is more complex and the relevant values are not known in advance. For example, you may not know the exact thread names, only that they appear in the log in the form $TID: where $TID is an uppercase alphabetic string. The pattern provided to unweave is a fully operational battle station regular expression, so it can handle all this too, and the -c option makes the columns wider:

$ unweave -c 15 ' ([A-Z]+): ' input
[info] A: 1
[info] A: 2
               [info] B: 1
[error] A: 3
               [info] B: 2
                              [error] C: 1
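
The following Python sketch illustrates one way a pattern with a capture group could select the stream tag, combined with a fixed column width. The helper names are hypothetical, and both the rule "use the first capture group if there is one, otherwise the whole match" and the treatment of the width as fixed are assumptions rather than documented unweave behavior.

```python
import re

def stream_tag(regex, line):
    """Return the tag identifying a line's stream, or None if nothing matches."""
    match = regex.search(line)
    if match is None:
        return None
    # Assumption: prefer the first capture group, else use the whole match.
    return match.group(1) if regex.groups else match.group(0)

def unweave_fixed(lines, pattern, width):
    """Indent each line by (column index * width); illustrative sketch only."""
    regex = re.compile(pattern)
    columns = {}  # stream tag -> column index, in order of first appearance
    out = []
    for line in lines:
        tag = stream_tag(regex, line)
        if tag is None:
            continue  # assumption: lines that belong to no stream are skipped
        col = columns.setdefault(tag, len(columns))
        out.append(" " * (col * width) + line)
    return out

log = [
    "[info] A: 1",
    "[info] A: 2",
    "[info] B: 1",
    "[error] A: 3",
    "[info] B: 2",
    "[error] C: 1",
]
result = unweave_fixed(log, r" ([A-Z]+): ", 15)
print("\n".join(result))
```

With a fixed width, each line can be emitted as soon as it is read, since its indentation depends only on the column index and not on the lines that follow.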

Perhaps it would be interesting to get a different view, with columns based on log types instead of threads, and also use a more distinctive column separator:

$ unweave -s ' | ' '^\[\w+\]' input
[info] A: 1 |
[info] A: 2 |
[info] B: 1 |
            | [error] A: 3
[info] B: 2 |
            | [error] C: 1
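
As a rough illustration of how a separator might fit into the layout, the Python sketch below pads every column to its width, joins the cells with the separator, and trims trailing whitespace. The function name and these rendering rules are assumptions chosen to mimic the listing above, not a description of unweave's internals.

```python
import re

def unweave_sep(lines, pattern, sep):
    """Render one row per line, with cells joined by sep (illustrative sketch)."""
    regex = re.compile(pattern)
    columns = {}  # stream tag -> column index, in order of first appearance
    placed = []   # (column index, line) pairs, in input order
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines that belong to no stream are skipped
        col = columns.setdefault(match.group(0), len(columns))
        placed.append((col, line))
    widths = [0] * len(columns)
    for col, line in placed:
        widths[col] = max(widths[col], len(line))
    rows = []
    for col, line in placed:
        # Only the line's own column has content; the others are blank padding.
        cells = [(line if i == col else "").ljust(w) for i, w in enumerate(widths)]
        # Assumption: trailing whitespace is trimmed from each row.
        rows.append(sep.join(cells).rstrip())
    return rows

log = [
    "[info] A: 1",
    "[info] A: 2",
    "[info] B: 1",
    "[error] A: 3",
    "[info] B: 2",
    "[error] C: 1",
]
rows = unweave_sep(log, r"^\[\w+\]", " | ")
print("\n".join(rows))
```

Here the tag is the bracketed log type at the start of each line, so the sample log yields two columns: one for [info] and one for [error].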

Although separating streams into columns is the primary mode of the unweave tool, it can sometimes be useful to use separate files instead, with each output filename containing the stream tag and the stream id:

$ unweave --mode=files -o 'stream-%t-%2d' 'A|B|C' input
$ tail -n +1 stream-*
==> stream-A-00 <==
[info] A: 1
[info] A: 2
[error] A: 3

==> stream-B-01 <==
[info] B: 1
[info] B: 2

==> stream-C-02 <==
[error] C: 1
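
The same grouping can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the Python format fields `{tag}` and `{id:02d}` stand in for the `%t` and `%2d` placeholders of the command above; they are not unweave syntax.

```python
import re
import tempfile
from pathlib import Path

def unweave_to_files(lines, pattern, directory, name_format="stream-{tag}-{id:02d}"):
    """Write each stream's lines to its own file (illustrative sketch only)."""
    regex = re.compile(pattern)
    ids = {}      # stream tag -> stream id, in order of first appearance
    streams = {}  # output file name -> lines belonging to that stream
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines that belong to no stream are skipped
        tag = match.group(0)
        stream_id = ids.setdefault(tag, len(ids))
        name = name_format.format(tag=tag, id=stream_id)
        streams.setdefault(name, []).append(line)
    for name, content in streams.items():
        Path(directory, name).write_text("\n".join(content) + "\n")
    return sorted(streams)

log = [
    "[info] A: 1",
    "[info] A: 2",
    "[info] B: 1",
    "[error] A: 3",
    "[info] B: 2",
    "[error] C: 1",
]
with tempfile.TemporaryDirectory() as tmp:
    names = unweave_to_files(log, "A|B|C", tmp)
    # Read the files back so the result can be inspected after cleanup.
    written = {p.name: p.read_text() for p in sorted(Path(tmp).iterdir())}
print(names)
```

Each stream id is assigned in order of first appearance, which is why the file for thread B carries id 01 and the one for thread C carries id 02.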

Visit the unweave repository or view the manpage to find out more!