unweave: Unweave interleaved streams of text lines
December 4, 2022
When trying to debug unexpected behaviors in applications and libraries, I often need to understand long and complex logs. Such logs typically contain interleaved lines originating from multiple components and multiple threads and processes. In addition to the actual message content, part of the information is conveyed through the temporal relationship of the log entries.
However, such temporal relationships are sometimes hard to discern in the final interleaved output. A different view, in which each source stream of log entries is placed into its own column, can often provide more, or different, insights.
This separated, or unweaved, view has proven to be particularly useful for me in the development of the Wine Wayland driver, which is what triggered the development of the unweave tool. Since Wine logs can be very long and involve several interacting sources of information, efficiency and versatility were primary goals of unweave.
For a practical example of unweave's capabilities, consider a log file that contains info or error output from three threads, A, B and C:
[info] A: 1
[info] A: 2
[info] B: 1
[error] A: 3
[info] B: 2
[error] C: 1
A natural transformed view of the log involves placing each thread in its own column. This is straightforward with unweave:
$ unweave 'A|B|C' input
[info] A: 1
[info] A: 2
             [info] B: 1
[error] A: 3
             [info] B: 2
                          [error] C: 1
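To make the transformation concrete, here is a minimal Python sketch of the idea, not unweave's actual implementation. The function name `unweave_columns`, the column width of 12, the single-space separator, and the choice to skip lines that do not match are all assumptions made for illustration.

```python
import re

def unweave_columns(lines, pattern, width=12, sep=" "):
    """Place each line in the column of the stream it belongs to."""
    regex = re.compile(pattern)
    columns = {}  # stream tag -> column index, in order of first appearance
    out = []
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines without a stream tag are skipped
        tag = match.group(0)
        index = columns.setdefault(tag, len(columns))
        # Indent with one empty, padded cell per preceding column.
        out.append((" " * width + sep) * index + line)
    return out

log = [
    "[info] A: 1",
    "[info] A: 2",
    "[info] B: 1",
    "[error] A: 3",
    "[info] B: 2",
    "[error] C: 1",
]

for row in unweave_columns(log, "A|B|C"):
    print(row)
```

Columns are assigned in order of first appearance, so a stream's position depends only on when its first line shows up in the log.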
Typically, the log format is more complex and the relevant values are not known in advance. For example, you may not know the exact thread names, only that they appear in the log in the form of $TID: where $TID is an uppercase alphabetic string. The pattern provided to unweave is a fully operational regular expression, so it can handle all this too, along with making the columns wider:
$ unweave -c 15 ' ([A-Z]+): ' input
[info] A: 1
[info] A: 2
                [info] B: 1
[error] A: 3
                [info] B: 2
                                [error] C: 1
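The pattern above matches the surrounding spaces and the colon, while the parentheses isolate the thread name. A small Python sketch of how a stream tag could be derived from such a pattern follows. It assumes that a capture group, when present, is what identifies the stream, and that the whole match is used otherwise; the helper name `stream_tag` is hypothetical and this is not taken from unweave's source.

```python
import re

def stream_tag(regex, line):
    """Return the stream tag for a line, or None if the pattern does not match."""
    match = regex.search(line)
    if match is None:
        return None
    # Assumption: a capture group, when present, narrows the tag to that group.
    return match.group(1) if regex.groups else match.group(0)

thread_pattern = re.compile(r" ([A-Z]+): ")
lines = ["[info] A: 1", "[error] C: 1", "[info] B: 2"]

print([stream_tag(thread_pattern, line) for line in lines])
```

With this rule, the same helper also covers patterns without a group, such as the plain alternation used in the first example.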
Perhaps it would be interesting to get a different view, with columns based on log types instead of threads, and also use a more distinctive column separator:
$ unweave -s ' | ' '^\[\w+\]' input
[info] A: 1  |
[info] A: 2  |
[info] B: 1  |
             | [error] A: 3
[info] B: 2  |
             | [error] C: 1
Although separating streams into columns is the primary mode of the unweave tool, it can sometimes be useful to use separate files instead, with each output filename containing the stream tag and the stream id:
$ unweave --mode=files -o 'stream-%t-%2d' 'A|B|C' input
$ tail -n +1 stream-*
==> stream-A-00 <==
[info] A: 1
[info] A: 2
[error] A: 3
==> stream-B-01 <==
[info] B: 1
[info] B: 2
==> stream-C-02 <==
[error] C: 1
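The naming scheme in this example can be sketched in Python as follows. The sketch infers from the output above that %t expands to the stream tag and %2d to a zero-padded, two-digit stream id assigned in order of first appearance; those semantics, the function name `split_streams`, and the choice to keep the result in memory rather than write files are assumptions for illustration.

```python
import re

def split_streams(lines, pattern, template="stream-%t-%2d"):
    """Group lines by stream and name each group after the template."""
    regex = re.compile(pattern)
    names = {}  # stream tag -> output filename
    files = {}  # output filename -> lines belonging to that stream
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines without a stream tag are skipped
        tag = match.group(0)
        if tag not in names:
            stream_id = len(names)  # ids follow order of first appearance
            name = template.replace("%t", tag)
            names[tag] = name.replace("%2d", f"{stream_id:02d}")
        files.setdefault(names[tag], []).append(line)
    return files

log = [
    "[info] A: 1",
    "[info] A: 2",
    "[info] B: 1",
    "[error] A: 3",
    "[info] B: 2",
    "[error] C: 1",
]

for name, content in split_streams(log, "A|B|C").items():
    print(f"==> {name} <==")
    print("\n".join(content))
```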
Visit the unweave repository or view the manpage to find out more!