Commit 26c9d0b

spec: clarify base64 encoding, add reserved User namespace
1 parent d210ba2 commit 26c9d0b

1 file changed: +15 -20 lines changed

doc/guano_specification.md

Lines changed: 15 additions & 20 deletions
@@ -46,15 +46,6 @@ stated so.
 Definitions and Common Data Conventions
 ---------------------------------------
 
-All GUANO metadata must be persisted in big-endian format; multi-byte values
-are to be written such that the most significant byte has the lowest address
-and the least significant byte has the highest address. This is because files
-are written once, but read many times; by standardizing on an endianness we
-ease the burden on subsequent processing and analysis, regardless of hardware
-platform used for recording. This has no bearing on whether recorders choose
-to write little- or big-endian .WAV data, as specified in the .WAV (RIFF) file
-header; the GUANO metadata itself must be written big-endian.
-
 All GUANO metadata must be persisted as UTF-8 Unicode string. This is a multi-
 byte encoding which uses just a single byte for all "ASCII" data, but a
 variable number of bytes for encoding "special" characters.
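
As a quick illustration of the UTF-8 paragraph retained by this hunk, the Python sketch below shows that ASCII text occupies one byte per character while other characters need more. The helper name `utf8_length` and the sample strings are made up for illustration; they are not part of the specification.

```python
# Illustrative only: the helper and sample values below are hypothetical.
def utf8_length(text: str) -> int:
    """Return the number of bytes needed to store `text` as UTF-8."""
    return len(text.encode("utf-8"))

ascii_value = "Myotis lucifugus"      # 16 characters, all ASCII
accented_value = "Chiropt\u00e8re"    # 10 characters, one of them non-ASCII

print(utf8_length(ascii_value))       # 16: one byte per ASCII character
print(utf8_length(accented_value))    # 11: the accented letter takes two bytes
```

Because the encoding is byte-oriented, no byte-order choice is involved, which is consistent with the removal of the big-endian paragraph above.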
@@ -73,13 +64,9 @@ string "\n" as a newline. At this time, this specification makes no attempt
 to define an escape for encoding the literal string "\n" with a meaning apart
 from "newline".
 
-Binary field values should be encoded as Base64. However, Base64 enforces
-a maximum line length, and the GUANO metadata format thus far delimits fields
-by newline. Enforcing a short line length for potentially-large binary values
-would ease the development of reading implementations which must allocate
-memory to read in lines. What is the best way to support these multi-line,
-potentially large (perhaps megabytes in size for an embedded voice note, for
-example) binary values?
+Binary field values should be encoded as Base64 strings as defined in
+[RFC 4648](https://www.ietf.org/rfc/rfc4648.txt). Newlines may not be inserted
+into the data, and the "Base 64 Alphabet" must be used.
 
 Extra whitespace may be used when formatting field names and values; whitespace
 should be trimmed upon reading. This gives writing implementations freedom to
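
The Base64 rule added in this hunk (standard RFC 4648 alphabet, no embedded newlines) could be honored roughly as follows. This is a minimal sketch using Python's standard `base64` module; the function names `encode_binary_value` and `decode_binary_value` and the sample payload are assumptions made for illustration, not anything defined by the specification.

```python
import base64
import binascii

def encode_binary_value(data: bytes) -> str:
    """Encode bytes as a single-line Base64 string using the standard alphabet."""
    # b64encode uses the standard alphabet ("+" and "/") and never inserts
    # newlines, unlike base64.encodebytes, which wraps its output.
    return base64.b64encode(data).decode("ascii")

def decode_binary_value(text: str) -> bytes:
    """Decode a Base64 field value, rejecting characters outside the alphabet."""
    # Surrounding whitespace is trimmed first, since the spec allows it around values.
    # validate=True makes embedded newlines or URL-safe characters raise binascii.Error.
    return base64.b64decode(text.strip(), validate=True)

payload = bytes(range(256)) * 4          # hypothetical 1 KiB binary value
encoded = encode_binary_value(payload)
assert "\n" not in encoded               # stays on one line regardless of size
assert decode_binary_value("  " + encoded + "  ") == payload
```

Strict decoding is a design choice in this sketch; a more lenient reader might instead strip stray line breaks before decoding.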
@@ -196,6 +183,12 @@ this list so that it isn't accidentally used by another manufacturer.
 This reserved namespace is for meta-metadata pertaining specifically to the
 GUANO metadata in use.
 
+**User**
+Reserved namespace for user-defined fields.
+
+**Anabat**
+Titley Scientific
+
 **BAT**
 Binary Acoustic Technologies
 
@@ -208,9 +201,6 @@ this list so that it isn't accidentally used by another manufacturer.
 **SB**
 SonoBat
 
-**Anabat**
-Titley Scientific
-
 **WAC**
 Wildlife Acoustics
 
@@ -325,6 +315,9 @@ fields in a compliant GUANO file.
 Specification History
 ---------------------
 
+2016-03-02 | 0.0.3 | Clarified Base64 encoding of binary data. Added `User` namespace. Removed
+mention of UTF-8 endianness.
+
 2016-01-30 | 0.0.2 | Added well-known fields: Hardware Version, Firmware Version, Temperature, Humidity.
 Clarified Loc Position description.
 
@@ -334,4 +327,6 @@ Specification History
 Notes
 -----
 
-* The use of manufacturer or product names in this specification does not imply endorsement, support, or any other association by those manufacturers or products; nor does it imply compliance with the GUANO specification.
+* The use of manufacturer or product names in this specification does not imply endorsement,
+support, or any other association by those manufacturers or products; nor does it imply compliance
+with the GUANO specification.
