Solra Bizna

Members
  • Content Count

    90
  • Joined

  • Last visited

  • Days Won

    8

Solra Bizna last won the day on May 16 2016

Solra Bizna had the most liked content!

1 Follower

About Solra Bizna

  • Rank
    Junior Member
  • Birthday 04/28/1989

Contact Methods

  • AIM
    solrabizna
  • Jabber
    sbizna@tejat.net
  • Skype
    solrabizna
  • Minecraft
    SolraBizna
  • GitHub
    SolraBizna

Profile Information

  • Gender
    Male

Recent Profile Visitors

872 profile views
  1. I edited it to remove the 4-byte padding from the binary boot records. My reasoning is that, on IO port heavy 8-bit architectures (such as OCMOS), tracking the current position and skipping a specified number of bytes is clumsier than just reading until the NUL; whereas on alignment-sensitive architectures, it's not terribly difficult to do an unaligned read "by hand".
  2. Cross-Architecture Booting
     Universal Interchange Format
     OpenComputers Ethernet (reserved by myself)
     OCranet family of protocols: OCR (OCranet Relay)
     OCranet family of protocols: NNR (Network-to-Network Routing)
     Global Empire Routing Technology
     ON2 - Simple L2 protocol for network stacks
     Allocated Network Card Port Numbers (reserved by MajGenRelativity)

     If you'd like to reserve an OETF document number, contact me, as I appear to have become the central authority for them. Here are the ways in which I can be contacted, in descending order of how quickly I will see your request:

       • SolraBizna on IRC (irc.tejat.net, irc.esper.net)
       • solra@bizna.name via email
       • Private message on these forums

     If you catch me at a good time, I'll have a few things to suggest on the topic of your document. If you catch me at a bad time, I'll probably just reserve you a number without question.
  3. They should be atomic, because they do not contain other Values. (If I have understood the question correctly.)
  4. THIS IS A DRAFT. It may change before becoming "official". Please feel free to suggest breaking changes.

     Abstract

     This document provides a binary interchange format, intended primarily to support generic component IO.

     Rationale

     OpenComputers' component bus is designed for high-level languages. It sends and receives groups of dynamically typed values. It is intended to be user-friendly and self-discoverable, and it has largely achieved this goal. However, with low-level architectures, there is no obvious, straightforward way to represent these values. This document aims to provide a standard representation, freeing individual architects from having to devise their own representations, and minimizing unnecessary differences between architectures.

     Every value that can be sent over an OpenComputers bus can be represented as described in this document, and (barring length restrictions) vice versa.

     Conventions

     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

     All signed integers are two's-complement.

     Concepts

       • Tag: Gives the type of a subsequent Value.
       • Value: Data whose structure and meaning depend on its type.
       • Tagged Value: A Tag, followed by a Value of the indicated type.
       • Producer: A program or process that generates data in this format.
       • Consumer: A program or process that consumes data in this format.
       • Packed mode: A representation designed to occupy very little space.
       • Unpacked mode: A representation designed to be easy to manipulate on 32-bit architectures.

     Tags

     A type tag denotes the type of a subsequent Value. In Packed mode, the tag is a 16-bit signed integer. In Unpacked mode, it is a 32-bit signed integer, aligned to a 4-byte boundary.

     Values

     String (UIFTAG_STRING = 0x0000–0x3FFF = 0–16383)

     A UTF-8 code sequence. The tag provides the length, in bytes, of the sequence.

     Producers MUST NOT generate invalid code sequences, including "modified UTF-8" conventions such as non-zero NUL and UTF-8 encoded surrogate pairs. Producers MUST NOT arbitrarily prefix strings with a spurious U+FEFF BYTE ORDER MARK.

     Consumer handling of invalid code sequences is undefined. If a Consumer encounters a String where a Byte Array is expected, the Consumer MAY incur a round-trip conversion to its native string type. This may mean that the bytes the Consumer actually sees differ from the original bytes where invalid code sequences occur.

     Consumers MUST handle NUL bytes in a String in an appropriate manner. Consumers MUST NOT assume that Strings are NUL-terminated; they are not.

     In Unpacked mode, additional zero bytes MUST be added to the end of the String, so that a subsequent Tag will be aligned to a 4-byte boundary.

     Byte Array (UIFTAG_BYTE_ARRAY = 0x4000–0x7FFF = 16384–32767)

     An arbitrary sequence of bytes. The tag, minus 16384, provides the length in bytes of the sequence.

     If a Consumer encounters a Byte Array where a String is expected, the Consumer MUST interpret the Byte Array as if it were a String of the given length.

     In Unpacked mode, additional zero bytes MUST be added to the end of the Byte Array, so that a subsequent Tag will be aligned to a 4-byte boundary.

     End (UIFTAG_END = 0x...FFFF = -1)

     A special tag signifying the end of an Array or Compound.

     Null (UIFTAG_NULL = 0x...FFFE = -2)

     Absence of a value. Equivalent to null and nil in various programming languages.

     (Note: there is no tag -3.)

     Double (UIFTAG_DOUBLE = 0x...FFFC = -4)

     A 64-bit IEEE 754 floating point value.

     Consumers that encounter a Double where an Integer is expected MAY fail. Producers that are producing a Double which has an exact Integer representation SHOULD produce that Integer instead.

     Integer (UIFTAG_INTEGER = 0x...FFFB = -5)

     A 32-bit signed integer.

     Consumers that encounter an Integer where a Double is expected MUST convert the Integer to a Double.

     Array (UIFTAG_ARRAY = 0x...FFFA = -6)

     A series of Tagged Values, in a particular order, terminated by an End.

     Compound (UIFTAG_COMPOUND = 0x...FFF9 = -7)

     A series of pairs of Tagged Values. The order of the pairs is not significant. Each pair consists of a Key and a Value, in that order. A Key may be any type except a Byte Array, a Null, an Array, or a Compound. A Value may be any type. The list is terminated by an End.

     If a Consumer encounters an End as the second element of a pair, the result is undefined.

     UUID (UIFTAG_UUID = 0x...FFF8 = -8)

     A 128-bit RFC 4122 UUID. Regardless of endianness, the bytes are in display order.

     Consumers that encounter a UUID where a String is expected MUST convert the UUID to its canonical string representation, in lowercase.

     Producers and Consumers alike should take note that a random sequence of bytes is not necessarily a valid UUID.

     True (UIFTAG_TRUE = 0x...FFF7 = -9)

     A boolean true value.

     False (UIFTAG_FALSE = 0x...FFF6 = -10)

     A boolean false value.

     TODO

     This document is incomplete. Still to be written: recommendations on endianness and packing, useful common optimizations.
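To make the tag layout in the draft above concrete, here is a minimal Python sketch of a Packed-mode Consumer. It is not a reference implementation. The function name `read_tagged_value`, the `END` sentinel, and the choice of big-endian byte order are assumptions for illustration only; the draft explicitly leaves endianness as a TODO. Truncated input is not checked.

```python
import struct
import uuid

# Tag values copied from the draft.
UIFTAG_END, UIFTAG_NULL, UIFTAG_DOUBLE, UIFTAG_INTEGER = -1, -2, -4, -5
UIFTAG_ARRAY, UIFTAG_COMPOUND, UIFTAG_UUID = -6, -7, -8
UIFTAG_TRUE, UIFTAG_FALSE = -9, -10

END = object()  # hypothetical sentinel returned when an End tag is read


def read_tagged_value(buf, pos=0):
    """Decode one Packed-mode Tagged Value and return (value, next_pos).

    ASSUMPTION: big-endian byte order (the draft has not decided this yet).
    """
    (tag,) = struct.unpack_from(">h", buf, pos)  # 16-bit signed tag
    pos += 2
    if 0 <= tag <= 0x3FFF:  # String: the tag itself is the byte length
        return bytes(buf[pos:pos + tag]).decode("utf-8"), pos + tag
    if tag >= 0x4000:  # Byte Array: the length is the tag minus 16384
        n = tag - 0x4000
        return bytes(buf[pos:pos + n]), pos + n
    if tag == UIFTAG_END:
        return END, pos
    if tag == UIFTAG_NULL:
        return None, pos
    if tag == UIFTAG_DOUBLE:
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if tag == UIFTAG_INTEGER:
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if tag == UIFTAG_UUID:  # 16 bytes, always in display order
        return uuid.UUID(bytes=bytes(buf[pos:pos + 16])), pos + 16
    if tag == UIFTAG_TRUE:
        return True, pos
    if tag == UIFTAG_FALSE:
        return False, pos
    if tag == UIFTAG_ARRAY:
        items = []
        while True:
            item, pos = read_tagged_value(buf, pos)
            if item is END:
                return items, pos
            items.append(item)
    if tag == UIFTAG_COMPOUND:
        pairs = {}
        while True:
            key, pos = read_tagged_value(buf, pos)
            if key is END:
                return pairs, pos
            value, pos = read_tagged_value(buf, pos)
            if value is END:  # undefined by the draft; this sketch rejects it
                raise ValueError("End tag in the value position of a pair")
            pairs[key] = value
    raise ValueError(f"unknown tag {tag}")
```

For example, under the big-endian assumption, the bytes `FFFA 0002 6869 FFFB 0000002A FFFF` decode to the list `["hi", 42]`.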
  5. THIS IS A DRAFT. It may change before becoming "official". Please feel free to suggest breaking changes.

     Abstract

     This document provides guidelines for dealing with EEPROMs and locating architecture-specific boot code.

     Rationale

     OpenComputers supports a wide variety of architectures. Even more so than in the real world, OpenComputers architectures can differ dramatically from one another. Some architectures run programs in a particular high-level language directly, while others simulate real or fictitious low-level ISAs. Some architectures natively deal with data in 8-bit units, while others have built-in advanced string handling and vector processing capabilities.

     In contrast with this variety, OpenComputers components have a standard interface. An EEPROM containing boot code can easily end up being used on the "wrong" architecture, to say nothing of boot disks.

     This standard aims to solve that problem, by providing architecture-aware guidelines for dealing with EEPROMs, and procedures for locating boot code on filesystem-based and sector-based boot media. It aims to be simple to implement on the widest possible variety of architectures.

     Conventions

     Unless otherwise specified, all references to text imply 7-bit ASCII codes. Behavior on encountering bytes with the high-bit set is undefined.

     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

     Architecture Identifier

     Each architecture must provide a unique, preferably meaningful identifier which is specific to that architecture. This is the Architecture Identifier, or AID. The AID SHOULD be the same as the Architecture.Name annotation for the architecture, which is usually the same as the name used in the tooltip of the CPU.

     An AID MUST contain no bytes other than the following characters: digits, upper and lowercase letters, periods (.), dashes (-), underscores (_), slashes (/), spaces. In addition, an AID MUST NOT begin with a space, end with a space, or contain a run of two or more spaces.

     An AID SHOULD begin with a capital letter, if for no other reason than to make boot code easier to tell apart from other files and directories in listings.

     A new AID SHOULD NOT contain spaces, unless it is required for compatibility.

     An architecture with multiple variants that are mutually incompatible SHOULD use different AIDs for each variant. Fallback schemes, such as one where multiple AIDs are tried, are architecture-specific and outside the scope of this document.

     CAB-aware EEPROM images ("CABE images")

     This section applies to architectures, EEPROM flashing utilities, and any hardcoded boot code (analogous to machine.lua in the "standard" Lua architecture).

     A CAB-aware EEPROM image (a "CABE image") MUST begin with "--[", followed by zero or more "=", followed by "[CABE:". This is the prefix string. A CAB-compliant architecture MUST be prepared to deal with any number of "=" between 0 and 7, and a CABE image SHOULD NOT use more than 7.

     The prefix string is followed by a single AID. This denotes the "intended architecture" for this CAB image. This is followed by one of two things:

       • A colon (:), in which case the "main body" consists of the bytes following the colon, up until a valid suffix string is encountered.
       • A suffix string, in which case the "main body" consists of the entire remainder of the image, and MUST be valid Lua 5.2 AND Lua 5.3 code.

     A "suffix string" consists of an ASCII "]", followed by the same number of "=" as were in the prefix string, followed by another "]". In order to be a valid suffix string, it MUST contain precisely the same number of "=" as the prefix string contains. In particular, an otherwise valid suffix string that contains more "=" than the prefix string MUST be treated exactly the same as any other non-suffix-string byte sequence.

     The "main body" contains the actual EEPROM data for that architecture. The interpretation of this data is architecture-defined.

     If a CABE image contains data after a valid suffix string, that data MUST be Lua code which is cross-compatible between the Lua 5.2 and 5.3 architectures provided by OpenComputers. If this is not the intended architecture of the CABE image, this code SHOULD consist solely of an informative error call.

     Any deviation whatsoever from this standard results in a non-CABE image. All handling of non-CABE images is architecture-specific, but a non-CABE image SHOULD be treated however the "main body" of a valid CABE image would be treated.

     The preferred file extension for CABE images is ".cabe". The preferred MIME type is "application/x-cabe-image", though "application/octet-stream" is acceptable. CABE images MUST NOT be distributed using any "text/*" MIME type, as doing so will almost certainly corrupt binary CABE images.

     Here is an example CABE image targeting a fictional architecture:

       --[[CABE:HyperTalk:
       ask "What is your name?"
       answer "Hello," && it & "!"
       ]] error"HyperTalk architecture required"

     And here is one targeting a built-in Lua architecture:

       --[[CABE:Lua 5.2]]
       for n=1,5 do
         computer.beep(2000, 0.1)
       end

     EEPROM-based transparent architecture switching

     A future OpenComputers release may add support for transparent architecture switching, through additional NBT data for EEPROMs. It is expected that this will consist of a single AID identifying the architecture the EEPROM is designed for. This section applies in the event of such support becoming available.

     To provide consistency to users, architectures SHOULD NOT attempt to implement any form of automatic architecture switching themselves.

     An EEPROM flashing utility SHOULD attempt to parse all images as CABE images. If successful, it SHOULD tag the EEPROM appropriately and burn only the "main body". Otherwise, it SHOULD remove any existing architecture tag from the EEPROM and burn the entire image.

     An architecture booting an EEPROM with a valid architecture tag SHOULD NOT also attempt to parse it as a CABE image. An architecture booting an EEPROM with no valid architecture tag SHOULD attempt to parse it as a CABE image, and MUST NOT affix an architecture tag itself.

     Boot code

     This section applies to EEPROMs intended as first-stage bootloaders, as well as programs intended to be booted by such EEPROMs.

     Bootloaders SHOULD, if the boot EEPROM contains a boot device UUID, attempt to boot from that device first. A boot EEPROM contains a boot device UUID if eeprom.getData() consists entirely of an ASCII UUID, or if it begins with an ASCII UUID followed by a null byte.

     Managed mode (filesystems)

     Bootloader behavior on managed filesystems:

       • If "/<AID>" exists and is a directory, boot "/<AID>/boot".
       • If "/<AID>" exists and is a file, boot "/<AID>".
       • Any further cases are architecture-specific.

     If one of the above conditions is met, but its booting attempt fails, the booting process MUST NOT continue automatically. For instance, if "/<AID>" is a directory but booting "/<AID>/boot" fails, the bootloader MUST either fail with an error, or prompt a user for further action.

     Exactly what "booting" entails is architecture-specific. On a Lua architecture, it consists of loading a file as Lua code and then executing it. On low-level architectures, it might consist of loading a file's contents to a fixed RAM address and jumping into it.

     Architectures SHOULD provide a standard way for the first-stage bootloader to tell the booted code the UUID of the filesystem it was loaded from.

     Example: Consider a boot on the OC-ARM architecture. The bootloader checks if "/OC-ARM" exists. It does exist, and is a directory. The bootloader then attempts to boot "/OC-ARM/boot". It fails, because "/OC-ARM/boot" is not valid. It crashes the machine with an error message explaining the problem.

     Unmanaged mode (drives)

     A CAB-compliant bootable disk begins with a boot sector. This boot sector MUST be the first or second sector of the drive. If both the first and second sectors contain a valid boot sector, only the first one will be used.

     A boot sector begins with the ASCII string "CAB", followed by zero or more text boot records. This list of text records is terminated with an exclamation mark. If this exclamation mark is followed by the particular byte sequence {0x00, 0x1A, 0xCA, 0xBD} (null byte, CP/M end-of-file marker, two-byte magic number), then it is followed by zero or more binary boot records, terminated by a null byte. Boot records MUST NOT extend past the end of the boot sector.

     Architectures MAY specify that boot records for that architecture must be text or must be binary, and MAY specify that binary boot records must be a particular endianness and/or must be sector-aligned. Bootloaders for architectures that do not specify that boot records must be text or must be binary MUST support both.

     A text record matches ":<AID>=<offset>+<length>". <AID> is the AID for which the code is intended. <offset> is a decimal number, giving the byte offset at which to begin reading, OR "s" followed by a decimal number, giving the sector number at which to begin reading. <length> is a decimal number giving the number of bytes to read.

     A binary record is described by the following C99 structure:

       struct {
         uint8_t record_length;
         uint8_t flags;
         uint16_t load_start;
         uint32_t load_length;
         char aid[];
       };

     record_length is the number that must be added to the offset of this record to skip it. It MUST be equal to 8 + AID length + 1.

     flags is a bitfield. The following flags are defined:

       • 0x40: If set, load_start is a sector number. If clear, load_start is a byte offset.
       • 0x80: If set, load_start and load_length are little-endian. If clear, load_start and load_length are in network byte order (big-endian).

     load_start is either a sector number or a byte offset to begin the loading process at.

     load_length is the number of bytes to read.

     aid is the AID of the intended architecture, and MUST be null-terminated.

     As with managed mode, exactly what is done with the loaded data is architecture-specific.

     Bootloaders that only support binary records should consider a sector to be a valid boot sector if it begins with "CAB", and locate the end of the text boot records without parsing them by searching for the first "!". Bootloaders that only support text records need not consider any bytes past the first null byte.

     A boot sector that contains no records is valid, and MUST prevent any attempt to read possible subsequent boot sectors.

     Example 1:

       CAB:Lua 5.2=s3+17:Lua 5.3=s3+17:HyperTalk=384+5100!
       (followed directly by binary data:)
       001A CABD (valid binary records follow)
       0F (the length of the first record)
       C0 (little-endian, sector offset)
       0900 (start at sector number 9)
       0000 0100 (load 65536 bytes)
       5342 3635 3032 00 (null-terminated string "SB6502")
       00 (no more binary records)

     This drive contains valid boot code for Lua 5.2, Lua 5.3, HyperTalk, and SB6502. Lua 5.2 and 5.3 both use the same boot code, which is 17 bytes long and starts at the beginning of sector number 3. The HyperTalk boot code is 5100 bytes long and starts 384 bytes into the disk, which, when using 256-byte sectors, is halfway through the second sector. The SB6502 boot code is 65536 bytes long and starts at the beginning of sector number 9.

     Example 2:

       CAB!

     This is a valid boot sector, but contains no boot records. This is the safest way to mark a drive as non-bootable.
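The boot sector layout described above can be sketched in Python as follows. This is only an illustration of the text and binary record rules, not a reference bootloader. The function name `parse_boot_sector` and the tuple shape `(aid, start, start_is_sector, length)` are made up for this example, and the sketch does not validate that `record_length` equals 8 + AID length + 1.

```python
import re
import struct

BINARY_MAGIC = b"\x00\x1a\xca\xbd"  # null, CP/M EOF marker, two-byte magic
# ":<AID>=<offset>+<length>", where <offset> may carry an "s" (sector) prefix.
TEXT_RECORD = re.compile(rb":([A-Za-z0-9._/ -]+)=(s?)(\d+)\+(\d+)")


def parse_boot_sector(sector):
    """Return a list of (aid, start, start_is_sector, length) tuples,
    or None if the sector does not begin with "CAB"."""
    if not sector.startswith(b"CAB"):
        return None
    records = []
    pos = 3
    # Text records run until the terminating exclamation mark.
    while sector[pos:pos + 1] != b"!":
        m = TEXT_RECORD.match(sector, pos)
        if m is None:
            raise ValueError("malformed text boot record")
        aid, s, start, length = m.groups()
        records.append((aid.decode("ascii"), int(start), s == b"s", int(length)))
        pos = m.end()
    pos += 1  # skip the "!"
    if sector[pos:pos + 4] != BINARY_MAGIC:
        return records  # no binary records present
    pos += 4
    # Binary records end at a null byte where record_length would be.
    while pos < len(sector) and sector[pos] != 0:
        record_length, flags = sector[pos], sector[pos + 1]
        endian = "<" if flags & 0x80 else ">"  # 0x80 set: little-endian
        start, length = struct.unpack_from(endian + "HI", sector, pos + 2)
        # The AID fills the rest of the record, minus its null terminator.
        aid = sector[pos + 8:pos + record_length - 1].decode("ascii")
        records.append((aid, start, bool(flags & 0x40), length))
        pos += record_length
    return records
```

Fed the bytes of Example 1, this yields one tuple each for Lua 5.2, Lua 5.3, HyperTalk, and SB6502, with SB6502 starting at sector 9 and 65536 bytes long. Fed `b"CAB!"`, it returns an empty list.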
  6. OC-ARM's timing simulation isn't very detailed. Every instruction and every memory access incurs a cycle cost with no simulation of contention or other wait states. As much as I enjoy optimizing around low-level details like bus timings, a less precise scheme like that is probably enough for OpenComputers purposes.
  7. It's a pain, certainly. It doesn't require a general-purpose multiply, though. Multiplying by 10 just involves a couple of shifts and an add (10x = 8x + 2x). Division is much harder on some architectures than others; using sector offsets makes things much easier on those architectures. Adding a division routine could easily make the bootloader no longer fit into ROM. (Correct me if I'm wrong, but I was under the impression that the sector-based read code takes a sector number, not a byte offset. Also, the sector size may change between different configurations or different devices, so it can't be hardcoded.) I definitely like the idea of separate text and binary boot records. Different architectures are likely to have radically different capabilities in terms of text processing vs. binary processing. Some bootloaders/architectures could indicate that they only respect one or the other; a JavaScript architecture might only respect text boot sectors, for example, while a 6502 architecture might only respect binary boot sectors. I'll write up a new proposal soon. (Also, I have now learned an important lesson about how the subscription system on this forum works. I have a temporary glut of free time, so we'll see what happens.)
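As a rough illustration of the point about avoiding multiply and divide, here is a small Python sketch that parses a decimal field with shifts and adds only, and counts sectors by repeated subtraction. The function names are hypothetical, and the decomposition 10x = 8x + 2x is just one common choice; a real 8-bit bootloader would do this in assembly.

```python
def times_ten(x):
    # 10*x = 8*x + 2*x: two shifts and one add, no general multiply.
    return (x << 3) + (x << 1)


def parse_decimal(digits):
    """Parse ASCII decimal digits (a bytes object) using shifts and adds."""
    value = 0
    for ch in digits:
        if not 0x30 <= ch <= 0x39:
            raise ValueError("not an ASCII digit")
        value = times_ten(value) + (ch - 0x30)
    return value


def sectors_needed(length, sector_size):
    """Count the sectors covering `length` bytes without dividing."""
    count = 0
    while length > 0:
        length -= sector_size
        count += 1
    return count
```

Because the sector size is only ever subtracted, it can be read from the device at boot time rather than hardcoded.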
  8. The Lua architectures boot a built-in "machine.lua", which boots "/init.lua". OC-ARM's (non-built-in) boot0 used to boot "/boot/arm.elf", but this was changed to "/OC-ARM" for three reasons:

       • Compatible with OpenOS (which would try, and fail, to run "/boot/arm.elf" as Lua code on startup)
       • No conflict with OpenArms (or other ARM architectures)
       • Similar to standard practice for OS kernels (/mach_kernel, /vmlinuz, /kernel...)

     Naturally, more advanced bootloaders will include ways of specifying alternate boot targets, but it makes sense to have a standard process for determining the default on a given architecture. I'd propose that the "standard default" be "/<arch-name>" where <arch-name> is a unique, short name for the architecture in question. (Generic names like "ARM" or "MIPS" should be avoided in favor of specific names like "OC-ARM" and "OCMIPS" and "OpenArms".) Whether this is an ELF file or whatever else would be architecture-specific.

     That's all well and good for filesystem-based booting, but what about drive-based booting? For that, I propose that the first sector of a bootable drive begin with a block in a format like the following:

       BOOT:Lua 5.2,Lua 5.3:s1+1199;OC-ARM:s6+2472;MikeKaraoke:b64+102.

     The meaning of this particular block would be "Lua 5.2 and Lua 5.3 compatible code starts at sector 1, and is 1199 bytes long. OC-ARM compatible code starts at sector 6, and is 2472 bytes long. MikeKaraoke compatible code starts at byte 64 of the boot sector (which is right after this block), and is 102 bytes long." (Most drives would only be bootable for one architecture.) This syntax is potentially extensible, depending on the details of the specification.

     Bootable code would be required to either start on a sector boundary or be contained entirely in the first sector. Under these circumstances, the bootloader can account for the raw sector size fairly simply, even where division is prohibitively expensive. The text parsing required is also fairly simple, even on an 8-bit architecture like the 6502.

     This idea is compatible with any partition table scheme of the "data at end of sector" type, like the MBR scheme.

     (Here I have assumed that the first sector is numbered 0. If the sectors are 1-based, the spec will have to be changed to reflect that.)

     I'd be happy to write all this up as a (pair of?) pseudo-RFC but I'm open to feedback first. I've heard that this topic has been discussed a lot on IRC.
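To show how little parsing the proposed BOOT block needs, here is a Python sketch. The grammar is guessed from the single example in the post, so treat it as an assumption rather than the proposal's definition. In particular, the example is exactly 64 bytes long when its final period is counted, which matches "byte 64 ... right after this block", so this sketch treats that period as the block terminator. The function name and the tuple layout are hypothetical.

```python
import re

# One entry: "<name>[,<name>...]:<s|b><start>+<length>" followed by ";" or ".".
ENTRY = re.compile(r"([^:;]+):([sb])(\d+)\+(\d+)([;.])")


def parse_boot_block(text):
    """Return (entries, end_offset) for a block that starts with "BOOT:".

    Each entry is (architecture_names, unit, start, length), where unit is
    "s" for a sector number or "b" for a byte offset in the boot sector.
    """
    if not text.startswith("BOOT:"):
        raise ValueError("not a BOOT block")
    entries = []
    pos = len("BOOT:")
    while True:
        m = ENTRY.match(text, pos)
        if m is None:
            raise ValueError(f"malformed entry at offset {pos}")
        names, unit, start, length, terminator = m.groups()
        entries.append((names.split(","), unit, int(start), int(length)))
        pos = m.end()
        if terminator == ".":  # assumed end-of-block marker
            return entries, pos
```

For the example block, the returned end offset is 64, which is where the MikeKaraoke code is said to begin.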
  9. Fairly early in the OC-ARM development process, the question of component IO came up. I spent a good bit of time doing research and checking my assumptions. Then, I set out to create an interchange format with the following properties:

       • Trivially maps to every object type OpenComputers itself allows
       • (implied by the above) Trivially maps to the Lua representation
       • Feasible to implement in hardware
       • Simple to implement in software
       • Independent of architectural details
       • Compact

     This is what came out.

     There are two versions of this format: packed and unpacked. Packed is more compact, and is suitable for wire transmission or 8-/16-bit architectures. Unpacked is simpler to deal with on 32-bit architectures. (The packed representation, not coincidentally, takes up the same amount of space as is calculated for network packet size limiting purposes.)

     Applications communicating over the network:

       • SHOULD use the "pickled" representation from the Lua `serialization` API, when possible (for compatibility with Lua)
       • SHOULD use the packed representation with network byte order, when the above is not possible

     Architectures implementing a component IO medium:

       • MUST support the use of the "packed" representation
       • MUST support native byte order
       • SHOULD have optional support for network byte order, if that is different from native
       • MAY have optional support for "unpacked" IO, if this makes sense on the platform

     A producer is a program that is producing Interchange Values. A consumer is a program that is consuming Interchange Values.

     An Interchange Value is a type tag followed directly by the data. In the packed representation, tags are 16-bit. In the unpacked representation, they are 32-bit and sign-extended (such that 0xFFFE becomes 0xFFFFFFFE), but still limited to the same range.

     Tag 0x0000 - 0x3FFF: ICTAG_STRING

     A UTF-8 string, whose byte length is the low 14 bits of the tag. In the unpacked representation, the string is padded to a multiple of 4 bytes; padding is not reflected in the length. No NUL terminator is required, and NUL may freely exist within the string, but producers SHOULD avoid producing strings containing embedded NULs. Consumers MAY (but SHOULD NOT) process embedded NUL as a NUL terminator, and discard the portion of the string after it.

     If there are invalid UTF-8 sequences in the string, they MAY be lost during subsequent processing. Consumers that convert to another representation (such as UTF-16 for Java storage, or Unicode code sequences for display) SHOULD discard invalid UTF-8 sequences.

     Producers about to output an ICTAG_STRING that contains an exact, valid UUID value with lowercase digits SHOULD instead produce an ICTAG_UUID.

     Tag 0x4000 - 0x7FFF: ICTAG_BYTE_ARRAY

     An array of byte values with no particular semantics, whose length is the low 14 bits of the tag. In the unpacked representation, the array is padded to a multiple of 4 bytes; padding is not reflected in the length.

     Consumers that expect a string MAY treat an ICTAG_BYTE_ARRAY as an ICTAG_STRING. Consumers that expect an ICTAG_BYTE_ARRAY (e.g. disk IO) may only treat an ICTAG_STRING as an ICTAG_BYTE_ARRAY if it is still available in its original serialized form, not having made a round-trip conversion.

     Tag 0xFFF8 / -8: ICTAG_UUID

     A 128-bit UUID. Consumers that expect a string MUST convert an ICTAG_UUID into its canonical ICTAG_STRING equivalent, using lowercase digits.

     Note: Consumers that expect a UUID are NOT required to accept a well-formed ICTAG_STRING in its place.

     Tag 0xFFF9 / -7: ICTAG_COMPOUND

     A list of key-value pairs stored key first, value second, terminated by ICTAG_END. Any type but ICTAG_BYTE_ARRAY, ICTAG_COMPOUND, ICTAG_ARRAY, and ICTAG_NULL may appear as a key. Any type may appear as a value. If a consumer encounters an ICTAG_END where a value should go, the entire Interchange Buffer MUST be discarded as invalid.

     Tag 0xFFFA / -6: ICTAG_ARRAY

     A list of Interchange Values, terminated by ICTAG_END. Any type may appear as an element of an array.

     Tag 0xFFFB / -5: ICTAG_INT

     A signed, two's-complement 32-bit integer.

     Tag 0xFFFC / -4: ICTAG_DOUBLE

     An IEEE 754 64-bit double. Producers SHOULD convert doubles with exact signed two's-complement 32-bit integer values to ICTAG_INT. Consumers that expect an ICTAG_DOUBLE MUST be able to process an ICTAG_INT in its place, and MUST NOT treat it differently from an ICTAG_DOUBLE with the same numerical value.

     Tag 0xFFFD / -3: ICTAG_BOOLEAN

     Either true or false. In the packed representation, it is a single byte. In the unpacked representation, it is 32 bits. Zero is false, any non-zero value is true. Producers SHOULD produce all-bits-set for true. Consumers MUST treat any non-zero value as equivalent to all-bits-set.

     Tag 0xFFFE / -2: ICTAG_NULL

     A strongly-typed NULL, equivalent to Java's null and Lua's nil.

     Tag 0xFFFF / -1: ICTAG_END

     Signifies the end of an Interchange Buffer, array, or compound.

     All other values are reserved. An Interchange Buffer that contains any other tag value MUST be discarded as invalid.

     An Interchange Buffer simply consists of zero or more Interchange Values, terminated by ICTAG_END.

     Issues

     Should there be more integer/float types? e.g. ICTAG_SHORT, ICTAG_BYTE, ICTAG_LONG, ICTAG_FLOAT...

     No. This would make writing simple processing code difficult. ICTAG_DOUBLE fully covers Lua's number range, and covers a significant portion of Java's number range as well. ICTAG_INT covers a sufficiently wide range to remove the need for floating-point math in the vast majority of IO operations.

     8-/16-bit architectures cannot process 32-bit numbers easily.

     However, consumers on those architectures can, situationally, consume only the precision they expect, and gracefully fail when given too-large integers. (The situation would be far worse if they were expected to process an ICTAG_DOUBLE.)

     Should converting integer-valued ICTAG_DOUBLE to ICTAG_INT be mandatory?

     No. In situations where only integers are expected, but the original representation supports only "reals", transparent conversion is valuable. However, in situations where non-integer values are common, the conversion does not add value. It should be up to the generator to determine whether it makes sense to do the conversion.

     Should converting from UUID-valued ICTAG_STRING to ICTAG_UUID be mandatory?

     No. If, as is the usual case, the generator knows that subsequent use of the string as a UUID is unlikely, it can avoid the overhead.

     Should accepting ICTAG_UUID in place of ICTAG_STRING be mandatory? and Should accepting ICTAG_INT in place of ICTAG_DOUBLE be mandatory?

     Yes. In situations where the semantics of the values are not known, it is desirable to have "automatic simplification". "Automatic simplification" can only work if it can be expected to be reversed just as automatically when the conversion was not helpful.

     Should there be ICTAG_TRUE and ICTAG_FALSE instead of a single ICTAG_BOOLEAN?

     Unresolved.

     How is an ICTAG_UUID structured? This is important with byte orders other than network byte order.

     Unresolved. Possibilities:

       • A sequence of bytes, in display order. No swapping is performed or necessary.
       • (as in Java) A pair of 64-bit integers.
       • (as in MFC) A 32-bit integer followed by three 16-bit integers followed by 6 loose bytes.

     Why have both ICTAG_BYTE_ARRAY and ICTAG_STRING? Why not just have one or the other?

     Java Strings are encoded as UTF-16. Interconversion between UTF-8 and UTF-16 is pretty simple, but still is overhead that isn't necessary for tasks like disk IO. In addition, there is no obvious way to preserve invalid UTF-8 sequences such that code that wishes to deal with byte arrays (like disk IO) can get the original byte sequence after a round-trip conversion through UTF-16. Creating such a system as part of the specification may be desired, but would then allow "invisible overhead" to creep in.

     Common Optimizations

     (This information is only useful to architecture implementors.)

     OC-ARM uses this format for component IO. It provides two facilities that usefully reduce the overhead of this system.

       • Byte array IO return: When a program expects an IO operation to return only a single byte array, it can determine the size of this byte array by reading the size needed to store the Interchange Buffer and subtracting overhead. It can then use a special method to (attempt to) read the bytes directly into memory, rather than into an Interchange Buffer.
       • Truncation: In situations (such as certain types of signal processing) where only a few values are important, the program can specify that it is interested only in the first N values before storing the Interchange Buffer. It can then proceed normally; extra values are simply discarded. This is useful in extremely memory-restricted situations.
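For the producer side of the packed representation, a Python sketch might look like the following. It uses network byte order, as the post recommends for applications talking over the network. The function names are hypothetical, ICTAG_UUID is left out because its byte layout is listed as unresolved, and none of this is taken from OC-ARM's actual implementation.

```python
import struct

ICTAG_END, ICTAG_NULL, ICTAG_BOOLEAN = -1, -2, -3
ICTAG_DOUBLE, ICTAG_INT, ICTAG_ARRAY, ICTAG_COMPOUND = -4, -5, -6, -7


def tag(t):
    return struct.pack(">h", t)  # 16-bit tag, network byte order


def encode_value(v):
    """Encode one Python value as a packed Interchange Value."""
    if v is None:
        return tag(ICTAG_NULL)
    if isinstance(v, bool):  # checked before int: bool is an int subclass
        return tag(ICTAG_BOOLEAN) + (b"\xff" if v else b"\x00")
    if isinstance(v, float) and v.is_integer() and -2**31 <= v < 2**31:
        v = int(v)  # SHOULD: doubles with exact 32-bit values become ICTAG_INT
    if isinstance(v, int):
        if -2**31 <= v < 2**31:
            return tag(ICTAG_INT) + struct.pack(">i", v)
        v = float(v)  # too wide for ICTAG_INT, fall back to a double
    if isinstance(v, float):
        return tag(ICTAG_DOUBLE) + struct.pack(">d", v)
    if isinstance(v, str):
        data = v.encode("utf-8")
        if len(data) > 0x3FFF:
            raise ValueError("string too long for a 14-bit length")
        return tag(len(data)) + data
    if isinstance(v, (bytes, bytearray)):
        if len(v) > 0x3FFF:
            raise ValueError("byte array too long for a 14-bit length")
        return tag(0x4000 + len(v)) + bytes(v)
    if isinstance(v, (list, tuple)):
        return tag(ICTAG_ARRAY) + b"".join(map(encode_value, v)) + tag(ICTAG_END)
    if isinstance(v, dict):
        body = b"".join(encode_value(k) + encode_value(x) for k, x in v.items())
        return tag(ICTAG_COMPOUND) + body + tag(ICTAG_END)
    raise TypeError(f"cannot encode {type(v).__name__}")


def encode_buffer(values):
    """An Interchange Buffer: zero or more values, then ICTAG_END."""
    return b"".join(encode_value(v) for v in values) + tag(ICTAG_END)
```

This also shows where the "subtracting overhead" figure in the byte array IO return optimization could come from: a packed buffer holding a single byte array is the array's tag (2 bytes), the payload, and ICTAG_END (2 bytes), so the payload size is the buffer size minus 4. That arithmetic is my reading of the format, not something stated in the post.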