Ta©ti Tac0Z

binary read and write, how does it work exactly?

Question

I was working on my program when I found out that 'rb' and 'wb' don't work the way they do in ComputerCraft. I hadn't thought of that.

In ComputerCraft, when a file is opened with 'wb', handle.write expects a number instead of a string, which lets you write a file byte by byte. How does that work in OpenComputers?

File handling works quite differently between the two mods.

The simple question is: how do binary read and write work in OpenComputers?

(Given that CC uses ASCII, or whatever it's called exactly, while OC uses Unicode, right? Doesn't that make it different in just about every way?)


4 answers to this question


A byte is an octet of bits, which can store 2⁸ = 256 possible values. To represent a character as a number, we need an encoding system. ASCII is one of them, and it assigns the first 128 values to control characters, digits, English letters and punctuation. But people also use other languages, which need additional characters. Latin-1, for example, uses the remaining 128 values to define a few more characters from Latin-based scripts. Even with that, the character coverage was low: Chinese has several thousand different characters, for instance. Therefore, there was a need for an encoding system that can represent, ideally, characters from all scripts in use.

Unicode is such a standard; at the time of writing, its latest revision maps more than 130 thousand characters to code points. The standard also defines several encodings, e.g. UTF-8, UTF-16, and UTF-32. UTF-32 always uses 4 bytes per character, and UTF-16 uses 2 or 4 bytes, so both waste space on text that is mostly ASCII. UTF-8 uses 1 to 4 bytes per character, with a single byte for each ASCII character, so it usually takes less space; it has become the dominant encoding, at least on the web. It's also the encoding used by OpenComputers.
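To make the "1 to 4 bytes" part concrete, here is a minimal sketch of a UTF-8 encoder in plain Lua. The function name utf8_encode is made up for this post (it is not an OpenComputers or standard Lua function), and it skips validation such as rejecting surrogates or values above 0x10FFFF.

```lua
-- Hypothetical helper: encode one Unicode code point as a UTF-8 byte string.
-- Uses only arithmetic, so it does not depend on a bit library.
local function utf8_encode(cp)
  if cp < 0x80 then
    -- 1 byte: plain ASCII
    return string.char(cp)
  elseif cp < 0x800 then
    -- 2 bytes: 110xxxxx 10xxxxxx
    return string.char(0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    -- 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return string.char(
      0xE0 + math.floor(cp / 0x1000),
      0x80 + math.floor(cp / 0x40) % 0x40,
      0x80 + cp % 0x40)
  else
    -- 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return string.char(
      0xF0 + math.floor(cp / 0x40000),
      0x80 + math.floor(cp / 0x1000) % 0x40,
      0x80 + math.floor(cp / 0x40) % 0x40,
      0x80 + cp % 0x40)
  end
end

print(#utf8_encode(0x5C))    -- backslash, U+005C
print(#utf8_encode(0xAF))    -- macron, U+00AF
print(#utf8_encode(0x30C4))  -- katakana tsu, U+30C4
```

The three print calls show the encoded length in bytes for an ASCII character, a Latin-1 character and a Japanese character.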

This is how UTF-8 encodes the string "¯\_(ツ)_/¯":

Byte:  c2 af 5c 5f 28 e3 83 84 29 5f 2f c2 af
Chars: \___/ \/ \/ \/ \______/ \/ \/ \/ \___/
         ¯    \  _  (    ツ     )  _  /   ¯

As you can see, some characters are encoded with a single byte (such as _, (, ), /, \); some need 2 bytes (¯); the tsu character in the middle (ツ) needs 3 bytes.
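If you want to check the byte layout yourself, something like the following should work in any standard Lua. hex_bytes and char_count are names I made up for this sketch; the string is built with string.char so it doesn't depend on how the source file is saved.

```lua
-- The shrug string, built from raw bytes so the file encoding doesn't matter.
local shrug = string.char(0xC2, 0xAF) .. "\\_("
  .. string.char(0xE3, 0x83, 0x84) .. ")_/"
  .. string.char(0xC2, 0xAF)

-- Hypothetical helper: format every byte of s as two hex digits.
local function hex_bytes(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = string.format("%02x", s:byte(i))
  end
  return table.concat(out, " ")
end

-- Hypothetical helper: count UTF-8 characters by skipping
-- continuation bytes (those in the range 0x80..0xBF).
local function char_count(s)
  local n = 0
  for i = 1, #s do
    local b = s:byte(i)
    if b < 0x80 or b >= 0xC0 then
      n = n + 1
    end
  end
  return n
end

print(hex_bytes(shrug))
print(#shrug, char_count(shrug))
```

Note that the # operator counts bytes, not characters, which is why the two numbers on the last line differ.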

When you open a file, you have to decide what unit the stream works with. There are two possible answers.

  • Non-binary. The stream works with UTF-8-encoded characters. Assuming the stream is at the beginning of the string "¯\_(ツ)_/¯", file:read(2) will return "¯\", which is 2 characters but 3 bytes (in Lua, #"¯\\" == 3). This is the default mode.
  • Binary. The stream works with single bytes. With the same assumption, file:read(2) will return "¯", a single character made of 2 bytes. To use this mode, you have to request it explicitly by adding the character "b" to the mode string (io.open(path, "rb"), for instance).
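The difference between the two modes can be imitated on a plain string. To be clear, this does not call the OpenComputers file API; read_chars and read_bytes are hypothetical helpers that only mimic what read(2) is described as returning in each mode, and read_chars assumes well-formed UTF-8.

```lua
local shrug = string.char(0xC2, 0xAF) .. "\\_("
  .. string.char(0xE3, 0x83, 0x84) .. ")_/"
  .. string.char(0xC2, 0xAF)

-- Imitates a non-binary read: take the first n UTF-8 characters.
local function read_chars(s, n)
  local pos, count = 1, 0
  while pos <= #s and count < n do
    local b = s:byte(pos)
    local len
    if b >= 0xF0 then
      len = 4
    elseif b >= 0xE0 then
      len = 3
    elseif b >= 0xC0 then
      len = 2
    else
      len = 1  -- ASCII (or a stray continuation byte)
    end
    pos = pos + len
    count = count + 1
  end
  return s:sub(1, pos - 1)
end

-- Imitates a binary read: take the first n bytes.
local function read_bytes(s, n)
  return s:sub(1, n)
end

print(#read_chars(shrug, 2))  -- two characters: the macron and the backslash
print(#read_bytes(shrug, 2))  -- two bytes: just the macron
```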

In binary mode, CC is unable to store non-UTF-8-encoded data in its strings (in other words, it does support strings encoded with UTF-8, but not raw bytes), so it has to return a number instead. This is not how PUC-Rio's implementation of Lua, or OC's natives (which are based on the former), work: they return a string containing a single byte instead. You can get the byte's numeric value with string.byte, and use string.char to go the other way.

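-- "\xc2" is a one-byte string holding the raw byte 0xc2 (hex escapes need Lua 5.2+)
local str = "\xc2"
-- string.byte turns that byte into its numeric value
local byte = string.byte(str)

print(byte == 0xc2)  --> true
-- string.char turns the number back into a one-byte string
print(string.char(byte) == str)  --> true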
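Putting it together, writing and reading a file as numbers (the way CC's 'wb' and 'rb' handles behave) could look roughly like this. It's a sketch using only the standard io library with "wb" and "rb"; write_bytes and read_bytes are names I made up, and the os.tmpname() path is only there to keep the example self-contained, so substitute your own path.

```lua
-- Hypothetical helper: write a list of numbers (0..255) as raw bytes.
local function write_bytes(path, bytes)
  local f = assert(io.open(path, "wb"))
  for _, b in ipairs(bytes) do
    f:write(string.char(b))  -- number -> one-byte string
  end
  f:close()
end

-- Hypothetical helper: read a whole file back as a list of numbers.
local function read_bytes(path)
  local f = assert(io.open(path, "rb"))
  local bytes = {}
  while true do
    local chunk = f:read(1)  -- one byte as a string, or nil at end of file
    if not chunk then
      break
    end
    bytes[#bytes + 1] = chunk:byte()  -- one-byte string -> number
  end
  f:close()
  return bytes
end

local path = os.tmpname()  -- placeholder path for the example
write_bytes(path, {0xC2, 0xAF, 0x00, 0xFF})
local bytes = read_bytes(path)
os.remove(path)

print(#bytes, bytes[1], bytes[4])
```

Reading one byte at a time is slow for big files; reading the whole file with f:read("*a") and then calling string.byte on the positions you need is usually a better fit.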

 
