Ruby Programming/ASCII

To us, a string such as "Hello world" looks like a series of letters with a space in the middle. To your computer, however, every String – in fact, everything – is a series of numbers.

ASCII
In our example, each character of the String "Hello world" is represented by a number between 0 and 127. For example, to the computer, the capital letter "H" is encoded as the number 72, whereas the space is encoded as the number 32. The ASCII standard, originally developed from telegraph codes, specifies what number is used to represent each character.

On most Unix-like operating systems, you can view the entire chart of ASCII codes by typing "man ascii" at the shell prompt. Wikipedia's page on ASCII also lists the ASCII codes. Using an ASCII chart, we discover that our string "Hello world" gets converted into the following series of ASCII codes.

Character:   H   e    l    l    o    space  w    o    r    l    d
ASCII code:  72  101  108  108  111  32     119  111  114  108  100
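The same lookup can be done by Ruby itself rather than by reading a chart. The following is a minimal sketch that assumes Ruby 1.9 or later; the method names ascii_codes and from_ascii_codes are made up for this illustration and are not part of Ruby.

```ruby
# Hypothetical helper: returns the numeric codes for a string.
# String#bytes yields one integer per byte, and for plain ASCII
# text each byte is exactly one character code.
def ascii_codes(text)
  text.bytes
end

# Hypothetical helper: turns a list of codes back into a string.
# Integer#chr converts a single code into a one-character string.
def from_ascii_codes(codes)
  codes.map(&:chr).join
end

codes = ascii_codes("Hello world")
puts codes.join(" ")          # 72 101 108 108 111 32 119 111 114 108 100
puts from_ascii_codes(codes)  # Hello world
```

Going in both directions shows that, for ASCII text, the string and its list of numbers carry the same information.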

You can also determine the ASCII code of a character by using the ? operator in Ruby 1.8.

    puts ?H
    puts ?e
    puts ?l
    puts ?l
    puts ?o

In Ruby 1.9 and later, the question-mark syntax returns a one-character string rather than a number, so it no longer gives you the ASCII code. Instead, use the ord method.

    puts "H".ord
    puts "e".ord
    puts "l".ord
    puts "l".ord
    puts "o".ord

Notice that the output (below) of this program matches the ASCII codes for the "Hello" part of "Hello world".

    $ hello-ascii.rb
    72
    101
    108
    108
    111

To get the ASCII value for a space, we need to use its escape sequence. In fact, we can use any escape sequence with the ? operator.

    puts ?\s
    puts ?\t
    puts ?\b
    puts ?\a

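As above, in Ruby 1.9 and later call ord on a double-quoted string containing the escape sequence instead:

    puts "\s".ord
    puts "\t".ord
    puts "\b".ord
    puts "\a".ord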

The result:

    32
    9
    8
    7

Terminal emulators
You may not realize it, but so far you've been running your Ruby programs inside a program called a terminal emulator – such as the Microsoft Windows console, the Mac OS X Terminal application, a telnet client, rxvt, or X Window System programs such as xterm.

When your Ruby program prints out the letter "H", it sends the ASCII code for "H" (72) to the terminal emulator, which then draws an "H". When your Ruby program prints out a bell character, it sends a different ASCII code – ASCII code 7 – to the terminal emulator. In this case, the terminal emulator does not draw a symbol, but instead will typically beep or flash briefly. How each of the codes gets interpreted is largely determined by the ASCII standard.
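The difference between codes that are drawn and codes that trigger an action can be sketched in a few lines. This is only an illustration, not how any particular terminal emulator is implemented. It assumes the usual ASCII split, in which codes 32 through 126 are printable (the space included) and the remaining codes are control codes. The method name printable_ascii? is made up for this example.

```ruby
# Hypothetical helper: true when an ASCII code stands for a character
# that gets drawn (codes 32 through 126), false for control codes
# such as the bell (7) or the tab (9).
def printable_ascii?(code)
  (32..126).cover?(code)
end

# 72 is "H"; 7 is the bell character, written "\a" in Ruby.
[72, 7].each do |code|
  kind = printable_ascii?(code) ? "printable" : "control"
  puts "#{code} is a #{kind} character"
end
```

Running `print 7.chr` (or `print "\a"`) sends the bell code itself to the terminal, which may beep or flash depending on how it is configured.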

Other character encodings
The ASCII standard is a type of character encoding. As mentioned above, ASCII only uses the numbers 0 through 127 to define characters. There are far more characters than that in the world. Other character encoding systems – such as Latin-1, Shift_JIS, and the Unicode Transformation Format (UTF) – have been created to represent a wider variety of characters, including those found in languages such as Arabic, Hebrew, Chinese, and Japanese.
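To see how encodings differ, it helps to look at the bytes one character produces under each of them. The sketch below assumes Ruby 1.9 or later, where String#encode is available, and uses "é" (written as the escape "\u00e9") as an example of a character outside ASCII. The method name bytes_in is made up for this illustration.

```ruby
# Hypothetical helper: the bytes a string occupies in a given encoding.
def bytes_in(text, encoding)
  text.encode(encoding).bytes
end

e_acute = "\u00e9"  # the letter é, which has no ASCII code

puts e_acute.ascii_only?                       # false
puts bytes_in(e_acute, "UTF-8").inspect        # [195, 169]
puts bytes_in(e_acute, "ISO-8859-1").inspect   # [233]

# A plain ASCII letter is stored as the same single byte in both.
puts bytes_in("H", "UTF-8").inspect            # [72]
puts bytes_in("H", "ISO-8859-1").inspect       # [72]
```

ISO-8859-1 is the formal name for Latin-1. It stores "é" in a single byte above 127, while UTF-8 uses two bytes for it; both leave the ASCII range unchanged.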