Sunday, May 6, 2007

Unicode

Unicode allows us to represent character sets from many languages so that programming and web use can be international. We can enter Unicode characters in Microsoft Windows. Alan Wood includes many Unicode resources.

We can use the Character Map utility in Windows to add characters. For example, in the Courier New font, Unicode 03B1, Greek alpha, is α, and Unicode 0416, Cyrillic Zhe, is Ж. Using the SimSun font, the Unicode character with code 8EBF is 躿. The Unicode value needs to be stored in memory. An encoding specifies how a value is stored. The Notepad editor provides four encodings: ANSI, Unicode, Unicode big endian, and UTF-8. ANSI (American National Standards Institute) developed a standard for English characters that uses 7 or 8 bits, which can represent 128 or 256 characters using one byte.

To include more characters, the encoding that Notepad calls Unicode (UTF-16) uses 16 bits, or two bytes, for each of the values discussed here. But with two bytes, which byte comes first? The letter A in Unicode hexadecimal is 0041. The two bytes are 00 (00000000 as bits) and 41 (01000001 as bits). Do they go in memory as 00 41 (big endian, most significant byte first) or 41 00 (little endian, least significant byte first)? Unicode files include two starting bytes, the byte order mark, to tell the difference. The normal big endian Unicode file starts with FE FF. Little endian systems reverse this to FF FE.
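As a quick check of the two byte orders, here is a minimal Python sketch. Python is my choice for illustration, not something the post relies on; the codec names `utf-16-be` and `utf-16-le` are Python's names for the big endian and little endian forms.

```python
import codecs

letter = "A"  # Unicode 0041

big = letter.encode("utf-16-be")     # most significant byte first
little = letter.encode("utf-16-le")  # least significant byte first

print("big endian:   ", big.hex(" "))
print("little endian:", little.hex(" "))

# The two starting bytes (byte order mark) for each order
print("big endian mark:   ", codecs.BOM_UTF16_BE.hex(" "))
print("little endian mark:", codecs.BOM_UTF16_LE.hex(" "))
```

The big endian form stores 00 then 41, and the little endian form stores 41 then 00.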

Notice that English characters like A need only one byte, so why should we double the space to use Unicode for English? If all we use is English we could stick to ANSI. But suppose we sometimes need to include a foreign phrase. Even closely related languages such as Spanish and French have accented characters. Greek is used for mathematical symbols too. UTF-8 is a scheme using a variable number of bytes so that the 7-bit English (ASCII) characters still use one byte but other Unicode characters use two or three bytes. We only use extra space when we need it. Unicode 0000 to 007F uses one byte. Unicode 0080 to 07FF uses two bytes, with 110 starting the first byte and 10 starting the second. Unicode 0800 to FFFF uses three bytes, with 1110 starting the first byte and 10 starting the second and third bytes.
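The three ranges above can be sketched as a small hand-written encoder. This is only an illustration: `utf8_bytes` is a hypothetical helper name, it covers only values up to FFFF as described here, and Python's built-in encoder is used alongside it for comparison.

```python
def utf8_bytes(code_point):
    """Encode one Unicode value in the range 0000-FFFF by hand."""
    if code_point <= 0x7F:
        # One byte: 0xxxxxxx
        return [code_point]
    if code_point <= 0x7FF:
        # Two bytes: 110xxxxx 10xxxxxx
        return [0b11000000 | (code_point >> 6),
                0b10000000 | (code_point & 0b111111)]
    if code_point <= 0xFFFF:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return [0b11100000 | (code_point >> 12),
                0b10000000 | ((code_point >> 6) & 0b111111),
                0b10000000 | (code_point & 0b111111)]
    raise ValueError("values above FFFF are not covered by this sketch")

# Compare the hand-written result with Python's own UTF-8 encoder
for cp in (0x0041, 0x03B1, 0x8EBF):
    print(f"{cp:04X}", utf8_bytes(cp), list(chr(cp).encode("utf-8")))
```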

Looking at the Notepad encodings for the Chinese 8EBF character shown above we get
ANSI 63
Unicode 255 254 191 142 (Windows is little endian)
Unicode big endian 254 255 142 191
UTF-8 239 187 191 232 186 191

8EBF in binary is 1000 1110 1011 1111. As two bytes it is 10001110 10111111.
In decimal 10001110=128+8+4+2=142
10111111=255-64 =191
so 8EBF is 142 191 in decimal.
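ANSI cannot represent this character, so on a Western system Notepad substitutes a question mark, whose code is 63.
(The last 7 bits, 011 1111 = 32+16+8+4+2+1 = 63, give the same number, but only by coincidence.)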
Unicode on Windows uses little endian so it puts FF FE first, which is 255 254, and
follows with the least significant byte first, 191, then 142.
Unicode big endian puts FE FF first, which is 254 255 in decimal, then the most significant byte 142 first, followed by 191.

For UTF-8 we break up 8EBF (1000111010111111) as 1000 111010 111111 and create three bytes
1110 1000 = 128 + 64 + 32 + 8 = 232
10 111010 = 128 + 32 + 16 + 8 + 2 = 186
10 111111 = 255 - 64 = 191
The starting FE FF mark also needs to be coded as UTF-8. FEFF in binary is 1111 1110 1111 1111. We divide it as 1111 111011 111111 and the UTF-8 is
1110 1111 = 255 - 16 = 239
10 111011 = 255 - 64 -4 = 187
10 111111 = 255 - 64 = 191
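The four rows of decimal bytes can be reproduced with a short Python sketch. Two assumptions are mine, not the post's: `cp1252` stands in for Notepad's ANSI choice (the actual code page depends on the Windows locale), and `errors="replace"` mimics the substitution of a question mark for a character the code page lacks.

```python
import codecs

ch = "\u8ebf"  # the SimSun character with code 8EBF

encodings = {
    # Assumed Western code page; unrepresentable characters become "?"
    "ANSI": ch.encode("cp1252", errors="replace"),
    "Unicode": codecs.BOM_UTF16_LE + ch.encode("utf-16-le"),
    "Unicode big endian": codecs.BOM_UTF16_BE + ch.encode("utf-16-be"),
    # "utf-8-sig" writes the UTF-8 form of FEFF before the text
    "UTF-8": ch.encode("utf-8-sig"),
}

for name, data in encodings.items():
    print(name, *data)  # each byte printed in decimal
```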


Friday, April 27, 2007

Game Computing

Game consoles use high-end processors and graphics cards to provide the best game experience. The Xbox 360 now allows hobbyists to program and share games with XNA Game Studio Express.


Car Computers

Cars have certainly gotten more complex. Computers help control emissions and avoid a proliferation of wires to each component. See How Car Computers Work.


Thursday, April 19, 2007

MIDI Files

MIDI does not sample the audio signal. It transmits codes that tell what note to play for how long and at what volume. The MIDI is the language of gods site has MIDI information and programs to manipulate MIDI files which are binary and cannot be read in a text editor. The MIDI File Disassembler/Assembler converts from MIDI to text or text to MIDI.

Bach's two-part invention 7 can be played and converted to text. We can edit the text file to make changes in the music and convert back to MIDI to play the changed piece. For example, adding "| transpose = 12" to the Track 1 heading changes each note by transposing it 12 semitones higher. There are 12 semitones in an octave, so this command will raise the sound one octave higher. View the text and play the MIDI file.
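What transposing by 12 does to the underlying note numbers can be sketched in Python. This is not the disassembler's own code. The helper names are hypothetical, and the numbering assumes the common convention that middle C is C4 = 60; some programs label octaves differently.

```python
# Semitone offset of each natural note within an octave
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_number(name):
    """Convert a name such as 'F#4' to a MIDI note number (assumes C4 = 60)."""
    offset = NOTE_OFFSETS[name[0].upper()]
    rest = name[1:]
    if rest.startswith("#"):
        offset, rest = offset + 1, rest[1:]
    elif rest.startswith("b"):
        offset, rest = offset - 1, rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + offset

def transpose(numbers, semitones):
    """Shift every note number by the same number of semitones."""
    return [n + semitones for n in numbers]

notes = [note_to_number(n) for n in ("C4", "E4", "G4")]
print(notes, "->", transpose(notes, 12))
```

Adding 12 to each number gives the same note names one octave up.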

The tempo is fast but changing BPM from 107 to 60 and omitting micros\quarter=555556 will slow it down. View the text and play the MIDI file. We can also change the pitch of notes. For fun I changed F#4 to A5 by creating the file pitchdata containing the line
F#4 = A5
and adding the command
| map="C:\Documents and Settings\art\Desktop\MIDI\pitchdata"
to the heading of track 1. View the text and play the MIDI file.
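The tempo change mentioned above rests on a simple relation: a MIDI file stores tempo as microseconds per quarter note, and there are 60,000,000 microseconds in a minute. A minimal sketch of the arithmetic, with hypothetical function names:

```python
MICROS_PER_MINUTE = 60_000_000

def bpm_to_micros(bpm):
    """Microseconds per quarter note for a given beats-per-minute value."""
    return round(MICROS_PER_MINUTE / bpm)

def micros_to_bpm(micros):
    """Beats per minute for a given microseconds-per-quarter value."""
    return MICROS_PER_MINUTE / micros

print(micros_to_bpm(555556))  # just under 108, so it truncates to 107
print(bpm_to_micros(60))      # one full second per quarter note
```

At 60 beats per minute each quarter note lasts 1,000,000 microseconds, nearly twice as long as the original 555556.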

For reference look at the sheet music. All MIDI files are binary, but using a program we can display the numerical bytes, with annotations added, of the original MIDI file.


Wednesday, March 21, 2007

The Game of Life

The brilliant mathematician John von Neumann has his name attached to the architecture of the stored program computer. He was involved in the design of the first digital computers. He tried to find a machine that could reproduce itself. John Conway in 1970 simplified von Neumann's ideas and developed the Game of Life. This version of Life may work better. See also Conway's Game of Life.
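For readers who want to experiment, here is a minimal Python sketch of one generation of Life. It uses Conway's standard rules, which the post does not spell out: a dead cell with exactly three live neighbors is born, and a live cell with two or three live neighbors survives. The function name `life_step` and the set-of-coordinates representation are my own choices.

```python
from collections import Counter

def life_step(live):
    """Advance one generation; `live` is a set of (x, y) live cells."""
    # Count how many live neighbors each nearby cell has
    counts = Counter((x + dx, y + dy)
                     for x, y in live
                     for dx in (-1, 0, 1)
                     for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # Birth on exactly 3 neighbors, survival on 2 or 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker": three cells in a row that flip between horizontal and vertical
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))
print(sorted(life_step(life_step(blinker))))
```

Storing only the live cells keeps the grid unbounded, so patterns can move without hitting an edge.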

Stephen Wolfram in A New Kind of Science shows how cellular automata like the Game of Life generate many complex processes. He and others believe that the universe may be a form of a cellular automaton. Coincidentally a new exhibit, "The Way of the Artist," at Cal State Fullerton relates to A New Kind of Science. The Orange County Register article Mysterious Principles of Glass Art tells about the doctor curator Barry Behrstock, who relates the idea that simple rules underlie complex patterns to the art of Richard Marquis. Behrstock ties it all together with the Sierpinski triangle that he wears as a pendant.

A Wolfram video explains how the universe might come about from a network of cells. Ray Kurzweil reflects on A New Kind of Science. Edward Fredkin, a founder of cellular automata concepts, provides A Digital Philosophy. We can explore one-dimensional cellular automata, including the rules numbered by Wolfram. Rule 110 is interesting because it is capable of universal computation. Such simple computational systems might be found in nature.
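A one-dimensional automaton with a Wolfram rule number fits in a few lines of Python. In this sketch the rule number's bits give the new state for each of the eight left-center-right patterns. The function names, the wrap-around edges, and the single starting cell are my assumptions for illustration.

```python
def rule_step(cells, rule=110):
    """Apply one step of an elementary cellular automaton (edges wrap around)."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # Read the three cells as a binary number from 0 to 7
        pattern = (left << 2) | (centre << 1) | right
        # That bit of the rule number is the cell's next state
        nxt.append((rule >> pattern) & 1)
    return nxt

def run(width=31, generations=15, rule=110):
    """Print successive generations starting from one live cell at the right."""
    cells = [0] * width
    cells[-1] = 1
    for _ in range(generations):
        print("".join("#" if c else "." for c in cells))
        cells = rule_step(cells, rule)

run()
```

Passing a different `rule` value from 0 to 255 to `run` shows the other numbered rules.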


Thursday, March 15, 2007

Paint.NET

Paint.NET is free imaging and photo manipulation software for Windows computers. This tutorial shows how to draw, use layers, and enhance a photo. Another tutorial covers layers, effects, and blend modes.


Tuesday, March 6, 2007

The New Science of Networks

How is the World Wide Web organized? How are networks of friends organized? The interesting book, Linked: The New Science of Networks, by Albert-Laszlo Barabasi, presents the new ideas that he pioneered. This review provides a good summary.

The earliest studies of networks assumed that they were organized randomly with each node having about the same chance as any other to make a connection to another node. In fact many networks are organized quite differently. They are scale free.

To see the contrast, compare a highway map of the US to an airline route map. Some cities have more roads and others fewer, but most have about the same number of roads to and from them. The difference between the largest cities and the smallest isn't thousands or millions. There is an average number of roads per city and most cities don't deviate by much from the average. The distribution is a bell curve.

An airline route map shows major hubs, the largest of which have hundreds or thousands of routes, whereas the smallest cities may have only a few flights. The distribution of the number of cities with a given number of links follows a power law, with many cities having few routes and few cities having many routes. The latter has no scale, no reference point like the average number of highways to a city. Power laws ... is a mathematical paper, but the first part has some nice diagrams comparing the graph of male height with that of city population. It also has some nice examples of power-law (scale-free) distributions such as word frequency (Moby Dick words and frequency table).

Social networks exhibit interesting patterns. On the Gallery of network images, check out high-school dating, Les Miserables, Websites, and Books on Politics. Mark Granovetter showed that weak ties, links to acquaintances rather than friends, are more helpful in getting jobs than are strong ties.

The small-world phenomenon is the hypothesis that everyone in the world can be reached through a short chain of social acquaintances, illustrated by the Six Degrees of Kevin Bacon game. From here visit the Oracle of Bacon, the Center of the Hollywood Universe, and the 1000 best centers sites.

The structure of real networks is governed by two principles: growth and preferential attachment. Growth, where nodes and links are continually added, contrasts with a static network where the nodes and links are mostly fixed in advance. Preferential attachment can be described as the rich get richer. Nodes, when deciding where to link, prefer the nodes that have more links. The scale-free structure that develops has some interesting properties. It is very robust, meaning that random failures will not much affect the overall functioning of the network. But it is also quite vulnerable to terrorists who target major hubs. See Scale-Free Networks and Terrorism.
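Growth and preferential attachment can be tried out in a small Python simulation. This is a rough sketch under my own assumptions, not the book's exact model: the network starts from three connected nodes, each new node makes two links, and the names `grow_network` and `degrees` are hypothetical.

```python
import random
from collections import Counter

def grow_network(n_nodes, links_per_node=2, seed=0):
    """Grow a network one node at a time using preferential attachment."""
    rng = random.Random(seed)
    # Start with a small fully connected core of three nodes
    edges = [(0, 1), (0, 2), (1, 2)]
    # Each node appears here once per link it has, so a uniform draw
    # from this list favors nodes that already have more links.
    endpoints = [node for edge in edges for node in edge]
    for new in range(3, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

def degrees(edges):
    """Number of links attached to each node."""
    return Counter(node for edge in edges for node in edge)

deg = degrees(grow_network(1000))
print("largest hubs (node, links):", deg.most_common(5))
print("nodes with exactly 2 links:", sum(1 for d in deg.values() if d == 2))
```

Most nodes end up with only a few links while a handful of early nodes become hubs, which is the many-small, few-large pattern described above.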
