Binary music

Overall, there is a lot of information stored on our computers. Take a new machine, straight from the box, and install an operating system on it. That might take several GB of space. That’s quite a vast quantity of bits. Why not do something productive with them (apart, of course, from having them run your computer)?

Not wanting to let all these bits and bytes go to waste, just sitting there and most of the time doing nothing, I decided to try and produce music from them.

What separates music from the normal sounds of everyday life, the clanks and clatters of objects which we would not dare call “musical instruments”? One very important aspect is form and coherence. Music is never a random jumble of notes, durations, intervals, and frequencies. It has form, motifs, recurring elements, and rules which define what is allowed and what isn’t; otherwise, it’s just noise. For example, most classical sonatas follow the same structure: the themes are exposed in relatively similar ways, and the fundamental harmonic moves are much alike. Otherwise, they would not be called “sonatas”. Much, much more can be said on this topic, but this is not the place to do so; let us just conclude that music is not random noise, but follows certain laws, schemas, and patterns. These patterns are part of what distinguishes music from what we consider annoying or undesirable noise. I recommend contemplating the subject further the next time your favorite song is on the radio; in any case, I may write a deeper note on this in the future.

Coincidentally, the binary data stored on our disks isn’t random either. It too follows form and structure. For example, all of Windows’ DLLs and executables carry, right after a short DOS stub at the start of the file, a structure called the PE (Portable Executable) header. This header contains much of the information that the operating system needs to know about the binary code; for example, how to import its required DLLs, or how to map its various sections into memory pages. The PE header never looks like random data, but always carries the same general look about it, with only slight variations between different files. You can check this out yourself with a hex-editing tool.
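To make that fixed layout concrete, here is a minimal sketch of how one might locate the PE header by hand. It relies on two facts of the format: the file opens with a DOS header whose first two bytes are "MZ", and the 4-byte little-endian field at offset 0x3C of that header holds the file offset of the "PE\0\0" signature. The function name `find_pe_signature` and the hand-built sample bytes are my own illustration, not part of the script in this post.

```python
import struct

def find_pe_signature(data):
    """Return the file offset of the PE signature, or None if it is absent."""
    # A PE file begins with a DOS header whose first two bytes are "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The 4-byte little-endian field at offset 0x3C holds the offset
    # of the 4-byte signature "PE\0\0".
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    return pe_offset

# Hypothetical, hand-built bytes for illustration only (not a real executable).
sample = bytearray(0x90)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\0\0"
print(find_pe_signature(bytes(sample)))  # prints 128
```

On a real file you would pass in the result of `open(path, "rb").read()` instead of the sample.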

However, that is not the only part of our computers’ files that behaves in an ordered manner. The code itself, the assembly, also follows certain rules; it has structure. Generally, we do not write programs that invoke random methods; we tend to use the same recurring flow control structures (if, for, while…), methods, error checks, and so on. All this means that we have some order in our high-level code and, eventually, in our assembly. This is reinforced by the fact that the compiler itself, when generating opcodes from our high-level languages, tends to use the same tricks over and over. Furthermore, the opcodes themselves, as one can see in instruction set references, are not random, but have some sort of order about them.
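One rough way to check that such data is not uniformly random is to count how often each byte value occurs; in structured data a handful of values tends to dominate. The sketch below is only an illustration under that assumption. The helper name `most_common_bytes` and the sample bytes are made up for the example.

```python
from collections import Counter

def most_common_bytes(data, top=5):
    """Return the `top` most frequent byte values as (value, count) pairs."""
    # bytearray() yields integers 0..255 when iterated.
    return Counter(bytearray(data)).most_common(top)

# Made-up bytes standing in for a slice of a real binary.
sample = b"\x00" * 6 + b"\x8b\x45\x08\x8b\x4d\x0c\x8b"
print(most_common_bytes(sample, 2))  # prints [(0, 6), (139, 3)]
```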

So, all in all, we have relatively ordered and structured data sitting idly on our hard disks. Can any sort of music be created from it? I wrote a very simple Python script that reads the contents of a given file and uses the Windows API Beep function to generate a sound according to that data.

In this case, the script goes over every pair of bytes and plays a sound whose frequency depends linearly on the first byte of the pair and whose duration depends linearly on the second. However, it is great fun to try more complex mappings, using exponents, square roots, or any other combination of mathematical operators we like.
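The mapping can be sketched on its own, without making any sound. `linear_note` uses the same constants as the script's playSound function; `exponential_note` is a hypothetical alternative of my own that treats the first byte as a number of semitones above a base pitch, so that equal byte steps become equal musical intervals. The names `notes_from_bytes`, `base_freq`, and `span` are assumptions made for this example.

```python
def linear_note(freq_byte, dur_byte):
    """Linear mapping, using the same constants as playSound."""
    return 40 + 2 * freq_byte, 100 + dur_byte // 2

def exponential_note(freq_byte, dur_byte, base_freq=110.0, span=48):
    """Hypothetical alternative: the byte picks a semitone above base_freq."""
    semitones = freq_byte % span  # wrap into a four-octave range
    freq = int(round(base_freq * 2 ** (semitones / 12.0)))
    return freq, 100 + dur_byte // 2

def notes_from_bytes(data, count, mapping=linear_note):
    """Turn consecutive byte pairs into (frequency in Hz, duration in ms)."""
    data = bytearray(data)
    pairs = min(count, len(data) // 2)  # ignore a trailing odd byte
    return [mapping(data[2 * i], data[2 * i + 1]) for i in range(pairs)]

header = b"MZ\x90\x00"  # four bytes chosen for illustration
print(notes_from_bytes(header, 2))                    # [(194, 145), (328, 100)]
print(notes_from_bytes(header, 2, exponential_note))  # [(587, 145), (110, 100)]
```

Note that a pair of zero bytes maps to the lowest, shortest note (40 Hz for 100 ms under the linear mapping), so long runs of zeros come out as a steady low pulse.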

Attached here are three examples of the script’s output. The first 8 seconds of each are pretty much the same, but the rest is rather diverse and interesting to hear (you may need to turn your speakers up a bit). The three files are the first 300 notes of Notepad.exe, Starcraft.exe, and ntdll.dll:

Notepad.mp3
Starcraft.mp3
ntdll.mp3

It’s nice to hear what effect the numerous zeros have, as well as the well-defined structure of the PE header. Even with the simple linear relation, we can hear the structure of the binary data, and get something which pretty much sounds like music (it reminds me of Schönberg for some reason). I’m sure that if we were to use a better sound source than the internal PC speaker, and if we were to think of a better mathematical relation between the file’s content and the sounds produced, we could get some good music out of it. We might use the files to create several voices, for example, creating a “binary symphony”, or use several bytes of data at a time in order to create overtones. The options are endless, and eventually we get quality music without actually composing anything of our own. Thank you, software developers!
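As one possible step away from the PC speaker, the notes could be rendered as sine waves into a WAV file using only the Python standard library. This is a rough sketch rather than a finished synthesizer: the names `render_tone` and `write_wav`, the sample rate, and the volume are all arbitrary choices of mine, and it produces a single voice with no overtones.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050  # samples per second; an arbitrary choice

def render_tone(freq, dur_ms, volume=0.5):
    """Return 16-bit mono PCM samples of a sine tone, packed as bytes."""
    n = SAMPLE_RATE * dur_ms // 1000
    samples = [
        int(32767 * volume * math.sin(2 * math.pi * freq * t / SAMPLE_RATE))
        for t in range(n)
    ]
    return struct.pack("<%dh" % n, *samples)

def write_wav(path, notes):
    """Write (frequency in Hz, duration in ms) pairs to a mono WAV file."""
    out = wave.open(path, "wb")
    try:
        out.setnchannels(1)   # mono
        out.setsampwidth(2)   # 16-bit samples
        out.setframerate(SAMPLE_RATE)
        for freq, dur_ms in notes:
            out.writeframes(render_tone(freq, dur_ms))
    finally:
        out.close()
```

A call such as `write_wav("notes.wav", [(194, 145), (328, 100)])` would then produce a file playable in any audio player. Several voices could be obtained by summing the samples of tones rendered from different files before packing them.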

The code:

import sys
import win32api

DEFAULT_NUMBER_OF_SOUNDS = 300

def parseArgs(arguments):
    """
    Gets and parses the arguments the user gave.
    In case the user gave incorrect arguments, prints usage and exits.
    """
    USAGE = r"""
Usage:
 FileToRead [NumberOfNotesToPlay]
Example: Beep.py c:\windows\notepad.exe 300"""
    LEGAL_ARG_NUM = [1,2]

    argNum = len(arguments) - 1
    if argNum not in LEGAL_ARG_NUM:
        print USAGE
        sys.exit()

    fileNameToRead = arguments[1]
    try:
        numberOfSounds = int(arguments[2])
    except IndexError:
        numberOfSounds = None
    except ValueError:
        print "The number of notes to play should be an integer, \
but you gave '%s'" %arguments[2]
        sys.exit()

    return (fileNameToRead, numberOfSounds)

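def playFile(fileToRead, numberOfSounds = None):
    if numberOfSounds is None:
        numberOfSounds = DEFAULT_NUMBER_OF_SOUNDS
    # Read in binary mode: text mode on Windows translates line endings
    # and stops at the first Ctrl-Z (0x1A) byte.
    contents = open(fileToRead, "rb").read()

    try:
        for i in xrange(numberOfSounds):
            playSound(contents, i)
    except IndexError:
        # The file held fewer byte pairs than requested; stop quietly.
        pass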

def playSound(contents, index):
    freqNum = ord(contents[2*index])
    durNum = ord(contents[2*index + 1])
    freq = 40 + freqNum * 2
    dur = 100 + durNum / 2
    win32api.Beep(freq, dur)

def main(arguments):
    fileToRead, numberOfSounds = parseArgs(arguments)
    playFile(fileToRead, numberOfSounds)

if __name__ == "__main__":
    main(sys.argv)