The OTR Database Project

Cataloging the Collection

I've been collecting Old Time Radio shows in one way or another since 1970. Over the decades these shows ended up on a variety of media: reel-to-reel tape, audio cassette, VHS tape, vinyl, and digital formats of all sorts stored on hard drives, Compact Discs, DVDs, and even backup tape.

Since the early 1980s I've been trying to put together a database of the collection and even that has gone through changes:

  • index cards which show information and storage location
  • a custom database I wrote for the Apple ][ computer in 1982 or so
  • a spreadsheet built in Microsoft Works and later in Microsoft Excel
  • a database in dBase4
  • back to a spreadsheet, this time in OpenOffice Calc
  • the latest is a custom, web-based application using the MySQL database and Apache web servers, written in the PHp language

Yes, I know there are some programs out there for this type of thing but as a do-it-yourselfer, I wanted to "roll my own."

The web-based database has, as of this writing in March 2015, over 80,000 entries. Not all entries are OTR shows as it includes other OTR-related material such as images, text files, and some videos. The number of shows is around 60,000.

It has been, to say the least, a time-consumming project. Just gathering the media into one place was a huge task and then to enter everything into the database took, well, months.

I first had to copy everything that is in a digital format to an external hard drive. I used an external drive because it makes the collection portable. Most of the material was on CD or DVD meaning that each disc, over 600 of them, had to be inserted into the computer and copied. While this took a few weeks, I did make sure that each disc was copied into a directory named to match the disc number I had assigned. This would make finding the original file easier in the future.

The next job was to design the database tables, that is, what information did I want to save for each show? This is what I came up with:

  • show name
  • title
  • description
  • date
  • location id (this is a number that ties it back to the source disc number)
  • genre
  • quality (of the recording)
  • volume (the plan is to use this to separate pre-1963 shows from those that came after)
  • path (directory where the file is located)
  • filename
  • file extension
  • time (playing time if audio or video)
  • special project (collections of shows under a theme)

For the development of this project, I used XAMPP, a bundle of programs that run on my desktop computer and provide an Apache webserver and a MySQL database server. While I could utilize the ones I have access to online, I wanted to keep things local, plus it eliminates the "ftp" step when changing files and I can access all the shows locally and not have to upload them (many gigabytes) during testing.

I wrote a short program in PHP that created the database and built the tables I needed.

Next, I had to create an entry for each item in the database. While it would have been nice, not mention convenient, if a simply utility could be developed to read all the show data from the show filenames, there is not a consistent filenaming scheme that the OTR world adhers to. That means that every show has to dealt with individually to get its information into the database. I did this in several steps.

In the first step, I wrote a program that, recursively, went through the entire file structure of where the shows were stored. It retrieved the volume, path, filename, and file extension for each item, all 80,00-plus of them, and entered that data into the table in the appropriate fields of the database. This worked great and saved me lots of effort.

Since many of the programs were "collections," that is, all the same show in one folder with an consistent file naming format, I wrote software that would scan that collection and disaggregate the filename into show name, title, date, description, or any other data I could pull from it. This data was then, auto-magically, entered into the database for that show. This utility program would then be tweaked to suit each of the various filename configurations. In the end, I wrote about 900 variations of this utility.

Once in a while, or more oftern than that, I would encounter a directory that had a bundle of files each with a different naming convention. These had to be entered manually into the database table. I used the phpMyAdmin utility, part of XAMPP, to do this.

Once I had everything entered, I went back through the table and edited show names so they would be consistent. For example, over the time I worked on this I had ended up with enteries like this for the same show:

Adventures of Sherlock Holmes
The Adventures of Sherlock Holmes
Adventures of Sherlock Holmes, The
Adventures of Sherlock Holmes,The
Sherlock Holmes

These would be seen as different shows when doing search so I wrote a utility that renamed them to a common name, in this case:

Adventures of Sherlock Holmes, The

The next push was to have a consistent entry for "genre" for each show. I used the Dunning book, "On the Air," to decide what to assign to each of the shows. I wrote a utility that scoured the table for each show of a given title and the genre field would be updated.

I had created a field where I can enter the playing time for each show. I added this because it is sometimes handy to look for a show of a particular length or one between a range of lengths. I found a utility online that will examine a media file and return all types of information about it. I reworked the code to only return the playtime of a sound file (any format) and put this function inside a loop that looked at every file in the database and then entered its play time. Pretty slick and convenient.

After a bit more "clean up" I wrote the front end, the search queries, and the results and player pages.

The application allows full search capability by show name, titles, dates, file type, genre, etc. and can also be searched by wildcards. When using this on my local, wi-fi network, the search results can be played or viewed on demand, streaming to any device. A very useful feature allows me to build custom playlists that can streamed to a desktop computer or sent to a folder where the material can be burned to a CD or DVD.

Once all that data wass stored in a database, it's fun thinking up what I might do with it in addition to playing the shows. This graph shows the number of dated shows I have by year. I was curious as to whether there was a trend and the graph indicates that there is. I don't have an explanation but it does show that the peak year was 1949.