The Google Problem

Has anyone noticed Google becoming less user-friendly lately? It started with the heavy domination of blogs in search results, and worsened with the loss of numbered results recently. It worsened further when Google started failing to find results that you KNOW are there because you had found the pages searching Google for other terms. At the same time, Google’s image search changed from searching for images with the search term in the description or name to showing a subset of images from any web page which contains the search term, and the image results bar changed in structure to occupy more of the screen real estate. (NB: I detected these changes and imperfections because Google is still my first choice search engine, so take the criticism with a grain of salt; I’m not saying it’s hopeless, just worse than it was).

Sometime during or around these changes, Google’s ‘about us’ pages were redesigned or restructured to remove any capacity for feedback direct to the company – the best you can do is post to a number of Google Groups that are about Google, and which may or may not receive any attention from the company itself. [I’m told that the staff do monitor at least some of these fairly closely, but their website makes no guarantees in that regard.]

Google has begun to look like just another faceless corporation, keeping its customers at something greater than arm’s length. I guess that’s the bad news. The good news is that if you are searching for blog content, you have a high chance of finding something relevant; and if you are searching for a particular image, you have a fair chance of finding something interesting that has little or nothing to do with the original subject of your inquiry.

Not all the changes have been bad, either – the ability to view a subset of the images found by size can be very useful, though that had been around for a while – it’s simply expanded from three size categories to four.

I would have been far happier if you could turn search result numbering on or off in your preferences, and if you could include or exclude blog results with the click of a radio button, and if your image search could be restricted to a literal search (the way it used to be) by the click of another radio button. And if it still found everything!

The effect of these changes is that it is far harder for DMs – or the general public – to use Google as a resource to find what they are looking for. You have to know the operations of the search engine to a degree that was never previously the case. Trying to find what you want is more and more a question of wading through mountains of irrelevancy and bloat – with less to visibly differentiate one screenful of links from the next.

The RPG Problem

The same thing often happens with rule systems. When the core rules come out, it’s relatively easy to find most of the things you are looking for (though something is usually placed in a strange position somewhere!). The indexes are usually less than helpful, but that’s not surprising – I know from experience that an index takes at least as long to generate as the text being indexed did to write, and if I have to choose between an author spending time compiling a perfect index and the author polishing content until the last possible second, I’ll pick the second choice every time.

But as new expansions and supplements come out, both official and third-party, it gets harder and harder to know where to find what you want. I have six D&D supplements on planes, planar travel, planar gates, etc – I have to go through them all each time I’m looking for something in particular within the subject. And since they are by at least three different publishers, their indexing schemas are all different, as well.

By the time you factor in hundreds of mini-supplements downloaded from various websites, and saved web pages and extracts from web pages, and my own writings on any given subject, and the content of webzines like Roleplaying Tips, and my various magazine collections of relevance, there may as well not be an index. The best you can hope for when searching by keyword – the desktop equivalent of an internet search – is that you’ll find something vaguely related to the specific subject you are looking for.

Just like Google.

I can see no reason why there can’t be a solution to the rules indexing problem, though – or in fact to compiling a complete index to all printed works in anyone’s collection. I can even envisage the design of such a solution.

The RPG Solution?

It starts with each publisher compiling the indexes to their various rules supplements into a single, downloadable database, and making that download free from their website. This is not as difficult as it sounds: most indexes are generated by “tagging” key words and phrases as ‘index entries’ within the document while it is being written; these then automatically generate the page number that the reference appears on in the index, updating it when content is moved or rearranged. Adding an option to export the index – or coming up with an additional piece of software to extract them – would not be a major headache. These would need to be in some industry-wide fixed format.
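To make the idea of a fixed format concrete, here is a minimal Python sketch of what an exported index might look like as plain CSV. No such industry format exists as far as this article knows; the column names (term, title, publisher, page) and the function names export_index and import_index are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical column layout for a shared index format (an assumption,
# not an existing standard).
FIELDS = ["term", "title", "publisher", "page"]


def export_index(entries):
    """Serialise a list of index-entry dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def import_index(text):
    """Read CSV text back into a list of dicts, with page as an integer."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["page"] = int(row["page"])
    return rows
```

A flat text format like this is easy for any publisher's tools to write and for any gamer's tools to read, which is the main thing the scheme needs.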

That’s so that a dedicated piece of software can read in all the index entries for all the supplements that the user has indicated in the software’s settings that they own, compile them all into a single BIG index, sort it alphabetically, and generate a virtual index – one that can be printed out if that’s what’s desired, or saved as an ordinary document file. It would give the title of the source document and the page number.
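The merging step could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a finished tool: it assumes each supplement's index has already been loaded as a list of (term, page) pairs keyed by supplement title, and the names merge_indexes and format_index are hypothetical.

```python
from collections import defaultdict


def merge_indexes(publisher_indexes, owned_titles):
    """Combine per-supplement indexes into one alphabetical index.

    publisher_indexes: dict mapping supplement title -> list of (term, page).
    owned_titles: set of titles the user has marked as owned.
    """
    combined = defaultdict(list)
    for title, entries in publisher_indexes.items():
        if title not in owned_titles:
            continue  # skip supplements the user does not own
        for term, page in entries:
            # Lower-case the term so "Gate" and "gate" share one entry.
            combined[term.lower()].append((title, page))
    # Terms sorted alphabetically; references sorted by title, then page.
    return [(term, sorted(refs)) for term, refs in sorted(combined.items())]


def format_index(merged):
    """Render the merged index as printable text, one term per line."""
    lines = []
    for term, refs in merged:
        where = "; ".join(f"{title} p.{page}" for title, page in refs)
        lines.append(f"{term}: {where}")
    return "\n".join(lines)
```

The text returned by format_index can be printed or saved as an ordinary document, with each entry giving the source title and the page number.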

A second piece of software could be used to generate index entries for the thousands of files on the computer that have been downloaded from places like RPGNow. The software only has to ignore certain common words like “a” and “the”, and to have a list of other words that need to be associated with another word to form a complete term – so that “silver” is not an index entry, but the software adds the next word to get “silver shield”, “silver bullet”, and so on. If the next word is one of the first group of common words, then “silver” stands alone as a meaningful index entry. Search engines – like Google – have been able to do this for ten years now. The result is an index of all the content on the user’s hard disk in the same format as the official indexes provided by the game publishers, and which can be read in by the first piece of software just like any other index.
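The word-pairing rule described above can be sketched in a few lines of Python. The two word lists here are tiny placeholders chosen for illustration, and extract_terms is a hypothetical name; a real tool would need much larger lists and would probably also stop pairing at sentence boundaries, which this sketch does not do.

```python
import re

# Common words that never become index entries (placeholder list).
STOP_WORDS = {"a", "an", "the", "of", "and", "is", "in", "to"}
# Words that should be joined to the following word (placeholder list).
MODIFIERS = {"silver", "planar"}


def extract_terms(text):
    """Return candidate index terms from a piece of text, in order."""
    words = re.findall(r"[a-z']+", text.lower())
    terms = []
    i = 0
    while i < len(words):
        word = words[i]
        if word in STOP_WORDS:
            i += 1
            continue
        next_word = words[i + 1] if i + 1 < len(words) else None
        if word in MODIFIERS and next_word and next_word not in STOP_WORDS:
            # "silver" + "shield" becomes the single term "silver shield".
            terms.append(f"{word} {next_word}")
            i += 2
            continue
        # Either an ordinary word, or a modifier followed by a common
        # word (or nothing), in which case it stands alone.
        terms.append(word)
        i += 1
    return terms
```

Running this over each file and recording the file name alongside each term would give entries in the same shape as a publisher-supplied index.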

Why might the game companies do this? Perhaps because it can be set up to generate sales. If you can compile an index of ALL the rpg supplements out there, then you can query that index, telling the software to ignore the supplements you already own – and quickly discover which volumes from which publishers you should add to your collection to get information on “Fey weapons” or “The Dreamtime” or “Moon Rockets” or whatever it is that you are looking for. It’s a new service, and a new form of advertising at the same time.
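As a rough Python sketch of that kind of query, assume the master index is a list of (term, title, page) entries covering every supplement, owned or not. The function name find_unowned_sources and the simple substring match are assumptions for illustration only.

```python
def find_unowned_sources(master_index, owned_titles, query):
    """List supplements the user does not own that cover the query.

    master_index: iterable of (term, title, page) entries.
    owned_titles: set of titles to ignore.
    Returns a dict mapping title -> sorted list of matching page numbers.
    """
    needle = query.lower()
    hits = {}
    for term, title, page in master_index:
        if title in owned_titles:
            continue  # already on the shelf, so not a purchase suggestion
        if needle in term.lower():
            hits.setdefault(title, []).append(page)
    return {title: sorted(pages) for title, pages in sorted(hits.items())}
```

Asking for "fey weapons" would then return only the titles worth adding to the collection, along with the pages where the subject appears.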

This software would not be all that difficult to create. Rudimentary database and programming skills would be enough. If my understanding of it is correct, there’s even a piece of software already in existence – Tablesmith – that could be used to perform most of these tasks, given the input databases.

Will it ever happen? It’s not out of the question – but I wouldn’t hold my breath. Maybe some gamer could write it as freeware…. Wouldn’t it be great?

The Correlation Gap

I once read that human knowledge is expanding at ten times the rate at which information can be compiled and correlated and indexed, and that the sum total of human knowledge is still doubling every five years, something that it’s been doing since the 1990s (that’s the amount of information that there is for anyone TO know, not the amount that they DO know). I’ve also read that the internet is expanding at roughly twice the speed that search engines like Google can find and index the pages – and that was about ten years ago, before the whole Blogging phenomenon exploded, and before Myspace and Youtube. To solve this Correlation Gap, we need new and better tools for associating multiple sources of information with their content and relating multiple sources of information to each other, generating concordances as we need them.

In the meantime, it’s just going to get harder to find the information you need, as it is perpetually drowned out by an increasing overhead of semi-related and claims-to-be-relevant information. It used to be that the hardest task was in filtering out results that were unreliable, but that’s no longer the case. It’s in trying to cut out the irrelevant that Google has come unstuck; it’s a difficult problem with no easy solutions, but they’ve thrown the baby out with the bathwater.

What good is information if you can’t find it?

