Google Groans: Misplacing the Rules

Written by Mike Bourke

Filed in Mike, Tools & Techniques on Jan.22, 2009

The Google Problem

Has anyone noticed Google becoming less user-friendly lately? It started with the heavy domination of blogs in search results, and worsened with the loss of numbered results recently. It worsened further when Google started failing to find results that you KNOW are there because you had found the pages searching Google for other terms. At the same time, Google’s image search changed from searching for images with the search term in the description or name to showing a subset of images from any web page which contains the search term, and the image results bar changed in structure to occupy more of the screen real estate. (NB: I detected these changes and imperfections because Google is still my first choice search engine, so take the criticism with a grain of salt; I’m not saying it’s hopeless, just worse than it was).

Sometime during or around these changes, Google’s ‘about us’ pages were redesigned or restructured to remove any capacity for feedback direct to the company – the best you can do is post to one of a number of Google Groups that are about Google, and which may or may not receive any attention from the company itself. [I’m told that the staff do monitor at least some of these fairly closely, but their website makes no guarantees in that regard].

Google has begun to look like just another faceless corporation, keeping its customers at something greater than arm’s length. I guess that’s the bad news. The good news is that if you are searching for blog content, you have a high chance of finding something relevant; and if you are searching for a particular image, you have a fair chance of finding something interesting that has little or nothing to do with the original subject of your inquiry.

Not all the changes have been bad, either – the ability to view a subset of the images found by size can be very useful, though that had been around for a while – it’s simply expanded from three size categories to four.

I would have been far happier if you could turn search result numbering on or off in your preferences, and if you could include or exclude blog results with the click of a radio button, and if your image search could be restricted to a literal search (the way it used to be) by the click of another radio button. And if it still found everything!

The effect of these changes is that it is far harder for DMs – or the general public – to use Google as a resource to find what you are looking for. You have to know the operations of the search engine to a degree that was never previously the case. Trying to find what you want is more and more a question of wading through mountains of irrelevancy and bloat – with less to visibly differentiate one screenful of links from the next.

The RPG Problem

The same thing often happens with rule systems. When the core rules come out, it’s relatively easy to find most of the things you are looking for (though something is usually placed in a strange position somewhere!) The indexes are usually less than helpful, but that’s not surprising – I know from experience that an index takes at least as long to generate as the text being indexed did to write, and if I have to choose between an author spending time compiling a perfect index and the author polishing content until the last possible second, I’ll pick the second choice every time.

But as new expansions and supplements come out, both official and third-party, it gets harder and harder to know where to find what you want. I have six D&D supplements on planes, planar travel, planar gates, etc – I have to go through them all each time I’m looking for something in particular within the subject. And since they are by at least three different publishers, their indexing schemas are all different, as well.

By the time you factor in hundreds of mini-supplements downloaded from various websites, and saved web pages and extracts from web pages, and my own writings on any given subject, and the content of webzines like Roleplaying Tips, and my various magazine collections of relevance, there may as well not be an index. The best you can hope for when searching by keyword – the desktop equivalent of an internet search – is that you’ll find something vaguely related to the specific subject you are looking for.

Just like Google.

I can see no reason why there can’t be a solution to the rules indexing problem, though – or in fact to compiling a complete index to all printed works in anyone’s collection. I can even envisage the design of such a solution.

The RPG Solution?

It starts with each publisher compiling the indexes to their various rules supplements into a single, downloadable database, and making that download free from their website. This is not as difficult as it sounds: most indexes are generated by “tagging” key words and phrases as ‘index entries’ within the document while it is being written; these then automatically generate the page number that the reference appears on in the index, updating it when content is moved or rearranged. Adding an option to export the index – or coming up with an additional piece of software to extract them – would not be a major headache. These would need to be in some industry-wide fixed format.
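To make the idea of a fixed format concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: no such industry format exists, the three column names (term, source, page) are invented, and the book title is made up. It simply shows that writing tagged entries out to a plain CSV file, and reading them back in, takes very little code.

```python
import csv
import io

# Hypothetical shared format: one CSV row per index entry.
# The column names are an assumption, not an existing standard.
FIELDS = ["term", "source", "page"]


def export_index(entries, source_title):
    """Write (term, page) pairs tagged in one book out as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for term, page in entries:
        writer.writerow({"term": term, "source": source_title, "page": page})
    return buffer.getvalue()


def import_index(text):
    """Read the CSV text back into a list of dicts, with pages as integers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append(
            {"term": row["term"], "source": row["source"], "page": int(row["page"])}
        )
    return rows


# Example with a made-up supplement title.
exported = export_index(
    [("Astral Plane", 12), ("Gate, planar", 47)], "Example Planes Book"
)
print(exported)
```

A term that contains a comma, such as "Gate, planar", is quoted automatically by the csv module, so it survives the round trip intact.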

That’s so that a dedicated piece of software can read in all the index entries for all the supplements that the user has indicated in the software’s settings that they own, compile them all into a single BIG index, sort it alphabetically, and generate a virtual index – one that can be printed out if that’s what’s desired, or saved as an ordinary document file. It would give the title of the source document and the page number.
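The merging step could be sketched roughly as follows. This is an illustration only: it assumes each index entry has already been loaded as a dict with hypothetical "term", "source" and "page" keys, and that the user's collection is a simple set of titles. The function names are invented for this example.

```python
from collections import defaultdict


def merge_indexes(entries, owned_titles):
    """Combine entries from owned supplements into one alphabetical index."""
    merged = defaultdict(set)
    for entry in entries:
        if entry["source"] in owned_titles:
            merged[entry["term"]].add((entry["source"], entry["page"]))
    # Sort terms ignoring case, then each term's references by title and page.
    return [(term, sorted(merged[term])) for term in sorted(merged, key=str.casefold)]


def render_index(merged):
    """Format the merged index as printable text, one term per line."""
    lines = []
    for term, refs in merged:
        locations = "; ".join(f"{source} p.{page}" for source, page in refs)
        lines.append(f"{term}: {locations}")
    return "\n".join(lines)


# Made-up entries from three made-up books; the user owns only A and B.
entries = [
    {"term": "portal", "source": "Book B", "page": 8},
    {"term": "Astral Plane", "source": "Book A", "page": 12},
    {"term": "portal", "source": "Book A", "page": 30},
    {"term": "Moon Rockets", "source": "Book C", "page": 5},
]
print(render_index(merge_indexes(entries, {"Book A", "Book B"})))
```

Note that this files entries together only when the terms match exactly; "portal" and "Portals" would still appear as separate lines.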

A second piece of software could be used to generate index entries for the thousands of files on the computer that have been downloaded from places like RPGNow. The software only has to ignore certain common words like “a” and “the”, and to have a list of other words that need to be associated with another word to form a complete term – so that “silver” is not an index entry, but the software adds the next word to get “silver shield”, “silver bullet”, and so on. If the next word is one of the first group of common words, then “silver” stands alone as a meaningful index entry. Search engines – like Google – have been able to do this for ten years now. The result is an index of all the content on the user’s hard disk in the same format as the official indexes provided by the Game Publishers, and which can be read in by the first piece of software just like any other index.
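The word-pairing rule described above might look something like this in Python. It is a deliberately crude sketch: the two word lists are tiny placeholders chosen for the example, the function name is invented, and the code makes no attempt to respect sentence boundaries or to work out what a term means.

```python
import re

# Placeholder lists, far shorter than a real tool would need.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to"}
MODIFIERS = {"silver"}  # words that should be joined to the word that follows


def extract_terms(text):
    """Return candidate index terms from text, in order of appearance."""
    words = re.findall(r"[a-z']+", text.lower())
    terms = []
    i = 0
    while i < len(words):
        word = words[i]
        if word in STOP_WORDS:
            i += 1
            continue
        if word in MODIFIERS:
            following = words[i + 1] if i + 1 < len(words) else None
            if following is not None and following not in STOP_WORDS:
                # "silver" + "shield" becomes the single entry "silver shield".
                terms.append(f"{word} {following}")
                i += 2
                continue
        # Either an ordinary word, or a modifier followed by a common word
        # (or by nothing), which then stands alone as an entry.
        terms.append(word)
        i += 1
    return terms


print(extract_terms("The silver shield is a relic"))
print(extract_terms("Silver is the metal"))
```

A real tool would also need to record which file and page each term came from, so that the result could be written out in the same format as the publishers' indexes.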

Why might the game companies do this? Perhaps because it can be set up to generate sales. If you can compile an index of ALL the RPG supplements out there, then you can query that index, telling the software to ignore the supplements you already own – and quickly discover which volumes from which publishers you should add to your collection to get information on “Fey weapons” or “The Dreamtime” or “Moon Rockets” or whatever it is that you are looking for. It’s a new service, and a new form of advertising at the same time.
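A "what should I buy?" query of this kind could be sketched as below. The entry layout (dicts with "term", "source" and "page" keys), the function name and all the titles are assumptions made up for the example; the matching is a plain case-insensitive substring test, nothing smarter.

```python
def suggest_purchases(entries, owned_titles, query):
    """Return unowned titles whose index mentions the query, with page numbers."""
    needle = query.casefold()
    suggestions = {}
    for entry in entries:
        if entry["source"] in owned_titles:
            continue  # the user already has this book
        if needle in entry["term"].casefold():
            suggestions.setdefault(entry["source"], []).append(entry["page"])
    return {title: sorted(pages) for title, pages in sorted(suggestions.items())}


# Made-up master index covering three made-up books.
master_index = [
    {"term": "Fey weapons", "source": "Book A", "page": 40},
    {"term": "Fey weapons, cold iron", "source": "Book C", "page": 91},
    {"term": "Fey weapons", "source": "Book C", "page": 17},
    {"term": "Moon Rockets", "source": "Book B", "page": 5},
]
print(suggest_purchases(master_index, {"Book A"}, "fey weapons"))
```

Because the query ignores anything already owned, the result doubles as a shopping list, which is where the advertising value would come from.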

This software would not be all that difficult to create. Rudimentary database and programming skills would be enough. If my understanding of it is correct, there’s even a piece of software already in existence – Tablesmith – that could be used to perform most of these tasks, given the input databases.

Will it ever happen? It’s not out of the question – but I wouldn’t hold my breath. Maybe some gamer could write it as freeware…. Wouldn’t it be great?

The Correlation Gap

I once read that human knowledge is expanding at ten times the rate at which information can be compiled and correlated and indexed, and that the sum total of human knowledge is still doubling every five years, something that it’s been doing since the 1990s (that’s the amount of information that there is for anyone TO know, not the amount that they DO know). I’ve also read that the internet is expanding at roughly twice the speed that search engines like Google can find and index the pages – and that was about ten years ago, before the whole Blogging phenomenon exploded, and before MySpace and YouTube. To solve this Correlation Gap, we need new and better tools for associating multiple sources of information with their content and relating multiple sources of information to each other, generating concordances as we need them.

In the meantime, it’s just going to get harder to find the information you need, as it is perpetually drowned out by an increasing overhead of semi-related and claims-to-be-relevant information. It used to be that the hardest task was in filtering out results that were unreliable, but that’s no longer the case. It’s in trying to cut out the irrelevant that Google has come unstuck; it’s a difficult problem with no easy solutions, but they’ve thrown the baby out with the bathwater.

What good is information if you can’t find it?

Tags: Game-Administration, Google, Rules-Bloat, Tools & Techniques

Comments (10)

10 Responses to “Google Groans: Misplacing the Rules”

C Rader Says:
January 22nd, 2009 at 5:24 pm
Speaking as a professional information correlation and retrieval artist, aka a librarian, I totally agree with your last statement; in fact, I make my living on it. What you describe is wonderful in its conception, and the reason that it has not been done is the same reason that many assets have not been digitized: no one is getting paid to do it.

Indexing as you describe it is mechanical indexing and generally results in poor indexes as it is looking for word occurrence rather than word meaning. To take your example, “The Dreamtime”, meaning the shared mythological space of the Aborigines of Australia, or Dream’s Domain, or that time during sleep when dreaming happens. Which one? Not to mention a mechanical index picking up, but not filing together, “Dreamtime, The”, “the dreamtime”, “The Dream Time”, etc. It takes human intervention to know a material and assign the correct semantic meaning to the term in use. Which is costly and that’s why good indexes are so valuable. Add in the fact of all the material printed before electronic replicas, and no, it’s not so easy to scan in pages and OCR them, not to mention the intellectual property rights that would have to be negotiated, especially with companies with competitive products…

Even tagging as you describe it is a nice idea, but again, it takes time and adds to the cost. Indexing is a well established profession and is more difficult than you would think to produce a ‘useful’ guide to a work. On the scale you describe, well, it’s just not that easy.

In fact, the preponderance of electronic indexing has resulted in a glut of bad indexes to books and other materials, meaning it is adding to the volume of material you have to wade through.

There are other concepts that I have not gotten to, but rest assured that what you talk about has been done on non-rpg sources and it is expensive as all get out, Lexis-Nexis being one that comes right to mind as well as DIALOG, and even then we have to be constantly on the watch for term creep in order to keep things manageable.

For a detailed look at the scope of indexing, uses and construction, find “Indexing Books” by Nancy C. Mulvany (ISBN 9780226552767) at your library, and then give it a try on one of your non-indexed supplements.
Mike Bourke Says:
January 22nd, 2009 at 6:24 pm
Informative and educational comments, C, thank you. I don’t underestimate the difficulties involved, but would contend that on a restricted and narrow scope subject like RPGs, a ‘mechanical’ index would be better than nothing, or what we have at the moment. Nevertheless, nothing is ever as easy as we would like it to be!

I realised that it was possible that similar projects had been undertaken in the past, and was interested in your citing of specific attempts; what I considered more original thought was something you point out in your second paragraph, ie that it isn’t being done because no-one is being paid to do it. The potential as a marketing tool which could defray the costs is the original thinking, as I’ve never heard such a thing suggested in the past.

I also think that the ‘tagging’ costs would be less than those involved in producing a quality index of the type that a professional librarian would create, simply because most of the existing indexes would already be produced by tagging – that’s how every word processor I’ve ever heard of that’s capable of doing so, does it, and it’s also how Windows help pages are compiled. So the extraction of a standalone index table from source documents would be relatively quick and easy. The index itself would be no more useful than the indexes are already in the existing products, but at least it would make it possible to compile them all into one place – meaning that you only had one document to consult, instead of having to check every index page separately.

Again, thanks for contributing to the conversation. Hopefully, we can inspire some genius out there to solve the problem!
Mike Bourke Says:
January 22nd, 2009 at 6:29 pm
And one further thought: the companies most likely to fund the development of such software would have to be Google (obviously) or Amazon (not so obviously until the name is mentioned). If it’s at all feasible, I’m sure that SOMEONE will take advantage of the marketing opportunities – and potential sales – that result. Perhaps as a fee-based subscription service?
Trask Says:
January 23rd, 2009 at 4:13 am
Can’t help you with the rules problems, but I created a solution to useless Google results

http://www.rpgseek.com

Trask, The Last Tyromancer

Trask’s last blog post..Interview: Shane Ivey of Arc Dream Publishing
Mike Bourke Says:
January 23rd, 2009 at 10:46 am
Interesting idea Trask, but it doesn’t seem to work on my PC, possibly because I’m running an out-of-date OS. Nor does it seem to help me get information on, say, London’s sewer systems. A search for the term Elementals didn’t find any results either. Still, it’s a resource that I’ll keep my eye on, as it’s likely to find material that nothing else will. Thanks for sharing it with us all!
Joshua Says:
January 23rd, 2009 at 4:26 pm
Or you could consider running a system that doesn’t have so many rules and supplements that you wish you had Google to index it. I’m just saying.
Mike Bourke Says:
January 24th, 2009 at 12:39 am
Joshua, every rules system that I’ve ever come across either dies or expands with multiple supplements. Even then, I’m picky about what supplements I let into the game, cherrypicking those that fit the campaign, those that need a little tinkering to fit, and rejecting those that either don’t fit at all or would need a major rewrite to be compatible. Sure, running a system with fewer supplements would make it easier to find what I was looking for (and be a lot easier on my back); it would also inherently limit the scope of the campaign. But that’s a subject for another blog one day.
Johnn Says:
January 27th, 2009 at 5:02 am
“Google has begun to look like just another faceless corporation, keeping its customers at something greater than arm’s length.”

In my recent experience, they have set up a number of blogs where readers and employees talk. The SEO blog, for example, seems to be a great place for SEO professionals to ask Matt Cutts questions and get responses. I was also a member of an internal AdSense Group with publishers and Google AdSense employees. This doesn’t invalidate your point, but just an FYI.

I wish Amazon or some other store would offer a complete database like you suggest, along with community comments and ratings per product. In addition, I wish there was a way to just barcode scan in your own print products to do fast inventories.

I also think we’re in the early days of a business model revolution. I suppose that’s just repeating what the pundits are saying. But, if we actually are moving into an information age, then information is the valued thing. In the future, I don’t see business succeeding by siloing information into dead trees. The information economy will require complete access to all information, which means proper search. This is already happening with Google’s semantic search algorithms, for example.
Yax Says:
January 28th, 2009 at 6:12 pm
One thing I like about Google is that it values older content. So if you’re going to get blog results they are more likely to be from an established source.

One thing I don’t like about Google is that it values older content. Things change so fast. Some older content is just out-of-date. Everyone finds it but nobody wants to read it.

Yax’s last blog post..Retro Traps: 5 Evil Scenarios from the Bad Old Days
Masteh Casteh Says:
February 3rd, 2009 at 2:47 am
Yax, you speak the truth.

The only experience I have had with index systems is with the ones in the back of the books and the ones that are in the public library systems. None of them fit my liking; both are long, arduous and many times will not even get you what you want.

It would be nice to have a system that is based on a program that uses everything about the key word to bring you info… it should then give you a small amount of info and a summary on the resource, allowing you to disregard irrelevant sources.