Topical
Preliminary draft, 17oct2004 +chris+
ChangeLog:
- 17oct2004 +chris+: Initial revision
Introduction
Topical is a lightweight, straightforward way to mark occurrences of topics in documents.
This document will mainly discuss and describe the Topical core. Information about how to add Topical meta-data to documents can be found in the appendices.
The Basics of Topical
Topical is based on a very simple, text-based syntax that resembles Unix path syntax. Each Topical description follows this syntax:
description ::= topic [";" topic]*
In other words, a description is a semicolon-separated list of topics.
A topic is a list of names joined by separators, optionally followed by a parenthesized label:
topic ::= separator? name [separator name]* label?
separator ::= "/" | "//"
name ::= ["a".."z" | "A".."Z" | "0".."9" | "-" | "_"]+
label ::= " "+ "(" text* ")"
text ::= " ".."'" | "*".."~"
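The grammar above can be transcribed into regular expressions fairly directly. The following is a minimal sketch, not a reference implementation: the names `parse_topic` and `parse_description` and the returned tuple shape are assumptions made for illustration, and the optional parenthesized text is called `label` here. Because the parenthesized text may itself contain a semicolon, the sketch only splits on semicolons that are outside parentheses.

```python
import re

NAME = r"[A-Za-z0-9_-]+"
TOPIC_RE = re.compile(
    rf"(?P<path>(?://?)?{NAME}(?://?{NAME})*)"  # separator? name [separator name]*
    r"(?: +\((?P<label>[ -'*-~]*)\))?"           # optional " (text)", no nested parens
)


def parse_topic(text):
    """Return (steps, label); steps is a list of (separator, name) pairs.

    The first separator is "" when the topic has no leading separator.
    """
    m = TOPIC_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a valid topic: {text!r}")
    # A capturing group makes re.split keep the separators in the result.
    parts = re.split(r"(//?)", m.group("path"))
    if parts[0] == "":
        parts = parts[1:]       # leading separator present
    else:
        parts = [""] + parts    # no leading separator
    steps = list(zip(parts[0::2], parts[1::2]))
    return steps, m.group("label")


def parse_description(text):
    """Split a description on ";" (outside parentheses) and parse each topic."""
    return [parse_topic(t) for t in re.split(r";(?![^()]*\))", text)]
```

The sketch is strict about whitespace: it accepts no spaces around the semicolon, because the grammar does not mention any.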
Valid topics could be, for example:
/programming//ruby
/programming/languages/scripting/ruby (The Ruby programming language)
object-oriented/ruby
ruby
/things-to-do//diving (Diving)
These topics have the following meanings:
/programming//ruby: The marked data is about all ruby subtopics
of the topic programming.
/programming/languages/scripting/ruby (The Ruby programming
language): The marked data is about the ruby subtopic (with the
description "The Ruby programming language") of the subtopic
scripting of the subtopic languages of the topic programming.
object-oriented/ruby: The marked data is about the subtopic ruby
of any (sub)topic object-oriented.
ruby: The marked data is about all (sub)topics named ruby.
/things-to-do//diving (Diving): The marked data is about any subtopic
diving of the topic things-to-do, and labeled "Diving".
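The meanings listed above can be read as pattern matching against concrete topic paths. The sketch below translates a topic (without its parenthesized text) into a regular expression. It rests on assumptions that the text does not state outright: a leading "/" anchors the topic at the root, "/" means a direct subtopic, "//" means a subtopic at any depth (including a direct one), and a topic without a leading separator may start at any depth. The function names `topic_to_regex` and `matches` are hypothetical.

```python
import re


def topic_to_regex(topic):
    """Translate a topic such as "/programming//ruby" into a regex over paths."""
    parts = re.split(r"(//?)", topic)
    if parts[0] == "":
        parts = parts[1:]         # anchored by a leading separator
    else:
        parts = ["//"] + parts    # assumption: unanchored topics match at any depth
    pattern = ""
    for sep, name in zip(parts[0::2], parts[1::2]):
        # "/" is a direct child; "//" allows any number of levels in between.
        step = "/" if sep == "/" else "/(?:[^/]+/)*"
        pattern += step + re.escape(name)
    return re.compile(pattern)


def matches(topic, path):
    """True if the concrete path (e.g. "/a/b/c") is selected by the topic."""
    return topic_to_regex(topic).fullmatch(path) is not None
```

Under these assumptions, `matches("/programming//ruby", "/programming/languages/scripting/ruby")` holds, while `matches("object-oriented/ruby", "/programming/languages/scripting/ruby")` does not.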
In Topical, there is no need for a central specification of topics; all descriptions are merged into a topic tree (see section "Merging"). However, it is recommended that implementations support description files.
Description files
Description files enable users to predefine a certain taxonomy. A description file is structured like this:
description-file ::= [topic "\n"]*
In other words, a description file contains one topic per line.
The topics in a description file do not mark any occurrences of topics; they only declare that certain topics exist. For example, to create a taxonomy of programming languages, you could use a description file like this:
/programming (Programming)
/programming/languages (Programming languages)
/programming/languages/scripting (Scripting languages)
/programming/languages/compiled (Compiled languages)
/programming/languages/object-oriented (OO languages)
/programming/languages/scripting/ruby (The Ruby programming language)
/programming/languages/object-oriented/ruby (The Ruby programming language)
/programming/languages/scripting/perl (The Perl programming language)
/programming/languages/compiled/c (The C programming language)
Note that to a Topical implementation,
/programming/languages/scripting/ruby and
/programming/languages/object-oriented/ruby are not the same topic!
However, /programming/languages//ruby can (and should) be used to
specify that the marked data is about both topics.
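Reading a description file amounts to parsing one topic per line. The sketch below is illustrative only: the function name `parse_description_file` and the dictionary result are assumptions, blank lines are skipped as a convenience, and the parser accepts only topics written as full paths with single "/" separators, as in the example file.

```python
import re

# A full path such as "/a/b/c", optionally followed by " (some text)".
LINE_RE = re.compile(
    r"(?P<path>(?:/[A-Za-z0-9_-]+)+)(?: +\((?P<label>[^()]*)\))?"
)


def parse_description_file(text):
    """Return {path: label or None} for a file with one topic per line."""
    declared = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        m = LINE_RE.fullmatch(line.strip())
        if m is None:
            raise ValueError(f"line {number}: not a full topic path: {line!r}")
        declared[m.group("path")] = m.group("label")
    return declared
```

Because whole paths are used as keys, the two ruby topics from the example stay separate entries even though they carry the same parenthesized text.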
Merging
Merging is the process of taking several topics and building a topic tree from them. Merging all topics of the description file above yields this topic tree:
programming (Programming)
  languages (Programming languages)
    compiled (Compiled languages)
      c (The C programming language)
    object-oriented (OO languages)
      ruby (The Ruby programming language)
    scripting (Scripting languages)
      ruby (The Ruby programming language)
      perl (The Perl programming language)
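For topics given as full paths, merging can be sketched as inserting each path into a nested dictionary. This is a minimal illustration under assumptions: the names `merge_exact` and `render` and the node layout are hypothetical, and siblings are sorted alphabetically when rendering, since the text does not define an order (so sibling order may differ from the tree shown above).

```python
def merge_exact(declared):
    """Build a nested topic tree from {path: label or None}."""
    root = {"label": None, "children": {}}
    for path, label in declared.items():
        node = root
        for name in path.strip("/").split("/"):
            # Create missing intermediate topics on the way down.
            node = node["children"].setdefault(
                name, {"label": None, "children": {}}
            )
        if label is not None:
            node["label"] = label
    return root


def render(node, depth=0):
    """Return the tree as a list of indented lines, two spaces per level."""
    lines = []
    for name in sorted(node["children"]):
        child = node["children"][name]
        suffix = f" ({child['label']})" if child["label"] else ""
        lines.append("  " * depth + name + suffix)
        lines.extend(render(child, depth + 1))
    return lines
```

Calling `print("\n".join(render(merge_exact(declared))))` prints one topic per line, indented by depth.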
Merging gets harder if we add a few topics. Suppose the book "Programming Ruby" is marked with these topics:
programming//ruby
/book//ruby (Ruby books)
/book/publisher//addison-wesley
Now, we have this topic tree:
programming (Programming)
  languages (Programming languages)
    compiled (Compiled languages)
      c (The C programming language)
    object-oriented (OO languages)
      ruby (The Ruby programming language)
        * Programming Ruby
    scripting (Scripting languages)
      ruby (The Ruby programming language)
        * Programming Ruby
      perl (The Perl programming language)
book
  ruby
    * Programming Ruby
  publisher
    addison-wesley
      * Programming Ruby
The tree would look quite different if the description file additionally contained these topics:
/book (Books)
/book/programming (Programming Books)
/book/programming/ruby (Ruby books)
/book/publisher (Book Publishers)
/programming/languages/favorite (Favorite programming languages)
/programming/languages/favorite/ruby (The Ruby programming language)
The topic tree would now be:
programming (Programming)
  languages (Programming languages)
    compiled (Compiled languages)
      c (The C programming language)
    favorite (Favorite programming languages)
      ruby (The Ruby programming language)
        * Programming Ruby
    object-oriented (OO languages)
      ruby (The Ruby programming language)
        * Programming Ruby
    scripting (Scripting languages)
      ruby (The Ruby programming language)
        * Programming Ruby
      perl (The Perl programming language)
book (Books)
  programming (Programming Books)
    ruby (Ruby books)
      * Programming Ruby
  publisher (Book Publishers)
    addison-wesley
      * Programming Ruby
[FIXME: describe (and implement ;-)) the exact merging algorithm: first add all exact topics, then insert the inexact ones (sorted by?)]
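Since the exact algorithm is still open, the following is only one possible reading of the note above, not the defined behavior. It first adds all declared full paths, then attaches each document to every existing node that its topic matches. If nothing matches, it creates a path by collapsing "//" to "/". That fallback, the matching rules ("//" spans any depth, a topic without a leading separator may start anywhere), and the names `to_regex` and `merge` are all assumptions.

```python
import re


def to_regex(topic):
    """Assumed semantics: "/" is a direct child, "//" is any depth."""
    parts = re.split(r"(//?)", topic)
    parts = parts[1:] if parts[0] == "" else ["//"] + parts
    pattern = ""
    for sep, name in zip(parts[0::2], parts[1::2]):
        pattern += ("/" if sep == "/" else "/(?:[^/]+/)*") + re.escape(name)
    return re.compile(pattern)


def merge(declared_paths, occurrences):
    """declared_paths: full paths; occurrences: {document: [topic, ...]}.

    Returns {path: sorted list of documents attached to that node}.
    """
    tree = {}

    def ensure(path):
        # Create the node and all of its ancestors.
        names = path.strip("/").split("/")
        for i in range(1, len(names) + 1):
            tree.setdefault("/" + "/".join(names[:i]), set())

    for path in declared_paths:               # step 1: exact topics
        ensure(path)
    for doc, topics in occurrences.items():   # step 2: inexact topics
        for topic in topics:
            topic = topic.split(" ", 1)[0]    # drop the parenthesized text
            rx = to_regex(topic)
            hits = [p for p in tree if rx.fullmatch(p)]
            if not hits:
                # Assumption: no match means the topic is created as written,
                # with "//" collapsed to "/".
                fallback = "/" + re.sub(r"/+", "/", topic).strip("/")
                ensure(fallback)
                hits = [fallback]
            for p in hits:
                tree[p].add(doc)
    return {p: sorted(docs) for p, docs in tree.items()}
```

For the "Programming Ruby" example, this sketch attaches the book to the same nodes as in the trees shown above, both with and without the additional description-file topics. Whether that should hold in general is exactly what the open note asks.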
Appendix: Adding Topical descriptions to documents
E-Mails
The recommended way to mark RFC 2822 messages is to use an unofficial header field, for example:
X-Topical: /programming//ruby
X-Topical: /author//chneukirchen (Christian Neukirchen)
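Reading these fields back is straightforward with Python's standard `email` package. The sketch below collects all X-Topical values from a message. The function name `topics_from_message` is hypothetical, and splitting a single field value on semicolons is an assumption based on the description syntax.

```python
import re
from email import message_from_string


def topics_from_message(raw):
    """Return all topics found in the X-Topical fields of a raw message."""
    msg = message_from_string(raw)
    topics = []
    for value in msg.get_all("X-Topical", []):
        # Split on ";" unless it sits inside the parenthesized text.
        topics.extend(re.split(r";(?![^()]*\))", value.strip()))
    return topics
```

Header field names are matched case-insensitively by the `email` package, so `x-topical:` is found as well.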
(X)HTML
In (X)HTML, there are two ways to mark content: document-wide or by section.
To mark a whole document, you can simply use <meta> tags in the head:
<head>
...
<meta name="topical" content="/programming//ruby" />
<meta name="topical" content="/author//chneukirchen" />
</head>
[FIXME: define a profile?]
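A consumer could collect the document-wide topics from such <meta> tags with Python's standard `html.parser` module. This is a small sketch; the class name `TopicalMeta` and the helper `meta_topics` are hypothetical.

```python
from html.parser import HTMLParser


class TopicalMeta(HTMLParser):
    """Collect the content of every <meta name="topical"> tag."""

    def __init__(self):
        super().__init__()
        self.topics = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "topical":
            self.topics.append(attrs.get("content") or "")


def meta_topics(html):
    parser = TopicalMeta()
    parser.feed(html)
    return parser.topics
```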
Alternatively, you can use the class= attribute to mark only
individual sections of the document:
<p class="topical /programming//ruby;/author//chneukirchen">
...
</p>
It is recommended to prefix the description with topical to make it
easier to find.
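The topical prefix makes section-level descriptions easy to pick out. The sketch below, again using `html.parser`, collects the descriptions of all elements whose class= value starts with topical. The class name `TopicalSections` is hypothetical, and the sketch assumes the description contains no spaces, as in the example above.

```python
from html.parser import HTMLParser


class TopicalSections(HTMLParser):
    """Collect (tag, topics) for elements whose class starts with "topical"."""

    def __init__(self):
        super().__init__()
        self.sections = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("class") or ""
        prefix, _, description = value.partition(" ")
        if prefix == "topical" and description:
            self.sections.append((tag, description.split(";")))


def section_topics(html):
    parser = TopicalSections()
    parser.feed(html)
    return parser.sections
```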
[FIXME: define alternative syntax for better CSS matching?]