Xaraya | F. Besler |
Request for Comments: 0001 | Xaraya Development Group |
Category: Informational | January 2002 |
RFC-0001: Content Management System
This memo provides information for the Xaraya community. It does not specify an Xaraya standard of any kind. Distribution of this memo is unlimited.
Copyright © The Digital Development Foundation (2002). All Rights Reserved.
The contents of this RFC contain the literal content of the old plain text version of RFC-0001.
When time is less scarce, someone might convert the plain text into structured XML so we can benefit from it.
I read through the articles on http://www.postnuke.com, its forum threads, the feature requests on sourceforge and the xaraya developers mailing list, and reviewed some of the other open source web content management systems on the market. The following RFC is a summary and contains some solution proposals compiled either from the basic documents or from general definitions concerning content management systems. This is not a static document but a work in progress.
"Web content can be articles, pictures, products, email archives, Flash presentations, streaming audio, whatever. This content needs a lot of things done to it. You might need systems for creating the content (authoring), describing it (metadata tagging), changing and updating it (editing), letting several people edit it together (collaboration), letting the right people do the right things to it (workflow), stopping the wrong people from manipulating it (security), keeping track of how it has changed (versioning), deciding when to display it (scheduling), displaying it in the right standard format (templating), allowing it to be displayed by others (syndication), allowing it be displayed differently to different visitors (personalisation) and more." [12]
"... The content of an online learning community always includes questions and answers in a discussion forum. A programmer might start by building a table for discussion forum postings. ... Most online learning communities offer published articles that are distinguished from user-contributed questions. A programmer would therefore create a separate table to hold articles. Any wellcrafted site that publishes articles provides a facility for users to contribute comments on those articles. This will be another separate table.
Is a pattern emerging here? We distinguish a question in the discussion forum table because it is an item of content that is not a response to any other discussion forum posting. We distinguish articles from comments because an article is an item of content that is not a response to any other content item. Perhaps the representation of articles, comments on articles, questions, answers, etc. should be unified to the maximum extent possible. Each is a content item. Each has one or more authors. Each may optionally be a response to another content item. Here are some services that would be nice to centralize in a single content repository within the content database: ..." [17]
We now have to step back and think about our goal: Do we want to develop a true Web Content Management System? Or do we want to stay a Web Portal / Community Management System? For the following RFC I assume the first one because it is stated everywhere (e.g. the Xaraya logo, the Xaraya admin message, the description on HotScripts). If the project managers of Xaraya don't share my basic assumption, then I will adjust this RFC.
// Let's think of the tables contenttype and publicationtype as
// containing the classes (structures), and content and publications
// as containing the instances.
// there are relation tables for the classes and instances, and there
// are relation tables that combine instances with users, comments,
// ratings, ...
// In the following text, whenever "timestamp" or "date" is mentioned,
// this means an int(10) field holding the UTC unix timestamp.
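The timestamp convention above can be illustrated in a few lines. This is only a sketch in Python, chosen for illustration (the RFC does not prescribe a language), and the helper names `to_timestamp` and `from_timestamp` are hypothetical.

```python
from datetime import datetime, timezone


def to_timestamp(moment: datetime) -> int:
    """Convert a timezone-aware datetime to a UTC unix timestamp in whole seconds."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


def from_timestamp(stamp: int) -> datetime:
    """Convert a stored UTC unix timestamp back to an aware datetime."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


# Example: noon UTC on 6 January 2002 fits in a ten-digit integer.
stamp = to_timestamp(datetime(2002, 1, 6, 12, 0, tzinfo=timezone.utc))
```

Storing whole seconds in UTC keeps every "timestamp" and "date" field comparable without any time zone handling in the database.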
Table "Content"
// as the content repository
// you may note that there is no field that indicates the
// formatting of textual content. this is intentional and my
// thought is that we should store these content components using
// html and convert wiki and bbcode at creation / editing time.
// i was considering xml as the storage format, but since our main
// target medium is the web, storing in html gives the cheapest
// rendering while still being able to separate content and layout
// by using only allowed html tags with css that the template
// system can override.
contentid    // unique long integer value
conttypeid   // foreign key from table "Contenttype"
status       // (draft, review phase 1, publishable)
version      // revision number of the content
lastedit     // timestamp of when the current version was saved
preid        // id of the content's predecessor in the content life
             // cycle. enables rollback to an earlier version or
             // deletion of the oldest version of a certain content
             // (purge)
language     // self-explanatory
editorsnote  // message from the last editor for the next editor
             // in the workflow to give a quick note on what has to
             // be done
caption      // depends on the content type (title, link text,
             // alt text of images, name of a binary content)
content      // the text itself, i.e. a chapter of a document, an
             // article, a section's content, an abstract, a title.
             // for binary content like worddocs, pdfs and images
             // the entry could either be a link to the document
             // in the file system or the document itself (with
             // the former being my favorite method).
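The versioning idea behind preid can be sketched as follows. This is a minimal illustration using Python's sqlite3 module, not a proposed implementation: the SQL column types, the helper names `save_revision` and `history`, and the rule that every edit creates a new row are assumptions layered on the field list above.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE content (
        contentid   INTEGER PRIMARY KEY,
        conttypeid  INTEGER NOT NULL,
        status      TEXT NOT NULL,
        version     INTEGER NOT NULL,
        lastedit    INTEGER NOT NULL,                      -- UTC unix timestamp
        preid       INTEGER REFERENCES content(contentid), -- predecessor revision
        language    TEXT,
        editorsnote TEXT,
        caption     TEXT,
        content     TEXT
    )
""")


def save_revision(preid, conttypeid, caption, body, language="en", note=""):
    """Store an edit as a new row that points at its predecessor (assumed policy)."""
    version = 1
    if preid is not None:
        row = conn.execute(
            "SELECT version FROM content WHERE contentid = ?", (preid,)
        ).fetchone()
        version = row[0] + 1
    cur = conn.execute(
        "INSERT INTO content (conttypeid, status, version, lastedit, preid,"
        " language, editorsnote, caption, content)"
        " VALUES (?, 'draft', ?, ?, ?, ?, ?, ?, ?)",
        (conttypeid, version, int(time.time()), preid, language, note, caption, body),
    )
    return cur.lastrowid


def history(contentid):
    """Walk the preid chain from a revision back to the first one."""
    chain = []
    while contentid is not None:
        row = conn.execute(
            "SELECT contentid, version, preid FROM content WHERE contentid = ?",
            (contentid,),
        ).fetchone()
        chain.append((row[0], row[1]))
        contentid = row[2]
    return chain


first = save_revision(None, 1, "Intro", "<p>Hello</p>")
second = save_revision(first, 1, "Intro", "<p>Hello, world</p>", note="please proofread")
```

A rollback would then amount to pointing a publication at an earlier contentid from `history(...)`, and a purge to deleting the rows at the end of the chain.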
Table "Publications"
// this is the point of contact to the templating system.
// every publication uses a unique template from the template
// repository. every publication has a status similar to the
// status of a content component.
// publication types define how many and what content components
// are allowed or required.
pubid        // publication id, primary key
templateid   // unique identifier for a template, requirement 1a
pubtypeid    // publication types are defined in a separate table.
             // they are equal to a type field.
parentid     // id of the parent content to be able to structure
             // the content into chapters, subchapters, pages...
             // the structure can be different for each publication
             // see requirement 4
status       // (draft, review phase, published, archived)
version      // version of the current publication
preid        // id of the publication's predecessor in terms of
             // status (workflow) or version (-> versioning)
language     // self-explanatory. have the publishing tool make
             // sure automatically that only content components
             // with corresponding languages can be tied
             // together
editorsnote  // message from the last editor for the next editor
             // in the workflow to give a quick note on what has to
             // be done
pubdate      // date (timestamp) when the publication is
             // auto-published / was manually published
expdate      // timestamp when the publication will be expired /
             // auto-archived
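How status, pubdate and expdate could interact at display time is sketched below. The `Publication` class and `is_visible` function are hypothetical, and the rule itself (status must be "published" and the current time must lie between pubdate and expdate) is an assumption; the RFC leaves open whether a scheduler changes the status or the display code checks the dates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Publication:
    pubid: int
    status: str             # draft, review phase, published, archived
    pubdate: Optional[int]  # UTC unix timestamp, None if not scheduled
    expdate: Optional[int]  # UTC unix timestamp, None if it never expires


def is_visible(pub: Publication, now: int) -> bool:
    """Return True if the publication should be shown at time `now`."""
    if pub.status != "published":
        return False
    if pub.pubdate is not None and now < pub.pubdate:
        return False  # scheduled for later
    if pub.expdate is not None and now >= pub.expdate:
        return False  # expired, waiting to be archived
    return True


news = Publication(pubid=1, status="published", pubdate=1000, expdate=2000)
```

With this rule, `is_visible(news, 1500)` holds while the same call with 999 or 2000 does not, and a draft is never shown regardless of its dates.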
Table "Content_Publication"
// this table provides the ability to join 1 or more content
// components to one or more publications. you can only choose
// a content component whose status is "publishable"
pubid        // publication id, foreign key to table publications
contentid    // foreign key to content table
has_child    // boolean value to determine whether a content
             // component has a child component in the present
             // publication: for faster content structure parsing
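The "publishable only" rule and the reuse of one component in several publications can be sketched like this. The reduced table definitions and the `attach` helper are assumptions made for the example, not part of the proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (
        contentid INTEGER PRIMARY KEY,
        status    TEXT NOT NULL
    );
    CREATE TABLE content_publication (
        pubid     INTEGER NOT NULL,
        contentid INTEGER NOT NULL REFERENCES content(contentid),
        has_child INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (pubid, contentid)
    );
""")


def attach(pubid, contentid):
    """Join a content component to a publication if its status allows it."""
    row = conn.execute(
        "SELECT status FROM content WHERE contentid = ?", (contentid,)
    ).fetchone()
    if row is None or row[0] != "publishable":
        raise ValueError(f"content {contentid} is not publishable")
    conn.execute(
        "INSERT INTO content_publication (pubid, contentid) VALUES (?, ?)",
        (pubid, contentid),
    )


conn.execute("INSERT INTO content VALUES (1, 'publishable'), (2, 'draft')")
attach(10, 1)  # component 1 used in publication 10
attach(11, 1)  # the same stored component reused in publication 11
```

Only one copy of component 1 exists in the content table, yet two publications refer to it; attaching the draft component 2 would raise an error.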
Table "Contenttype" (ct)
// most content types are predefined in a standard installation
conttypeid   // primary key
conttypname  // name of the content type, e.g. [abstract, fulltext,
             // link, picture, downloads, pdf, worddoc]
MIMEtype     // e.g. application/pdf or text/html
Table "Publicationtype" (pt)
// there are predefined ones like article, news, ... and custom
// pts. while templates are for display and formatting the
// output, pts are for the structure of publications.
pubtypeid    // primary key for the pts
pubtypename  // name for the pt, e.g. article, document
Table "PT_CT"
// this table defines the structure of a pt: which and how many
// content components are required / allowed.
pubtypeid    // compound key (1of2), foreign key to pt table
conttypeid   // ---- " ---- (2of2), ----- " ----- ct table
itemsmin     // minimum amount of the specific content component
itemsmax     // maximum ------------------ ------------------
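A check of a publication against the itemsmin / itemsmax rules might look like the following sketch. The `PT_CT` dictionary stands in for the table, names are used instead of numeric ids for readability, and both the limits and the `structure_errors` helper are invented for the example.

```python
from collections import Counter

# (pubtype, conttype) -> (itemsmin, itemsmax); illustrative values only
PT_CT = {
    ("article", "abstract"): (1, 1),
    ("article", "fulltext"): (1, 5),
    ("article", "picture"): (0, 3),
}


def structure_errors(pubtype, component_types):
    """Return a list of rule violations for the given content components."""
    counts = Counter(component_types)
    allowed = {ct: limits for (pt, ct), limits in PT_CT.items() if pt == pubtype}
    errors = []
    for conttype, (itemsmin, itemsmax) in allowed.items():
        found = counts.get(conttype, 0)
        if found < itemsmin:
            errors.append(f"{conttype}: need at least {itemsmin}, got {found}")
        elif found > itemsmax:
            errors.append(f"{conttype}: at most {itemsmax} allowed, got {found}")
    for conttype in counts:
        if conttype not in allowed:
            errors.append(f"{conttype}: not allowed in {pubtype}")
    return errors


ok = structure_errors("article", ["abstract", "fulltext", "picture"])
bad = structure_errors("article", ["fulltext", "pdf"])
```

A publishing tool could run such a check before a publication leaves the draft status, so that templates can rely on the structure they were written for.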
Table "Content_Editors"
// see requirement 5a, the version the user has edited is
// reflected by the unique contentid
contentid    // compound key (1of2), foreign key to content table
editorid     // ---- " ---- (2of2), ----- " ----- users table
Table "Publications_Editors"
// see requirement 5a, the version the user has edited is
// reflected by the unique pubid
editorid     // compound key (1of2), foreign key to users table
pubid        // ---- " ---- (2of2), ----- " ----- the
             // publications table
Table "Publications_Promote"
// see requirement 7
// the modules to display the publications will check this table
// first. not all display modules will use this table.
contentid    // ck (1of2), foreign key to the publications table
weight       // ck (2of2), order in which the content will appear on
             // display
promotexp    // expiration date (timestamp) when the weight loses
             // its effect and the content drops into normal order
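The effect of weight and promotexp on the display order can be sketched as below. Several points are assumptions: that a lower weight is shown first, that "normal order" means newest pubdate first, and that the key of the promotion table identifies a publication. The function name `display_order` is hypothetical.

```python
def display_order(publications, promotions, now):
    """Order publications for display.

    publications: list of (pubid, pubdate) tuples
    promotions:   dict mapping pubid -> (weight, promotexp)
    now:          current UTC unix timestamp
    """
    def sort_key(pub):
        pubid, pubdate = pub
        promo = promotions.get(pubid)
        if promo is not None and promo[1] > now:
            # Promotion still active: promoted items come first, by weight.
            return (0, promo[0], -pubdate)
        # No promotion, or it has expired: fall back to newest first.
        return (1, 0, -pubdate)

    return [pubid for pubid, _ in sorted(publications, key=sort_key)]


pubs = [(1, 100), (2, 200), (3, 300)]
promos = {1: (1, 500), 2: (2, 150)}
early = display_order(pubs, promos, now=100)  # both promotions active
later = display_order(pubs, promos, now=160)  # promotion of 2 has expired
```

Once promotexp has passed, publication 2 simply drops back into the date-ordered part of the list without any row having to be deleted.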
Everything in 4 relates to publisher-authored content [14]. But the strength of the nuke-alikes is that users can also contribute to the site's content. This is called user-authored content. So how can we handle this valuable source of website content? My proposal is to make a module that provides a combination of content editors (online editors, uploads) for a predefined set of content components and a predefined form of publication (-> publication templates).
So the user will find a "Submit content" entry that gives them the choice of publication form, i.e. article, faq, weblink, and maybe custom publication templates. This must hide the complexity of splitting up the submission into content components by using simple forms that are composed of reusable predefined form components. The delivered system should also hide complexity from the admin by providing completely predefined forms.
Then the site admin can either let those publications be published automatically (shortcut in the workflow) or have them dropped into a workflow so that anyone responsible for publishing can review the user-authored content and decide to publish it in the current form, edit the content, rearrange the content, or change the publication form from, say, an article submission to a forum entry, if we regard a top level forum entry as equal to an article and the forum replies as equal to comments. Again, content can be published in more than one form and in more than one structure -> you can have an article and a forum entry that share one or more content components, while only one copy of each version of the component is stored in the database. [15] I hope I could show the flexibility of the proposed system.
Categorization system
// to be able to assign the content to more than 1 category
// see requirement 3 [1] [6] [8] [13]
The details on this issue will be part of the categorization RFC whose content is currently discussed on the xaraya-data list. We might want to be able to categorize the content components in the content repository so that it is easier to search in. We definitely want to be able to assign the publications to one or more categories. We should be able to have a hierarchical structure of categories of unlimited depth. To cut down the size of a centralized relation table, every module should provide its own category_content relation table using the centralized categories and API.
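A category hierarchy of unlimited depth combined with a module-owned relation table can be sketched with a recursive query. The table names `categories` and `news_category_pub`, the sample rows and the helper `publications_under` are hypothetical; the categorization RFC may choose a different representation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- central category tree, shared by all modules
    CREATE TABLE categories (
        catid    INTEGER PRIMARY KEY,
        parentid INTEGER REFERENCES categories(catid),
        name     TEXT NOT NULL
    );
    -- relation table owned by one module (here: a news module)
    CREATE TABLE news_category_pub (
        catid INTEGER NOT NULL,
        pubid INTEGER NOT NULL,
        PRIMARY KEY (catid, pubid)
    );
    INSERT INTO categories VALUES
        (1, NULL, 'Software'), (2, 1, 'CMS'), (3, 2, 'Xaraya'), (4, NULL, 'Hardware');
    INSERT INTO news_category_pub VALUES (3, 100), (2, 101), (4, 102), (1, 100);
""")


def publications_under(catid):
    """Return ids of publications in a category or any of its descendants."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(catid) AS (
            SELECT ?
            UNION
            SELECT c.catid FROM categories c
            JOIN subtree s ON c.parentid = s.catid
        )
        SELECT DISTINCT pubid FROM news_category_pub
        WHERE catid IN (SELECT catid FROM subtree)
        ORDER BY pubid
        """,
        (catid,),
    ).fetchall()
    return [row[0] for row in rows]


software_pubs = publications_under(1)
```

Publication 100 is assigned to two categories at once, and the query still returns it only once for the "Software" subtree.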
Permissions system
// this is essential for security, workflow and publishing.
The details on this issue should be part of an RFC on the permissions system. It is obvious that the permissions system is strongly linked with access control to different versions and statuses of content components and publications. Permissions are essential to the realization of a content life cycle, workflow and publication and therefore should be automatically adjusted by the system. Permissions should be on a group and single user basis. To implement a workflow ability, each content component and publication needs to have at least 2-3 permissions: "admin" for the user/user-group that has to edit / review the item in the current state of the workflow, "read" for all others (or those intended to view the content) and none for those who shouldn't be allowed to view the publication (even when its status is "published", the publication date is reached and the expiration date has not yet been reached). In order to avoid one really big permissions table, the core system should only provide the most basic permissions, while each module that uses content from the content repository should provide its own permission table, using the core APIs for handling the permissions for the content it displays and edits. -> decentralizing the permissions table like e.g. the lang files.
credits for the modularization idea to: BlackV_Li from PostTEP
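How the three levels could be resolved for one user is sketched below. The precedence rule (an entry for the single user overrides group entries, otherwise the highest group level wins, otherwise "none") is an assumption, since the RFC defers the details to a separate permissions RFC. All names and sample data are hypothetical.

```python
LEVELS = {"none": 0, "read": 1, "admin": 2}


def effective_level(userid, groups, item, user_perms, group_perms):
    """Resolve the permission level of a user for one item.

    user_perms:  dict mapping (userid, item) -> level
    group_perms: dict mapping (group, item) -> level
    """
    if (userid, item) in user_perms:
        return user_perms[(userid, item)]  # single user entry takes precedence
    granted = [group_perms[(g, item)] for g in groups if (g, item) in group_perms]
    if not granted:
        return "none"
    return max(granted, key=lambda level: LEVELS[level])


def allows(level, needed):
    """Return True if `level` is at least as strong as `needed`."""
    return LEVELS[level] >= LEVELS[needed]


user_perms = {("mallory", 42): "none"}
group_perms = {("editors", 42): "admin", ("members", 42): "read"}

alice = effective_level("alice", ["members", "editors"], 42, user_perms, group_perms)
mallory = effective_level("mallory", ["editors"], 42, user_perms, group_perms)
```

When an item moves to the next workflow state, the system would rewrite the entries in `group_perms` for that item, for example by handing "admin" from the authors group to the reviewers group.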
Comments system
// this will allow a unified comment system to attach a set of
// comments to each publication. see requirement 2a
The details on this issue will be part of the comment system RFC which is currently being written by Carl P. Corliss (rabbitt) and Gregor J. Rothfuss.
Rating / voting system
// see requirement 6
The details on this issue should be part of a rating system RFC. Here are some thoughts for someone who wants to write it. We should be able to have a per user rating for each publication. Maybe we want only authorized users to be able to vote; maybe we also want anonymous user votes. We might want to allow multiple votes or only 1 vote per user. A scale has to be defined: from 1 to 5, from 1 to 10, ... One interesting side effect of a separate rating system comes up: What about the ability for registered users to rate other registered users? A big community plus!
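One of the options mentioned above, a single vote per registered user on a scale from 1 to 5, can be sketched with a compound key and a check constraint. The table layout and the helper names `vote` and `summary` are assumptions; a rating RFC might decide on other rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ratings (
        pubid  INTEGER NOT NULL,
        userid INTEGER NOT NULL,
        score  INTEGER NOT NULL CHECK (score BETWEEN 1 AND 5),
        PRIMARY KEY (pubid, userid)   -- at most one vote per user and publication
    )
""")


def vote(pubid, userid, score):
    """Record a vote; a second vote by the same user replaces the first."""
    conn.execute(
        "INSERT OR REPLACE INTO ratings (pubid, userid, score) VALUES (?, ?, ?)",
        (pubid, userid, score),
    )


def summary(pubid):
    """Return (average score, number of votes) for a publication."""
    return conn.execute(
        "SELECT AVG(score), COUNT(*) FROM ratings WHERE pubid = ?", (pubid,)
    ).fetchone()


vote(1, 7, 4)
vote(1, 8, 2)
vote(1, 7, 5)  # user 7 changes their mind; still counted once
```

Rating users instead of publications would only need a second table of the same shape with a rated user id in place of pubid.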
Multisites system
The details on this issue should be discussed. We definitely want to be able to assign the publications to one or more subsites using the built-in multisites module. We might want to redesign the implementation of the current multisites system and think of it as just another categorization.
Code that will need to be rewritten
/backend.php
/print.php
/themes/
/modules/Avantgo
/modules/Downloads
/modules/News
/modules/NS-Addstory
/modules/NS-Admin_Messages   //if included in the content table
/modules/NS-Autolink
/modules/NS-Blocks
/modules/NS-Comments
/modules/NS-Ephemerids       //if included in the content table
/modules/NS-Multisites
/modules/NS-Quotes           //if included in the content table
/modules/Reviews
/modules/Search
/modules/Sections
/modules/Submit_News
/modules/Top_List
/modules/Topics
/modules/Web_Links           //if included in the content table
and some blocks
Tools that need to be created from scratch
We list features that were considered but rejected for this system below.
7 (2002-01-06)
   Added the idea of predefined input forms for hiding complexity.
   Moved the parentid from table "Content_Publication" to table "Publication"
6 (2002-01-05)
   Added proposal for modularizing the permissions and categorizing.
5 (2002-01-05)
   Added "Author contact"
   Added "Retractions"
   credits to: Carl P. Corliss (rabbitt) and Gregor J. Rothfuss.
4 (2002-01-04)
   Added the quotation [17]
   Added the tools that need to be created.
   Added the publication type, content type tables.
   Added the user-authored content
   Added the permissions system in the "Relationship ..." section
3 (2002-01-03)
   Reorganized the requirements.
   Moved the tables concerning comments, ratings, categories and multisites to the new "Relationship ..." section
   Added the "Relationship to other areas"
   credits to: Carl P. Corliss (rabbitt) and Gregor J. Rothfuss.
   Dropped the container idea in the content table in favour of the notion of having "Publications" introduced in version 2
   Added the "Changelog"
2 (2002-01-03)
   Reorganized the content repository
   Introduced the "Publications" table and its implications
1 (2002-01-02)
   Initial version
[1] | newbienetwork and spannah, “http://www.postnuke.com/modules.php?op=modload&name=News&file=article&sid=1354”, unknown. |
[2] | Yagi, “http://www.postnuke.com/modules.php?op=modload&name=News&file=article&sid=1376”, unknown. |
[3] | alarion, “http://www.postnuke.com/modules.php?op=modload&name=Sections&file=index&req=viewarticle&artid=9”, unknown. |
[4] | jerryj, toph and Landi, “http://www.postnuke.com/modules.php?op=modload&name=News&file=article&sid=935”, unknown. |
[5] | nekvasilt, malexandria, ED and alarion, “http://www.postnuke.com/modules.php?op=modload&name=News&file=article&sid=562”, unknown. |
[6] | Florian Bruckner, “http://groups.yahoo.com/group/pndev/message/295”, unknown. |
[7] | niceguyeddie, alarion, jimbeam, gregor, bradnickel, ranrinc (KR) and spliffster, “http://sourceforge.net/tracker/index.php?func=detail&aid=438855&group_id=27927&atid=39223”, unknown. |
[8] | Asparagirl, “http://sourceforge.net/tracker/index.php?func=detail&aid=441077&group_id=27927&atid=39223”, unknown. |
[9] | bam-bam, “http://sourceforge.net/tracker/index.php?func=detail&aid=459239&group_id=27927&atid=392231”, unknown. |
[10] | simmayor, “http://sourceforge.net/tracker/index.php?func=detail&aid=465973&group_id=27927&atid=392231”, unknown. |
[11] | anonymous, Tobias, Andy, mulpinsf and duup, “http://sourceforge.net/tracker/index.php?func=detail&aid=473841&group_id=27927&atid=392231”, unknown. |
[12] | David Walker, “http://www.shorewalker.com/pages/cms_woes-1.html”, unknown. |
[13] | ncwbiz, clnelson and jnlewis, “http://sourceforge.net/tracker/index.php?func=detail&aid=439872&group_id=27927&atid=392231”, unknown. |
[14] | Philip Greenspun, “http://philip.greenspun.com/internet-application-workbook/planning”, unknown. |
[15] | nickrazer, JM (Jun), “http://www.postnuke.com/modules.php?op=modload&name=News&file=article&sid=1471”, unknown. |
The DDF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the DDF's procedures with respect to rights in standards-track and standards-related documentation can be found in RFC-0.
The DDF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the DDF Board of Directors.
Funding for the RFC Editor function is provided by the DDF.