Mon
14
Feb '05
I have written my share of XML documents, and I know more or less how to write a schema for them. But every time I write a new document, I spend time pondering the simple question of whether something should become an attribute or a fully fledged element.
The Problem: “Should a piece of information become an element or an attribute?”
I think I’m not alone with this problem. As Mark Johnson puts it in “Attributes Versus Elements: The Never-ending Choice”:
Few topics re-occur more frequently, wherever XML developers congregate, than the attributes versus elements debate. The more experience you have of developing XML systems, the murkier the waters surrounding this question. The innocent sounding question can, and does, spark off debates that touch everything from pragmatism to epistemology to mereology and back again.
Mark Johnson offers no silver bullet in his article, and further googling turns up a lot of articles presenting XML Schema design patterns but nothing that addresses my question.
google.com: Results 1 – 10 of about 198,000 for xml “design pattern”. (0.12 seconds)
A site with the promising name xmlpatterns.com presents some nice patterns, but on this low-level question it has very little to offer. The only page that contains at least some help is Good XML Structural Design:
…
Document Size
It is usually better to have short documents. Longer documents take up more disk space, take longer to process by machines and humans, and take longer to transmit across the network.
Ease of Authoring
Documents which are authored by people must be understandable to them. Extremely complex documents will cause authors to make mistakes, waste time, and get frustrated.
Ease of Processing
XML documents are ultimately processed by software at some point, and the document structure affects how difficult the processing software will be to write.
…
My lessons learned from this list:
- Corollary 1: Document Size
Use of attributes usually leads to smaller documents. Use as many attributes as possible. The difference between <attribute>value</attribute> and @attribute="value" doesn’t look like much, but it adds up in a large document.
- Corollary 2: Ease of Authoring
Attributes are usually easier to type, and smaller documents (see Corollary 1) are easier to edit. If you have a good XML editor, you can ignore this point; your editor should make it easy for you to enter element or attribute data. If your content is longer than a couple of characters or contains quotation marks or newlines, use an element.
- Corollary 3: Ease of Processing
The machine couldn’t care less whether something is an attribute or an element. Additionally, in every API I have ever used, getting an attribute value was much easier and involved less code than getting the content of an element. So again, a slight advantage for attributes.
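Corollaries 1 and 3 can be illustrated with a small sketch. It assumes Python's standard xml.dom.minidom as the API and a made-up person record (the names and values are hypothetical, not taken from any of the cited articles). It only shows the kind of difference I mean and is not a benchmark.

```python
from xml.dom.minidom import parseString

# Hypothetical record, written once with child elements and once with attributes.
AS_ELEMENTS = "<person><name>Ann</name><age>42</age></person>"
AS_ATTRIBUTES = '<person name="Ann" age="42"/>'

# Corollary 1: compare the encoded size of both variants.
size_elem = len(AS_ELEMENTS.encode("utf-8"))
size_attr = len(AS_ATTRIBUTES.encode("utf-8"))

# Corollary 3: reading an attribute is a single call on the element ...
attr_doc = parseString(AS_ATTRIBUTES)
name_from_attribute = attr_doc.documentElement.getAttribute("name")

# ... while reading element content means finding the child element
# and then reaching into its text node.
elem_doc = parseString(AS_ELEMENTS)
name_node = elem_doc.getElementsByTagName("name")[0]
name_from_element = name_node.firstChild.data

print(size_elem, size_attr)
print(name_from_attribute, name_from_element)
```

Other APIs narrow the gap (some offer a one-call helper for element text), so the advantage really is slight.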
My rules
OK, after a lot of pondering, here is the first version of my XML Design Patterns. The order is random and there is no precedence between rules. Remember, this is a work in progress and this page may change over time.
- Use as many attributes as possible
See corollaries 1-3.
- If it will never ever appear more than once, make it an attribute
And the opposite: if something can appear any number of times, it has to become an element.
- If something represents a separate entity, make it an element
You might want to add attributes later.
- If something needs to be referenced, make it an element
Give the element an id attribute; an attribute cannot carry an id of its own.
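To make the rules concrete, here is a minimal sketch of a document that follows them, read with Python's standard xml.etree.ElementTree. The library/book/author vocabulary and the author-ref attribute are invented for this example; a real schema would choose its own names.

```python
import xml.etree.ElementTree as ET

# Hypothetical document following the rules above:
# - title appears at most once per book        -> attribute
# - keyword can appear any number of times      -> element
# - author is a separate entity                 -> element
# - author needs to be referenced from a book   -> element with an id attribute
DOC = """
<library>
  <author id="a1" name="Ann Example"/>
  <book title="Hypothetical Title" author-ref="a1">
    <keyword>xml</keyword>
    <keyword>schema</keyword>
  </book>
</library>
"""

root = ET.fromstring(DOC)
book = root.find("book")

title = book.get("title")
keywords = [k.text for k in book.findall("keyword")]

# Resolve the reference by indexing the referenced elements by their id.
authors_by_id = {a.get("id"): a for a in root.findall("author")}
author_name = authors_by_id[book.get("author-ref")].get("name")

print(title, keywords, author_name)
```

Because the author is an element, it can later grow more attributes or children without changing how books point to it.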
Final note from Barbossa “… the Code is more what you’d call guidelines than actual rules.”
Take what you find useful and let me know when you don’t agree with a point.
Update 27.11.06:
I found a nice article on the same topic from ibm.com/developerworks, called
Principles of XML design: When to use elements versus attributes.
Hi
Nice article.
I guess 2 things help us to answer this question (what I have gathered from various articles on the net):
1. Is it metadata or an important piece of information? – e.g. if we wish to give an ID to an entity to distinguish it from other entities of the same type, then it should be an attribute, otherwise an element. An element is a piece of information that the user himself is aware of.
2. If there is an element that will certainly never need an attribute to define it further in the future, convert it to an attribute. But weigh this against point 1.
Thanks & regards
Software Pari
One issue with attributes is mentioned in the developerworks article: If the information should not be normalized for white space, use elements.
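The white-space point can be checked with a short sketch, assuming Python's standard xml.etree.ElementTree and a made-up note document. An XML parser replaces a literal line break inside an attribute value with a space, while element content keeps it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the same two-line text stored as an attribute
# value and as element content. "\n" is a real line break in the XML.
DOC = '<note summary="line one\nline two"><body>line one\nline two</body></note>'

root = ET.fromstring(DOC)
attr_value = root.get("summary")   # line break normalized to a space
elem_text = root.findtext("body")  # line break preserved

print(repr(attr_value))
print(repr(elem_text))
```

A line break can still be forced into an attribute with a character reference such as &#10;, but that is awkward to author by hand, which is one more argument for using an element here.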