Difference between HTML and XHTML

We see websites of every kind: colorful or simple, static or dynamic, informative or interactive. But do we really understand the languages that bring the pages of a website to life? Two of the front-end languages used to build web pages are HTML (HyperText Markup Language) and XHTML (eXtensible HyperText Markup Language). In this article we look at both markup languages and identify the differences between HTML and XHTML.


HyperText Markup Language (HTML) was introduced by Tim Berners-Lee in 1991, following the rules of the Standard Generalized Markup Language (SGML), with its structure described in a document type definition (DTD). SGML is not a programming language but a metalanguage through which markup languages are defined; HTML, one of its applications, marks up text with a simple set of tags for presenting data on the World Wide Web (WWW). The term ‘hyper’ in HTML refers to hypertext, that is, text containing ‘active’ links to other documents. A document is made up of elements such as paragraphs, headings, tables and rows, each delimited by a ‘start’ tag and an ‘end’ tag. A tag is the markup that demarcates an element, and it can carry attributes. Attributes are name-value pairs that give extra information about the element, such as how its data should be displayed. With HTML, website designers can create web pages, embed images, set fonts, construct layouts, and add hyperlinks for navigation.
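
To make the terms element, tag and attribute concrete, here is a minimal sketch that uses Python's standard html.parser module to list the tags in a small fragment. The class name TagLister, the helper list_tags, and the example.org link are assumptions made for illustration; they do not come from any HTML specification.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record every start tag (with its attributes) and every end tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def list_tags(markup):
    """Return the start/end tag events found in the markup, in order."""
    parser = TagLister()
    parser.feed(markup)
    parser.close()
    return parser.events


# Hypothetical sample: a paragraph element containing a hyperlink element.
sample = '<p>Visit <a href="https://example.org">this page</a>.</p>'
for event in list_tags(sample):
    print(event)
```

Each element shows up as a start event and an end event, and the href attribute of the link appears as a name-value pair attached to its start tag.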


XHTML (eXtensible HyperText Markup Language) is a reformulation of HTML in XML (eXtensible Markup Language). XML is mainly used to describe data, and XHTML is an application of XML, unlike HTML, which is an application of SGML. Because XML is simpler than SGML, rewriting HTML in XML gives it a smaller and stricter set of syntax rules. XHTML does not allow tags to be left out or attributes to be minimized; it demands orderly, well-formed markup.


The HTML with SGML

According to the SGML standard, a document consists of a hierarchical structure of nested elements. The structure of these documents is defined in a document type definition. An SGML element, like an HTML element, has both ‘start’ and ‘end’ tags. SGML declarations are enclosed in angle brackets (<>) and consist of a keyword, a name and parameters, separated by spaces.


For example, an element declaration in SGML looks like this:

<!ELEMENT P - O (%text)*>


In the above example, P (which means ‘Paragraph’) is the element name. It is followed by the minimization indicators, which state whether the start and end tags can be omitted for this particular element. The hyphen ‘-’ indicates that the tag is required, and ‘O’ means the tag can be omitted. The first indicator applies to the start tag and the second to the end tag, so the statement says that a ‘P’ element must begin with the <p> start tag, but the </p> end tag can be left out.


<!ELEMENT HTML O O (%html.content)>


The above example states that both <HTML> and </HTML> tags around the content of the HTML document can be omitted.
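
The way the two minimization indicators are read can be sketched in a few lines of Python. This is a simplified illustration that only handles declarations shaped like the two examples above (one name, two indicators, one content model); real SGML declarations can be more complex, and the function name describe_element is an assumption made for this sketch.

```python
import re

# Matches declarations of the form: <!ELEMENT name start-indicator end-indicator content>
ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+(?P<name>\S+)\s+(?P<start>[-O])\s+(?P<end>[-O])\s+(?P<content>.+?)\s*>$",
    re.IGNORECASE,
)


def describe_element(declaration):
    """Split a simple element declaration into its name, indicators and content model."""
    match = ELEMENT_DECL.match(declaration.strip())
    if match is None:
        raise ValueError(f"unsupported declaration: {declaration!r}")
    return {
        "name": match.group("name"),
        # 'O' means the tag may be omitted; '-' means it is required
        "start_tag_omissible": match.group("start").upper() == "O",
        "end_tag_omissible": match.group("end").upper() == "O",
        "content_model": match.group("content"),
    }


print(describe_element("<!ELEMENT P - O (%text)*>"))
print(describe_element("<!ELEMENT HTML O O (%html.content)>"))
```

For the P declaration the sketch reports a required start tag and an omissible end tag; for the HTML declaration it reports both tags as omissible, matching the reading given above.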


The syntax of HTML looks similar to that of SGML, and most of the constructs that HTML uses come from SGML. Since SGML is the foundation on which HTML is built, certain tags can be dropped and attributes can be minimized. But one has to be well versed in SGML in order to omit tags in HTML correctly.



XHTML is an amalgamation of HTML and XML (a subset of SGML). XML adopts the SGML approach to markup but has its own identity and is ‘eXtensible’. It sets its own rules for storing data and omits the complex options of SGML. XML allows authors to create their own tags, and it requires every opened tag to be closed. It focuses on proper structure with respect to elements and their declaration. Every XHTML document should carry a document type declaration (DOCTYPE), which names the DTD it follows, on its opening lines, as follows:


 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


This declaration signals that you are following the strict rules set out in XHTML; browsers also read the declaration and display the web pages accordingly.


If you choose to move between HTML and XHTML, the more lenient Transitional DTD applies, and your document type declaration would be:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
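
As a rough illustration of how software can tell these two declarations apart, the sketch below reads the public identifier from a DOCTYPE using Python's standard html.parser module and reports whether it names the Strict or the Transitional DTD. The names DoctypeReader and xhtml_flavour are assumptions made for this example, and real browsers use more elaborate logic than this.

```python
import re
from html.parser import HTMLParser


class DoctypeReader(HTMLParser):
    """Capture the public identifier of the first DOCTYPE declaration seen."""

    def __init__(self):
        super().__init__()
        self.public_id = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. 'DOCTYPE html PUBLIC "..." "..."'
        match = re.search(r'PUBLIC\s+"([^"]+)"', decl)
        if self.public_id is None and decl.upper().startswith("DOCTYPE") and match:
            self.public_id = match.group(1)


def xhtml_flavour(markup):
    """Return 'Strict', 'Transitional', or None if no such XHTML DOCTYPE is found."""
    reader = DoctypeReader()
    reader.feed(markup)
    reader.close()
    public_id = reader.public_id
    if public_id is None or "XHTML" not in public_id:
        return None
    for flavour in ("Strict", "Transitional"):
        if flavour in public_id:
            return flavour
    return None


strict = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
          '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">')
print(xhtml_flavour(strict))
```

The check relies only on the wording of the public identifier, which is enough to distinguish the two declarations shown in this article.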


With the emergence of many different devices that can access the internet, the stricter and more predictable syntax of XHTML makes it a better markup language for browsers to interpret and display.

The difference between HTML and XHTML in Tabular Form
| HTML | XHTML |
|---|---|
| An application of SGML | An application of XML |
| Can have empty or unclosed tags, e.g. <br>, <p> | All tags must be closed, e.g. <br/>, <p></p> |
| No hard rule on the nesting of elements, e.g. <p><b>The difference</p></b> | Elements must be properly nested, e.g. <p><b>The difference</b></p> |
| Quotes around attribute values are optional, e.g. <font color=#ff0000> | Quotes around attribute values are mandatory, e.g. <font color="#ff0000"> |
| Attribute values can be left out (minimized), e.g. <input type="checkbox" checked> | Attribute values must be written out, e.g. <input type="checkbox" checked="checked" /> |
| Case insensitive: tags and attributes can be uppercase or lowercase, as preferred | Case sensitive: tags and attributes must be lowercase |
| All content can be put directly under the body element | All content has to be put in blocks (such as p) under the body element |
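
Several of the syntax differences listed above can be checked mechanically, because XHTML must be well-formed XML. The following sketch uses Python's standard xml.etree.ElementTree parser to test the HTML-style and XHTML-style snippets from the table. The helper name is_well_formed is an assumption for this example, and the check covers well-formedness only: it does not validate against the XHTML DTD, so rules such as lowercase names or block-level content are not tested here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment, wrapped in a body element, parses as XML."""
    try:
        ET.fromstring(f"<body>{fragment}</body>")
    except ET.ParseError:
        return False
    return True


# (label, HTML-style snippet, XHTML-style snippet)
pairs = [
    ("empty element", "line one<br>line two", "line one<br/>line two"),
    ("nesting", "<p><b>The difference</p></b>", "<p><b>The difference</b></p>"),
    ("quoted attributes", "<font color=#ff0000>red</font>",
     '<font color="#ff0000">red</font>'),
    ("attribute values", '<input type="checkbox" checked>',
     '<input type="checkbox" checked="checked"/>'),
]

for label, html_style, xhtml_style in pairs:
    print(label, is_well_formed(html_style), is_well_formed(xhtml_style))
```

An XML parser rejects each HTML-style snippet and accepts each XHTML-style one, whereas browsers parsing HTML are typically forgiving about the first form.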
