HTML Checker

This utility looks for signs that a page has been coded in a way that may make it costly to maintain. Each basic check is flagged Okay, Warning, or Danger to indicate the relative impact of its result, and the report ends with a cumulative estimate. Danger indicates that the page will probably be costly to maintain and may not be well written.
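The flagging described above can be sketched as follows. This is a minimal illustration under stated assumptions: `flagFor` is a hypothetical helper, and the counts and thresholds in `$checks` are made-up sample values chosen only to show each flag.

```php
<?php
// Hypothetical helper: map an issue count to a flag.
// Counts up to $ok are Okay, counts up to $warning are Warning, anything higher is Danger.
function flagFor(int $count, int $ok, int $warning): string
{
    if ($count <= $ok) {
        return 'Okay';
    }
    if ($count <= $warning) {
        return 'Warning';
    }
    return 'Danger';
}

// Illustrative counts for three checks (made-up numbers), each with its own thresholds.
$checks = [
    'deprecated tags' => ['count' => 3,  'ok' => 5, 'warning' => 10],
    'table tags'      => ['count' => 18, 'ok' => 5, 'warning' => 15],
    'font style tags' => ['count' => 4,  'ok' => 2, 'warning' => 5],
];

$total = 0;
foreach ($checks as $name => $c) {
    echo $name . ': ' . flagFor($c['count'], $c['ok'], $c['warning']) . "\n";
    $total += $c['count'];
}

// The cumulative estimate applies wider thresholds to the sum of all counts.
echo 'Final score: ' . flagFor($total, 20, 30) . "\n";
```

Because each check has its own thresholds, a single heavy category (table tags here) can be flagged Danger while the cumulative estimate stays lower.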

* This is not a validator, and these results do not indicate that a page has been poorly coded. They should be used as a quick check to decide whether further evaluation is necessary.

Enter the URL (for example: http://domain.com) you would like checked.
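Before a submitted URL is fetched, it can be checked with PHP's `parse_url()`. The sketch below is one possible approach, not a definitive implementation: `buildSourceUrl` is a hypothetical helper, and it assumes that only plain `http` URLs with a host are accepted, matching the example format above.

```php
<?php
// Hypothetical helper: return a normalized http URL, or null if the input is not acceptable.
function buildSourceUrl(string $input): ?string
{
    $parts = parse_url(trim($input));
    if ($parts === false) {
        return null; // parse_url() gives up on seriously malformed URLs
    }
    // Assumption: only http URLs that include a host are accepted.
    if (strtolower($parts['scheme'] ?? '') !== 'http' || ($parts['host'] ?? '') === '') {
        return null;
    }
    // Keep scheme, host, and path; port, query string, and fragment are dropped.
    return 'http://' . $parts['host'] . ($parts['path'] ?? '');
}

var_dump(buildSourceUrl('http://domain.com/index.html'));
var_dump(buildSourceUrl('ftp://domain.com'));
var_dump(buildSourceUrl('domain.com'));
```

Rebuilding the URL from its parsed components, rather than passing the raw input along, keeps unexpected schemes away from the fetch step.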




$v) switch ($k) { case 'scheme': if ($v != 'http') die('Error'); $sSource=$v.'://'; break; case 'host': if (strlen($v) == 0) die('Error'); $sSource.=$v; break; case 'path': $sSource.=$v; break; } echo 'URL: '.$sSource.'
'; $fContents=@file_get_contents($sSource); if (strlen($fContents) == 0) die('Error'); $aCheckDeprecatedTags=array('applet','basefont','center','color','dir', 'font','isindex','menu','s','strike','u','xmp'); $aCheckLayout=array('table','tbody','td','tfoot','th','thead','tr'); $aCheckFontStyle=array('tt','i','b','big','small'); $aCheckDeprecatedAttributes=array('align','bgcolor','background','valign','halign'); $aCheckInlineStyle=array('style'); $iNewLines=substr_count($fContents,"\n"); echo '

Basic Checks

'; echo 'Lines: '.$iNewLines.' (estimated)
'; echo 'Bytes: '.strlen($fContents).' (estimated)
'; echo 'These numbers reflect only the HTML, not the CSS or any other supporting files including scripts and images.'; echo '

Checking for deprecated HTML tags (http://www.w3schools.com/tags/default.asp).

'; echo '

Deprecated tags may indicate old code or code written by an inexperienced developer.

'; $i=0; foreach ($aCheckDeprecatedTags as $k => $v) { // echo $v.'
'; $i+=CheckTags($fContents,'deprecated',$v); } Report($i,5,10,15); $iTotalIssues=$i; echo '

Checking for table tags (http://www.w3schools.com/css/css_intro.asp).

'; echo '

Table-based page layouts can be difficult, unpleasant, and expensive to maintain.

'; $i=0; foreach ($aCheckLayout as $k => $v) { // echo $v.'
'; $i+=CheckTags($fContents,'table',$v); } Report($i,5,15,20); $iTotalIssues+=$i; echo '

Checking for font style tags (http://www.w3schools.com/tags/tag_font_style.asp).

'; echo '
'; echo '

Font style tags limit how text can be enhanced and are difficult to maintain.

'; $i=0; foreach ($aCheckFontStyle as $k => $v) { // echo $v.'
'; $i+=CheckTags($fContents,'font style',$v); } Report($i,2,5,10); $iTotalIssues+=$i; echo '

Checking for deprecated HTML attributes (http://www.w3schools.com/tags/tag_table.asp).

'; echo '
'; echo '

These attributes can make page maintenance expensive or even impossible. They are primarily related to tables but may be applied to many other tags.

'; $i=0; foreach ($aCheckDeprecatedAttributes as $k => $v) { // echo $v.'
'; $i+=CheckAttributes($fContents,'deprecated',$v); } Report($i,5,8,10); $iTotalIssues+=$i; echo '

Checking for inline style attributes (http://www.w3schools.com/TAGS/att_standard_style.asp).

'; echo '
'; echo '

Inline style attributes may make the design difficult to maintain.

'; $i=0; foreach ($aCheckInlineStyle as $k => $v) { // echo $v.'
'; $i+=CheckAttributes($fContents,'inline style',$v); } Report($i,5,10,15); $iTotalIssues+=$i; echo '

Final score: '; Report($iTotalIssues,20,30,50); echo '

DOCTYPE and Validation Checks

Checking for DOCTYPE tag (http://www.w3schools.com/tags/tag_DOCTYPE.asp)

'; CheckTags($fContents,'DOCTYPE','!DOCTYPE'); echo '

Checking for a title tag.

'; CheckTags($fContents,'title','title'); echo '

Checking for references to w3c.org (http://www.w3.org/QA/Tools/). References to w3c.org are interpreted as indicators that the person who wrote the code is aware of these tools. However, validation does not guarantee quality.

'; echo '
'; CheckReferences($fContents,'W3C Organization','w3c.org'); echo '
'; echo '

Generator

'; echo '

Checking for generator indicator. An application such as a web-based blog, word processor or web development application may have been used to generate the page. This may represent a security risk by informing site visitors of the application and version.

'; echo '
'; $aMatches=array(); preg_match('//i',$fContents,$aMatches); if (isset($aMatches[1])) echo 'Generator: '.$aMatches[1].'
'; echo '
'; echo '

Coding Style

'; echo '

CSS can be used to control some or all of the design and layout in a page. The presence of div and span tags, as well as id and class attributes, can indicate that the page design and layout can be managed efficiently with CSS. Too many classes may indicate CSS inheritance issues, and if there are more span tags than div tags, the design may be awkward. Embedded style tags (not attributes) may be used to polish the page or compensate for browser issues. They may make maintenance more difficult if they are not carefully managed at the page level or with a template system. The number of image (img) tags is also checked. Too many images may make maintenance difficult and may make the page load slowly. This is a very subjective assessment.

'; $aCheckDivSpan=array('div','span'); $aCheckIdClass=array('id','class'); foreach ($aCheckDivSpan as $k => $v) { // echo $v.'
'; CheckTags($fContents,$v,$v); } echo '
'; foreach ($aCheckIdClass as $k => $v) { // echo $v.'
'; CheckAttributes($fContents,$v,$v); } echo '
'; CheckTags($fContents,'style','style'); CheckTags($fContents,'image','img'); echo '
'; echo '

Two attributes, alt and title (http://www.w3schools.com/tags/tag_img.asp), can be used to assist users. Their presence is often a good sign that the person who wrote the code pays attention to detail and strives to assist site visitors. These attributes may also be valuable for SEO.

'; echo '
'; $aCheckAltTitle=array('alt','title'); foreach ($aCheckAltTitle as $k => $v) { // echo $v.'
'; CheckAttributes($fContents,'',$v); } echo '
'; echo '
'; echo '

Additional Resources

'; echo '

Validation

Validators http://www.w3.org/QA/Tools/

'; echo '

Spelling

Check the spelling of a site at http://www.websiteoptimization.com/services/spell-check/

'; echo '

Page performance

Check the page performance at http://www.websiteoptimization.com/services/analyze/

'; echo '

Number of sites with the same IP address

http://www.yougetsignal.com/tools/web-sites-on-web-server

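';

// Count opening tags named $sString and report the total if any were found.
function CheckTags($fContents,$sType,$sString)
{
    $aMatches=array();
    // The tag name must be followed by whitespace or ">", so "<s>" is not counted as "<span>".
    preg_match_all('/<'.preg_quote($sString,'/').'[\s>]/i',$fContents,$aMatches);
    if (($iCount=count($aMatches[0])) > 0)
        echo 'Found '.$iCount.' '.$sType.' tags: '.$sString."\n";
    return $iCount;
}

// Count occurrences of the attribute $sString (whitespace, name, optional whitespace, "=").
function CheckAttributes($fContents,$sType,$sString)
{
    $aMatches=array();
    preg_match_all('/\s'.preg_quote($sString,'/').'\s*=/i',$fContents,$aMatches);
    if (($iCount=count($aMatches[0])) > 0)
        echo 'Found '.$iCount.' '.$sType.' attributes: '.$sString."\n";
    return $iCount;
}

// Count occurrences of $sString, such as a domain name, anywhere in the page.
function CheckReferences($fContents,$sRef,$sString)
{
    $aMatches=array();
    // preg_quote() keeps characters such as "." literal.
    preg_match_all('/'.preg_quote($sString,'/').'/i',$fContents,$aMatches);
    if (($iCount=count($aMatches[0])) > 0)
        echo 'Found '.$iCount.' references to '.$sRef.': '.$sString."\n";
    return $iCount;
}

// Print a flag for the count $i. $danger is accepted but not used: anything above $warning is Danger.
function Report($i,$ok,$warning,$danger)
{
    if ($i<=$ok) echo 'Okay';
    else if ($i<=$warning) echo 'Warning';
    else echo 'Danger';
}
?>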
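The regular-expression counting used by CheckTags and CheckAttributes can be tried on a small fragment. The sketch below is an illustration only: `countTags` and `countAttributes` are hypothetical helpers that return counts instead of printing them, and the sample markup is invented.

```php
<?php
// Hypothetical helper: count opening tags with the given name.
function countTags(string $html, string $tag): int
{
    // Require whitespace or ">" after the name so "<s>" is not confused with "<span>".
    $pattern = '/<' . preg_quote($tag, '/') . '[\s>]/i';
    return (int) preg_match_all($pattern, $html);
}

// Hypothetical helper: count attributes with the given name.
function countAttributes(string $html, string $attribute): int
{
    // Require leading whitespace so "align" does not also match inside "valign".
    $pattern = '/\s' . preg_quote($attribute, '/') . '\s*=/i';
    return (int) preg_match_all($pattern, $html);
}

// Invented sample markup mixing layout tables, font tags, and inline styles.
$sample = '<table align="center" bgcolor="#ffffff"><tr><td style="color:red">'
        . '<font size="2">Hi</font></td></tr></table><span>x</span> <s>y</s>';

foreach (['table', 'font', 's', 'span'] as $tag) {
    echo $tag . ' tags: ' . countTags($sample, $tag) . "\n";
}
foreach (['align', 'valign', 'style'] as $attribute) {
    echo $attribute . ' attributes: ' . countAttributes($sample, $attribute) . "\n";
}
```

Pattern matching of this kind does not parse the HTML, so text inside scripts or comments that happens to look like a tag or attribute is counted as well. That is consistent with treating the results as a quick check and not as validation.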

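For the generator check, one possible pattern is sketched here. The pattern is an assumption and not necessarily the one this utility uses. `findGenerator` is a hypothetical helper, the sample markup and product name are invented, and the sketch only handles a meta tag whose name attribute comes before its content attribute.

```php
<?php
// Hypothetical helper: return the content of a generator meta tag, or null if none is found.
function findGenerator(string $html): ?string
{
    // Assumed pattern: <meta name="generator" content="..."> with single or double quotes.
    $pattern = '/<meta\s+name\s*=\s*["\']generator["\']\s+content\s*=\s*["\']([^"\']*)["\']/i';
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Invented sample: a page that announces the application that produced it.
$sample = '<head><meta name="generator" content="ExampleCMS 1.0" /></head>';

$generator = findGenerator($sample);
echo $generator === null ? "No generator found\n" : 'Generator: ' . $generator . "\n";
```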
Copyright © 2008 Betsy A. Gamrat All Rights Reserved