The newest specs for HTML forms give programmers more control over data input and validation, while offloading much of the work to the browsers.
The changes and enhancements to the form tags are some of the most extensive amendments to the HTML5 standard, offering a wide variety of options that once required add-on libraries and a fair amount of tweaking. All of the hard work that went into building self-checking widgets and the libraries that ensure the data is of the correct format is now being poured into the browser itself. The libraries won't be necessary -- in theory -- because the work will be done seamlessly by all browsers that follow the standard. In practice, we'll probably continue to use small libraries that smooth over slight inconsistencies.
The new HTML specifications include input types that offer a number of new options for requesting just the right amount of data -- say, a form element that requests the time in different levels of granularity, such as month, week, or minute. Other new input types insist that the user type in only valid URLs or email addresses. All of these input fields will be tested to ensure that the text in them is valid, and the user's progress toward satisfying the data integrity police will be tracked by a series of events. There are even hooks for a value sanitization algorithm that checks the information and perhaps cleans it up with some AJAX.
Compliance with these options is gradually appearing in the browsers. At the time of this writing, for instance, Chrome lets you pin the min and max for some dates, but you can't install a value sanitization function. The minimum and maximum values are, of course, the simplest controls to create. It's much harder to offer the deeper hooks.
Holes like this are sprinkled throughout the new options. Firefox, Safari, Opera, and Internet Explorer are all slowly rolling out the new form features, and they're pretty much done with the most important ones. Alas, not all of them support the new features in exactly the same way, so it's still a bit complicated to create content that uses them. But as these gaps close, the new form elements will make it much easier for Web developers to gather information and enforce a few rules that keep the users in line.
To find out if your browser supports the new input data types and controls, try my experimental HTML5 table at wayner.org.
HTML5 forms: Input element type
The new options take on some of the chores that once fell to add-on libraries. A compliant browser now distinguishes among a wide range of data types, including dates, email addresses, numbers, and URLs. Each of these types has several more specific options. The date field may ask for a full date, a year and week alone, a year and month alone, or just the time of day. If you want to be very specific, you can mix together a date and time with the option of including or leaving off a time zone.
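One common way to check whether a browser understands a given input type is to set the type on a freshly created element and see whether the browser keeps it or falls back to plain text. The sketch below assumes that fallback behavior; the helper names `supportsInputType` and `reportSupport` and the list of types are illustrative and not taken from any library.

```javascript
// Sketch: a browser that does not know an input type treats it as "text",
// so reading the type back reveals whether the type was accepted.
// `doc` is any object with a DOM-style createElement(); pass `document` in a browser.
function supportsInputType(type, doc) {
  const input = doc.createElement("input");
  input.setAttribute("type", type);
  return input.type === type;
}

// Illustrative subset of the types discussed in this article.
const candidateTypes = ["date", "month", "week", "time", "email", "url", "number"];

// Build an object such as { date: true, week: false, ... } for the given document.
function reportSupport(doc) {
  const report = {};
  for (const type of candidateTypes) {
    report[type] = supportsInputType(type, doc);
  }
  return report;
}
```

In a browser console, `reportSupport(document)` would give a quick picture of what is available. Note that a type being recognized does not guarantee a polished widget or complete validation behind it.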
Some of these types seem like invitations to trouble. I'm happy I'm not responsible for implementing the code that will validate all of the different kinds of telephone numbers around the world. In America, it's a hassle because some folks will punctuate the number in odd ways, like wrapping the area code in parentheses. Freezing these rules in the browser standard will be problematic if the phone companies dream up new ways of using the numbers. Of course, if that day arrives we can always override the validation because there are attributes that allow specifying
HTML5 forms: Input element type attributes
Choosing the type is just the beginning of the fun when creating these new form elements. Each type may or may not support additional features that are specified with attributes. Many of these attributes are straightforward. For example, min and max can only be used with times and numbers, and not with unlikely items like email addresses, even though they're technically sortable.
By my quick count, there are 37 attributes and 14 different types. The current version of the HTML5 input element specs includes a table that shows, for each type, which attributes are allowed (limiting the max value of a number, for example) and which are ignored (limiting the max value of an email address). I'm still a bit confused about why you can only specify a placeholder for some types. This short suggestion (for example, "your email address") isn't available for times or colors. Most of the other pairs that are allowed or forbidden are easy to understand, but I think most will find one or two combinations that they wish were there.
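The allowed-or-ignored table can be pictured as a simple lookup. The sketch below covers only three attributes, and the type lists reflect one reading of the current HTML standard, so they may differ from the draft described here. The names `appliesTo` and `attributeApplies` are made up for illustration.

```javascript
// Sketch of a tiny slice of the spec's attribute/type table.
// Assumption: these lists follow the current HTML standard, not necessarily
// the draft this article was written against.
const rangedTypes = ["date", "month", "week", "time", "datetime-local", "number", "range"];

const appliesTo = new Map([
  ["min", rangedTypes],
  ["max", rangedTypes],
  ["placeholder", ["text", "search", "url", "tel", "email", "password", "number"]],
]);

// True when the attribute is meaningful for the given input type;
// attributes missing from the map are treated as unknown (false).
function attributeApplies(attribute, type) {
  const types = appliesTo.get(attribute);
  return types !== undefined && types.includes(type);
}
```

With this table, `attributeApplies("max", "number")` is true, while `attributeApplies("max", "email")` and `attributeApplies("placeholder", "time")` are both false, which matches the examples in the text.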
The new mechanisms are meant to extend the status quo, and that means not changing some of the old patterns. To me, it might make sense to allow each type of input to be hidden with an attribute, but the new standard continues the old approach of making "hidden" a type that accepts generic text. That's the price of backward compatibility.
HTML5 forms: Client-side form validation
Specifying the type and attributes is just the beginning because the validation process is fairly transparent. While the form will handle most of the work for you, it also offers a number of hooks for interrupting the process or replacing it.
When something seems incorrect, the validation will set up a data structure that can be queried. The property validity.patternMismatch, for instance, will be true if a pattern is specified but the data doesn't fit it.
If you want to specify your own validation, you can add a custom message indicating why the data might not be acceptable. You can fire off this routine with an
Problematic input data can also trigger events of their own that you can trap. Data checks can be set off by hitting the
It's all pretty flexible and built in a way that will be familiar to everyone used to the traditional mechanism of attaching functions that listen for particular events. There are probably three or four different ways to check each form field.
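As a rough sketch of two of these hooks, the code below reads the flags on a field's validity object and attaches a custom check through setCustomValidity. The flag names and setCustomValidity come from the constraint validation API; the function names `describeValidity` and `attachCustomCheck` and the message strings are invented for illustration.

```javascript
// Turn the flags on a ValidityState-like object into a readable message.
// Returns an empty string when the field is valid.
function describeValidity(validity) {
  if (validity.valid) return "";
  if (validity.valueMissing) return "This field is required.";
  if (validity.typeMismatch) return "The value is not the right kind of data.";
  if (validity.patternMismatch) return "The value does not match the required pattern.";
  if (validity.rangeUnderflow || validity.rangeOverflow) return "The value is out of range.";
  return "The value is not acceptable.";
}

// Run a custom check on every input event. An empty string passed to
// setCustomValidity() clears the error; any other string marks the field invalid.
function attachCustomCheck(field, isAcceptable, message) {
  field.addEventListener("input", () => {
    field.setCustomValidity(isAcceptable(field.value) ? "" : message);
  });
}
```

In a page, `field` would be an input element and `describeValidity(field.validity)` could feed whatever error display the design calls for.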
The standard also includes a good reminder that the clients can't be trusted to enforce these rules. Although testing the data locally will save time and energy, it won't be a perfect solution because older browsers may not implement the validity checks. It's also possible that clever users may override some of the methods and block checking. For this reason, any serious data validation rules must be re-evaluated at the server. The browser can't be trusted.
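To make the point concrete, here is a minimal sketch of the same rules being checked again on the server, written as a plain function that could run under Node. The field names `email` and `quantity`, the 1-to-10 range, and the function name `validateSubmission` are all hypothetical, and the email test is deliberately loose.

```javascript
// Re-check submitted values on the server, regardless of what the browser did.
// `fields` is an object of raw strings as they arrive from a form post.
function validateSubmission(fields) {
  const errors = {};

  // Loose check: something, an @ sign, something, with no whitespace.
  if (typeof fields.email !== "string" || !/^[^\s@]+@[^\s@]+$/.test(fields.email)) {
    errors.email = "A valid email address is required.";
  }

  // Mirrors a hypothetical <input type="number" min="1" max="10"> on the page.
  const quantity = typeof fields.quantity === "string" ? Number(fields.quantity) : NaN;
  if (!Number.isInteger(quantity) || quantity < 1 || quantity > 10) {
    errors.quantity = "Quantity must be a whole number from 1 to 10.";
  }

  return errors; // an empty object means every check passed
}
```

The browser-side rules then become a convenience for the user, while the server remains the only place where the rules are actually enforced.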
HTML5 forms: Customizable options
Simply validating the data as acceptable or not acceptable is not the only option anymore. HTML5 includes several attributes that let you offer help and suggestions to the visitor.
The simplest option lets you turn on spell-check for any input element that's marked as editable. This will normally apply to form elements like textarea but may also include any part of the document that's marked contenteditable. (Editable content is discussed below.) The attribute spellcheck='true' determines when it applies.
I'm guessing that the spellcheck attribute also toggles the grammar checker, but it's not immediately apparent to me. The title of the section of the spec is "Spelling and grammar checking," but the text only mentions one attribute called spellcheck. If I were designing the spec, I would make them independent, if only because I've found that one feature is much more accurate than the other.
The datalist element lets you add a list of strings that can automatically complete a form element. The structure is like the option tags used in select elements. At this point, only Opera seems to support the feature, and some feel it makes the HTML that much grungier by larding it up with suggested answers. I'm also a bit annoyed by the idea that each potential option comes with a label that is displayed and a value that actually fills up the form element. It seems like a dangerous way to hide functionality from the user and perhaps trick them into thinking that one thing is going in the form (the label), while filling it with another (the value).
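For a sense of what the markup looks like, the sketch below generates an input tied to a datalist from an array of suggestions. The helper names `escapeHtml` and `datalistMarkup` are hypothetical. Only the value attribute is emitted, so the string the user sees is the string that lands in the field.

```javascript
// Escape the characters that matter inside HTML attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an <input> linked to a <datalist> through matching list/id attributes.
function datalistMarkup(id, suggestions) {
  const safeId = escapeHtml(id);
  const options = suggestions
    .map((s) => `  <option value="${escapeHtml(s)}">`)
    .join("\n");
  return `<input list="${safeId}">\n<datalist id="${safeId}">\n${options}\n</datalist>`;
}
```

Calling `datalistMarkup("browsers", ["Chrome", "Opera"])` yields an input with `list="browsers"` followed by a datalist holding two option tags.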
I was also confused by the possibility of having an external list of data options stored in an XML file independent of the current HTML form. This would not only simplify the HTML but also make the data reusable in different pages. It seems like a good idea, but the spec doesn't mention it yet. I've found only secondary references to this option.
HTML5 forms: Authentication
One of the most tempting options brings authentication or certification to the form information, but it is still rather unformed and not very well implemented. The so-called keygen element adds some form of cryptography using public-key encryption, but it is only partially implemented on Chrome, Firefox, and Opera, despite dating from the time of Netscape. The potential power is huge, but I think it will take several more iterations to find a good set of features that work the way that people expect.
The idea is to get the browser to offer a way to generate pairs of public and private keys automatically. Many programmers who've tried to use keygen say it's confusing for the average person because it requires too much understanding of such details as the length of keys. There are also deeper issues about how users might move the certificates from computer to computer or how malware might target them.
In the future, the option might include a better way to automatically use a key pair to sign all data in the form, not just the challenge attribute attached to the keygen item. This, of course, requires a more standard mechanism for creating the signature over all possible forms of data. The standard hash functions and message digests are probably a good place to begin. This will have to wait until the feature is more fully formed.
HTML5 drag and drop
The ability to drag HTML elements around and drop them somewhere else is an old option for Web designers who are willing to use their own libraries, but it's always been mired in some confusion. After Microsoft included drag-and-drop support in what was called DHTML in 1999, developers had to struggle with cross-browser problems. A number of good cross-browser scripts appeared over the years, and many sites use them, even though they seem to confuse the public, who tend to expect the items on Web pages to be somewhat fixed in place. I've often expected companies like Netflix to implement drag and drop to maintain lists, but they never seem to choose that path.
In any case, the HTML5 drag-and-drop spec smooths away many of the browser differences. In theory, the cross-browser scripts won't be necessary as long as all browsers follow the standard in exactly the same way. All that you need to do is add the attribute draggable='true' and the element can be picked up and moved.
Well, that's not quite all. If you want to do something with the dragged element, you must be able to handle at least seven different events that fire as it moves around the page. Struggling to deal with all possible options has driven some people to write long complaints about the complexity. (A "disaster" and "far from complete" are two early gripes.)
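A bare-bones sketch of the event handling might look like the following. It wires up only three of the events (dragstart, dragover, and drop) and passes the dragged element's id through the event's dataTransfer object. The function name `wireDragAndDrop` and the `onDropped` callback are hypothetical, and a real page would likely want the remaining events for visual feedback.

```javascript
// Make `source` draggable and let `target` accept it.
// `onDropped` receives the id of whatever element was dropped.
function wireDragAndDrop(source, target, onDropped) {
  source.setAttribute("draggable", "true");

  // Record what is being dragged when the drag begins.
  source.addEventListener("dragstart", (event) => {
    event.dataTransfer.setData("text/plain", source.id);
  });

  // Drops are refused by default; cancelling dragover signals acceptance.
  target.addEventListener("dragover", (event) => {
    event.preventDefault();
  });

  // Read the payload back out when the element is released over the target.
  target.addEventListener("drop", (event) => {
    event.preventDefault();
    onDropped(event.dataTransfer.getData("text/plain"));
  });
}
```

In a page this might be called as `wireDragAndDrop(card, bin, (id) => bin.appendChild(document.getElementById(id)))`, where `card` and `bin` are elements looked up beforehand.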
There are also some compatibility issues. Safari, for instance, requires a separate CSS entry to turn on dragging even after you add the draggable='true' attribute. All of these issues point to the fact that someone is going to write a simpler drag-and-drop library that abstracts away much of this complexity and makes it as easy as adding the draggable attribute.
HTML5 forms: Self-calculating form fields
Forms with calculated fields have traditionally relied on JavaScript that runs when the onchange event fires.
The new idea is to create a new output element that will work in concert with the input element. An attribute specifies the formula for the output field. The browser is responsible for updating the output field whenever the form changes by calculating the formula. I have tried to use this on several browsers without success. It just seems easier to use good old input fields instead.
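Where the formula attribute doesn't work, a few lines of script can keep an output element in sync. The sketch below is one way to do that, not the mechanism the spec describes: it listens for input events on the form and recomputes a product. The function name `wireTotal` and the price-times-quantity formula are assumptions made for the example.

```javascript
// Keep `output` equal to price * quantity whenever anything in the form changes.
// Input events from the fields bubble up to the form, so one listener is enough.
function wireTotal(form, priceInput, quantityInput, output) {
  const update = () => {
    const total = Number(priceInput.value) * Number(quantityInput.value);
    output.value = String(total);
  };
  form.addEventListener("input", update);
  update(); // show a total before the user touches anything
}
```

In a page, the four arguments would be the form, two number inputs, and an output element fetched with something like `document.querySelector`.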
The output can also be represented graphically using the progress and meter tags. Both essentially represent some fraction between zero and one as a thermometer-like rectangle that fills up with color -- but there are differences. The progress element has an "indeterminate" setting that indicates the software has no clue what the value really is. This is usually displayed as wavy lines.
The meter tag is a bit more interesting because it includes the opportunity to specify a low and a high attribute, as well as a min and a max. Presumably a value between high and max is undesirable and the user should try to push whatever buttons are necessary to push this value between low and high.
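The regions of a meter can be sketched as a small classification function. This is a simplification: it ignores the optimum attribute, and the handling of values that sit exactly on a boundary is a guess rather than a statement of what browsers do. The defaults (min of 0, max of 1, low equal to min, high equal to max) follow the HTML standard; the name `meterRegion` and the returned labels are invented.

```javascript
// Classify a meter reading relative to its low and high marks.
function meterRegion({ value, min = 0, max = 1, low = min, high = max }) {
  // Clamp the reading into the [min, max] range first.
  const clamped = Math.min(Math.max(value, min), max);
  if (clamped < low) return "below low";
  if (clamped > high) return "above high";
  return "between low and high";
}
```

For a meter with low at 0.2 and high at 0.8, a reading of 0.9 falls in the "above high" region, which is the zone the text describes as undesirable.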
HTML5 forms and editable content
All of the work that's been done on the forms is quite nice, but the irony is that the form tag itself is sort of passé. The greatest change in the form tag may be the fact that it's no longer necessary. Now most HTML elements can be edited by simply adding the attribute contenteditable="true" to any old span. It's pretty freaky. If the user doesn't like what you write on your blog, you can give them the opportunity to rewrite it to fit their preconceived notions. In essence, any table or pile of data could be turned into a form just waiting for the user to click and change. Everything can be wikified.
This section of the API is changing a bit. Just recently, the getSelection method was moved, changing the best way to capture any of the editing process. Is getSelection ideal? Nope. Is it dangerous? Perhaps in the wrong hands. Will it be confusing to old users who still think that they can only monkey with data in the form boxes? Certainly. Will it encourage more traffic when people save entire pages just to store one tweak? No doubt. But editable content opens up more possibilities than ever. I'm sure that creative Web designers will find clever ways to make everything a form that comes alive.
Note: This is the fourth article in a series devoted to the new features of HTML5. The first article, "HTML5 in the browser: Canvas, video, audio, and graphics," examined display options, including the <canvas> and <video> tags, Scalable Vector Graphics, and WebGL. The second article, "HTML5 in the browser: Local data storage," examined Web Storage, Web Database, and other APIs designed to transform Web pages into local applications. The third article, "HTML5 in the browser: HTML5 data communications," examined cross-document messaging, WebSockets, and other HTML5 APIs that improve website and browser interactivity.
This story, "HTML5 in the browser: HTML5 forms," was originally published at InfoWorld.com.