Wednesday, September 12, 2007


Uniform Resource Locator (URL) is a technical, Web-related term used in two distinct senses:

In popular usage, it is a widespread synonym for Uniform Resource Identifier (URI); many popular and technical texts use the term "URL" when referring to a URI.
Strictly, in its current technical meaning, a URL is a URI that, "in addition to identifying a resource, [provides] a means of locating the resource by describing its primary access mechanism (e.g., its network 'location')."

The idea of a uniform syntax for global identifiers of network-retrievable documents was the core idea of the World Wide Web. In the early days, these identifiers were variously called "document names", "Web addresses" and "Uniform Resource Locators". These names were misleading, however, because not all identifiers were locators, and even for those that were, this was not their defining characteristic. Nevertheless, by the time RFC 1630 formally defined "URI" as the generic term best suited to the concept, the term "URL" had gained widespread popularity, which has continued to this day.
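
As a rough illustration of the strict definition, the sketch below splits an identifier into its access mechanism (the scheme) and its network location (the authority) using Python's standard urllib.parse module. The function name describe and the "looks_like_locator" heuristic are assumptions made for this example; the heuristic is not a rule from any RFC, and some genuine locators (mailto: addresses, for instance) carry no authority part.

```python
from urllib.parse import urlsplit


def describe(uri: str) -> dict:
    """Split a URI into the parts that matter for locating a resource."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,     # access mechanism, e.g. "http"
        "authority": parts.netloc,  # network location, e.g. a host name
        "path": parts.path,
        # Rough heuristic (an assumption for this sketch, not an RFC rule):
        # an identifier that names a network authority is acting as a locator.
        "looks_like_locator": bool(parts.netloc),
    }


if __name__ == "__main__":
    print(describe("http://www.tinyurl.com/qvqqo"))
    print(describe("urn:isbn:0451450523"))
```

The first identifier names both a scheme and a host, so it describes how and where to fetch the resource; the second only names the resource and gives no location.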

Clean URLs
"Clean" and "cruft-free" describe URLs which are:

Not tied to technical details, such as the software used or whether the resource comes from a file or a database, so that a change in the technology will not break existing links to the resource. For example, /cars/audi/ is preferable to /cars/audi/index.php or /myprog.jsp?page=cars/audi/.
Not tied to internal organisational structure, such as the current editor or the department that created the document, so that an internal reorganisation will not break existing links to the document. For example, /recommendations/2007/xyz/ is better than /~users/jane/current-work/xyz/ or /xyz-team/recommendations/.
Consistent with other URLs on the same site in terms of hierarchy. This is desirable so that users can see where they are in the structure of the site and can predict where to find what they are looking for. For example, /cars/audi/ and /cars/ford/, instead of /cars/audi/ but /ford-cars/.
Consistent with other URLs on the same site in terms of action. This is desirable so that users can predict other, similar URLs on that site. For example, if /blogs/andrea/feed/ shows a feed of Andrea's blog, then appending /feed/ to any other blog on the same site should show a feed for that blog.
A single location for a single resource. The same resource should not be available from multiple URLs, as this causes both confusion (Are they the same resource, or is one a copy of the other? Which is the 'right' one? Is one new and the other due to be removed?) and technical difficulties, such as counting links to a particular resource, or caching content to speed up access but being unable to serve the cached content when the resource is requested through a different URL.

The difference between a "standard" URL and a "clean" one can be seen in the first example above: /myprog.jsp?page=cars/audi/ is a standard URL, while /cars/audi/ is its clean equivalent.
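
One way to honour the "single location" rule while keeping old links alive is to map every legacy spelling of a URL onto one clean path and redirect to it. The following is a minimal sketch of that mapping only, assuming the example paths from the list above; the function name clean_path, the INDEX_FILES tuple and the always-add-a-trailing-slash policy are all hypothetical choices made for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical file names that expose the implementation technology.
INDEX_FILES = ("index.php", "index.html")


def clean_path(url: str) -> str:
    """Return the single clean path that a legacy URL should redirect to."""
    parts = urlsplit(url)
    path = parts.path

    # /myprog.jsp?page=cars/audi/  ->  /cars/audi/
    if path == "/myprog.jsp":
        page = parse_qs(parts.query).get("page", [""])[0]
        path = "/" + page.lstrip("/")

    # /cars/audi/index.php  ->  /cars/audi/
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    # Assumed policy: every clean path ends with a slash, so each
    # resource has exactly one spelling.
    if not path.endswith("/"):
        path += "/"
    return path


if __name__ == "__main__":
    for legacy in ("/cars/audi/index.php", "/myprog.jsp?page=cars/audi/", "/cars/audi"):
        print(legacy, "->", clean_path(legacy))
```

A server would typically answer a request for any non-clean spelling with a permanent redirect to the value returned here, so that link counts and caches converge on one URL.
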

Clean URLs with web services
Web services have been created that allow users to create short URLs, which are easier to write down, remember or pass around. They are also better suited to places where space is limited, for example an IRC conversation, email signature, online forum or fixed-width document (e.g. email). A sample of current services is given below:

TinyURL.com - probably the most widely used, due to its memorable name. Example: http://www.tinyurl.com/qvqqo
doiop.com - one of the early services, offering keywords as opposed to random URLs. Example: http://doiop.com/keyword
dtmurl.com
gu.ma
notlong.com - lets you choose your own sub-domain. Example: http://your-choice.notlong.com/
SnipURL.com (synonyms: snurl.com, snipr.com)
shorl.com
URLStrip.com
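
To make the idea concrete, here is a minimal in-memory sketch of how a short-URL service could work: each long URL is given a sequential number, and the number is written in base 36 to form a short key. This is an assumption made for illustration and not a description of how TinyURL or any service listed above is implemented; the Shortener class, the encode function and the short.example host are hypothetical.

```python
import string
from typing import Optional

ALPHABET = string.digits + string.ascii_lowercase  # 36 symbols


def encode(n: int) -> str:
    """Encode a non-negative integer as a base-36 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    """In-memory short-URL table; a real service would persist this."""

    def __init__(self, base: str = "http://short.example/") -> None:
        self.base = base   # hypothetical host for the short links
        self._by_key = {}  # key -> long URL
        self._by_url = {}  # long URL -> key
        self._next_id = 0

    def shorten(self, long_url: str) -> str:
        # Reuse an existing key so one long URL maps to one short URL.
        key = self._by_url.get(long_url)
        if key is None:
            key = encode(self._next_id)
            self._next_id += 1
            self._by_key[key] = long_url
            self._by_url[long_url] = key
        return self.base + key

    def resolve(self, key: str) -> Optional[str]:
        """Return the destination for a key, or None if it is unknown."""
        return self._by_key.get(key)
```

Calling shorten twice with the same long URL returns the same short URL, and resolve looks the destination up again from the key, which is the step a real service performs before issuing its redirect.
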
Ultimately, these services hide the final destination from the web user. This can be exploited to send unwitting users to sites that offend their sensibilities, or that crash or compromise their computer through browser vulnerabilities. To help combat such abuse, TinyURL lets a user set a cookie-based preference so that, when that user clicks a TinyURL, the redirect stops at the TinyURL website and shows a preview of the final link. Substituting http://preview.tinyurl.com for http://tinyurl.com in the URL is another way to see a preview of the final link before clicking through to it. This opaqueness is also exploited by spammers, who use such links in spam (mostly blog spam) to bypass URL blacklists.
Furthermore, this approach creates a dependency on a third-party service that may change, go away, or keep privacy-compromising logs of user activity indefinitely.
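
The preview substitution mentioned earlier is a simple host rewrite, and can be sketched as follows. The function name to_preview is hypothetical, and treating www.tinyurl.com the same as tinyurl.com is an assumption of this example; whether the preview host behaves as described is up to the service.

```python
from urllib.parse import urlsplit, urlunsplit


def to_preview(url: str) -> str:
    """Rewrite a TinyURL link to its preview.tinyurl.com form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: the "www." spelling is handled like the bare host.
    if host in ("tinyurl.com", "www.tinyurl.com"):
        parts = parts._replace(netloc="preview.tinyurl.com")
    # Any other URL is returned unchanged.
    return urlunsplit(parts)


if __name__ == "__main__":
    print(to_preview("http://tinyurl.com/qvqqo"))
```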